As would I: thank you!
-Jonathan Herold Centre for Medieval Studies University of Toronto ----- Original Message ----- From: "Daniel O'Donnell" daniel.odonnell@uleth.ca To: "David M. Mimno" David.Mimno@tufts.edu Cc: "Digital Medievalist Community mailing list" dm-l@uleth.ca Sent: Friday, July 02, 2004 10:12 AM Subject: Re: [dm-l] Experience with Full Text Searching
I would be; thanks. -dan
David M. Mimno wrote:
[sorry if this is redundant -- I joined this list and then went away for two weeks so I may have missed a reply]
Depending on your tolerance for Java programming, you might want to look at Lucene (http://jakarta.apache.org/lucene). It's a VERY fast, very reliable open source search engine. We've had great success running it over the Perseus Greek and Latin collections.
Relative to eXist, Lucene is more of an API than a full-fledged
application
with administrative interfaces and so forth. That means there's a bit of programming overhead, but it also makes it easy to add specialized linguistic information to the search process, like morphological query expansion (the query "amo" is translated to "amo amas amat...").
Something
similar could be done for orthographic variants, if you had a database of equivalent spellings.
If anyone is interested, I can provide more detail (and code).
David Mimno Perseus Project
Quoting Daniel O'Donnell daniel.odonnell@uleth.ca:
Hello all, I have another question for the collective wisdom. What are people's experience with full text searching. We are building an xml-ised database of original journal articles. These will be retrievable using metadata tags like issue, author, key words, subjects, etc. The problem is that building full text retrieval capabilities into the same search engine will ultimately affect display times. I guess I have two
questions:
a) is full text searching of academic articles a sine qua non? I used to think everybody did it, but I've been looking a bit more and see it is less usual than I thought in the case of larger libraries. b) have people suggestions for incorporating full text searching into a document database. -dan
-- Daniel Paul O'Donnell, PhD Associate Professor of English University of Lethbridge Lethbridge AB T1K 3M4 Tel. (403) 329-2377 Fax. (403) 382-7191 E-mail daniel.odonnell@uleth.ca Home Page http://people.uleth.ca/~daniel.odonnell/
dm-l mailing list dm-l@uleth.ca http://listserv.uleth.ca/mailman/listinfo/dm-l