New subject: Searching Latin texts and orthographical variants

16 Jun 2005


I am the project director of the Electronic Grosseteste, a research resource that provides access to electronic medieval Latin texts and an integrated bibliography. The textbase is composed of a variety of Latin texts (most of them under copyright but still searchable). Right now the search engine is pretty primitive, and one enhancement I would like to make is to account for orthographical variants in the texts. Some texts were classicized, while other editors either followed the orthography of a single manuscript or attempted to follow some sort of convention based generally on Latin texts in later medieval England (these are the facts, and this post is not about the joy of debating editorial practice). Ideally, I would like searches to return matches for both classical and "medieval" spellings. For example, if a user queried "scientia", the engine would return matches for "scientia" and "sciencia" (wildcards are permitted, btw).
Now I work in Perl5, and so my initial thought was to create a set of hash tables that would map these variants, since a hash can hold more than one variant per entry, and the engine would then perform a lookup for each query term. I suppose coding the "orthographical rules" into the engine is another option, but I'll be honest and admit that computational linguistics has never been my thing. And the beauty of hashes in Perl is that they are built very quickly and don't eat too much memory.
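To make the hash idea concrete, here is a rough sketch of the lookup, written in Python rather than Perl only for brevity; the dictionary plays the role a Perl hash would. Everything named here (VARIANTS, build_index, expand_query, query_regex) is hypothetical and not part of the existing engine, and the sample entries are illustrative only.

```python
import re

# Hypothetical variant table: each key is a classical spelling and each value
# lists "medieval" spellings of the same word. Entries are illustrative only.
VARIANTS = {
    "scientia": ["sciencia"],
    "gratia": ["gracia"],
}

def build_index(variants):
    """Map every spelling (classical or variant) to its full group of spellings."""
    index = {}
    for classical, others in variants.items():
        group = sorted({classical, *others})
        for spelling in group:
            index[spelling] = group
    return index

def expand_query(term, index):
    """Return all known spellings of a query term; unknown terms pass through."""
    key = term.lower()
    return index.get(key, [key])

def query_regex(term, index):
    """Compile a whole-word, case-insensitive pattern matching any spelling."""
    alternatives = "|".join(re.escape(s) for s in expand_query(term, index))
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)

# Built once at startup, then consulted for each query term.
INDEX = build_index(VARIANTS)
```

With this in place, expand_query("scientia", INDEX) yields both spellings, and a query for either one matches either one in the text. The sketch does not handle wildcards, and because it matches whole words, inflected forms such as "scientiam" would each need their own entry.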
Before I go and reinvent the wheel with these hash tables, does anyone know of an open-source method or resource that addresses this kind of problem? (I know that Brepols--pardon me, Brepolis...yeesh--has this all figured out, but they don't play well with others, so that's a closed door.) My limited scouring of the web has yielded no joy, and so I seek the sage advice of this community.
Many thanks
Jim
--------------------
Dr James R. Ginther, PhD
Assoc. Professor of Medieval Theology
& Director of Graduate Studies
Dept of Theological Studies
St Louis University
ginthej@slu.edu
---------------------------------
dept: http://theology.slu.edu/
research: http://www.grosseteste.com/