Dear fellow medievalists:
I am writing to ask for your help. Over the last year, the Center for Digital Theology at Saint Louis University has been prototyping a transcription tool for digitized manuscripts. Part of that process includes an application that identifies the location of the lines on a given manuscript page. We have been testing this on a variety of limited datasets, but for the last two months we have focused our testing on the complete collection of digitized manuscripts that make up the Parker on the Web collection.
Part of the development has included spot checking the results: a user displays individual images and checks the attempted line parsing. There are some 196,000 images in the Parker Collection, so it has not been possible to check every one. While this is a substantial number, there are only 500 manuscripts in the collection. To ensure we have evaluated a number of pages from each manuscript, we have gone through four iterations during this grant period already, and our small team has been able to check as many as 2,000 images in one iteration.
We would now like to open this up to the larger public for a limited period of time. We are asking you to go to http://manuscripts.no-ip.org/Paleography_Web/. There you can create a user name and password and check as many manuscript images as you like (and you can return to your session later using that same username/password). Those images are from the Parker collection, but have had colored lines superimposed on them to indicate where our application thinks it has found lines. You will be asked to judge the quality of the line parsing (good, close, bad, disaster, etc.), based on some basic guidelines provided. Taking a cue from projects like Galaxy Zoo, we have included a leader board indicating how many images each user has evaluated. No prizes for being first, though, other than the admiration of the digital humanist community--and our thanks!
This "crowd sourcing QA" will open up to the general public on August 26, 2010 and will be closed on August 30, 2010. We do not require any personal information. Usernames will be retained to identify which user evaluated which image.
We believe that this could be a very useful tool, and the more feedback we can get at this early stage, the better the tool will be in the end.
Please contact me if you have any questions.
This project is being funded by an award from the President’s Research Fund, Saint Louis University. The prototype line parser was first developed during a grant from the Andrew W. Mellon Foundation. The Center thanks both funding bodies for their support of this kind of research.
All images displayed are owned by the Matthew Parker Library, Corpus Christi College, Cambridge.
© 2009 Masters of Corpus Christi College.
Images used with permission.
No one may duplicate any image (in digital or printed format; or store on an electronic device) without the express permission of the Masters of Corpus Christi College.
James Ginther
Project Director