I'd also look at what TAPoR is up to. I don't know too much about specific software, but I believe this is the kind of thing they are interested in.
-dan
On Tue, 2007-08-14 at 19:58 +0100, James Cummings wrote:
James Ginther wrote:
Does anyone know of software that will aid in the comparison of two or more texts in order to determine specific use of sources? Ideally, the software would recognize a base text and then compare it to other texts that are potential sources.
I know of software applications like Collate, which does some basic comparative work in order to create an apparatus criticus for critical editions, and the "authorship" testing software that Peter Miliken at Leeds developed using stylometrics (which outputs only statistical results and no textual output). But I am looking for software that would compare units of text (such as sentence to sentence) and--here's the kicker--that might also work with an inflected language like Latin (so that it would return matches that differ only in case ending). I know that's a tall order, and I have my doubts that something like this exists, but I thought someone on this list might know of some code kicking around in somebody's virtual basement.
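To make the request above concrete, here is a minimal sketch of sentence-to-sentence comparison that tolerates differing case endings. It is not taken from Collate or any other tool mentioned in this thread. The list of endings, the function names, and the 0.5 threshold are all assumptions made for illustration, and the suffix stripping is a crude stand-in for a real Latin lemmatizer.

```python
import re

# Hypothetical, deliberately crude list of Latin endings (longest first).
# A real tool would use a proper lemmatizer instead of suffix stripping.
LATIN_ENDINGS = ("ibus", "orum", "arum",
                 "ae", "am", "as", "em", "es", "is", "os", "um", "us",
                 "a", "e", "i", "o")


def stem(word):
    """Lowercase a word and strip one ending, keeping at least 3 letters."""
    w = word.lower()
    for ending in LATIN_ENDINGS:
        if w.endswith(ending) and len(w) - len(ending) >= 3:
            return w[: -len(ending)]
    return w


def split_sentences(text):
    """Split on sentence-like punctuation; an assumption, not a real tokenizer."""
    return [s.strip() for s in re.split(r"[.;?!]+", text) if s.strip()]


def stems(sentence):
    """Return the set of stemmed words in a sentence."""
    return {stem(w) for w in re.findall(r"[^\W\d_]+", sentence)}


def jaccard(a, b):
    """Overlap of two stem sets: shared stems divided by all stems."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_parallels(base, source, threshold=0.5):
    """Pair each base sentence with source sentences whose overlap meets the threshold."""
    source_sents = [(s, stems(s)) for s in split_sentences(source)]
    results = []
    for b in split_sentences(base):
        b_stems = stems(b)
        for s, s_stems in source_sents:
            score = jaccard(b_stems, s_stems)
            if score >= threshold:
                results.append((b, s, score))
    return results


if __name__ == "__main__":
    base = "Puella rosam amat. Caelum est serenum."
    source = "Rosas puellae amant; nauta navigat."
    for b, s, score in find_parallels(base, source):
        print(f"{score:.2f}  {b!r}  <->  {s!r}")
```

In this toy example "rosam" and "rosas" reduce to the same stem, so the two sentences about the girl and the roses are paired even though no word form is identical. The verb forms "amat" and "amant" are not matched, which shows how quickly simple suffix stripping runs out and why a lemmatized or marked-up corpus would do better.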
Hi Jim,
Although they aren't really intended to do what you want (and I'm interested in other answers you might receive), have you considered some of the corpus linguistic analysis software out there? It would involve having all the sources in the corpus, of course, but then showing that one text draws on a particular source more than on another seems fairly straightforward. However, as with all such things, having the texts with detailed descriptive markup would be even more beneficial. I can't think of anything which would do exactly what you want out-of-the-box.
-James