Peter Robinson wrote:
The Digital Medievalist List (see end of message for contact information and project URLs).
Peter (Binkley) has already pointed out the problems with the XSLT approach he took. I think even the fix he suggests, of returning a stripped node fragment, would still lead to performance problems (for instance..to pull a single page say out of the Hengwrt Chaucer you would need every time to read the WHOLE document to create the node sets for ALL pages, just so you could extract the node set for just one, every time).
Is this necessarily the case? Couldn't you simply output these fragments as individual page files - so only having to do the process once? I agree with you whole-heartedly that XSLT is problematic for choosing to display from one arbitrary point to another arbitrary point in the XML, but your points aren't arbitrary. Page breaks will be the same each time, so I don't see the need to do this dynamically. Store your full versions of Hengwrt, and also store files created for each page, searching the former, but displaying the latter. Just because we *can* dynamically pull out a single page from the whole of a large XML file doesn't mean we *have* to.
I don't have a prejudice against XSLT, which seems a fine tool for doing most of the things one wants to do with XML. But that does not mean it can do *everything* equally easily (which is how this whole thread started, with musings on its limits as a typesetting language). Some things certainly Anastasia does much easier, and that is no surprise -- it was designed just for that. She is open source. Go look at her on http://anastasia.sourceforge.net
I'd of course encourage people to do so as well, Anastasia is a great tool and it is wonderful that it has been made open source.