dm-l June 2005

dm-l@uleth.ca

24 participants
18 discussions

Re: [dm-l] Concordancing Queries
by Laurie Ringer 16 Jun '05

>>> Abdullah.Alger-2(a)postgrad.manchester.ac.uk 06/16/05 11:42 AM

Compared to all of the other concordancing tools I think that Watt's is the simplest to use. Also, what's great about it is that it can handle characters such as <thorn> and <eth>. A nice feature in the program is that you can save the results as an html document, but the drawback is that you cannot save it in xml or any other format except as text.

>>>Are there any concordance programs that allow you to convert to xml?

Abdullah Alger

There are several RTF to XML conversion applications online including:

RTF2FO XML Converter
http://www.rtf2fo.com/features.html

RTF to XML
http://www.rtf-to-xml.com/features.html

RTF to XML 5.2.1
http://www.programmersheaven.com/zone16/cat290/35143.htm

I must admit I have not used these, so I cannot vouch for their effectiveness (or lack thereof); however, opening a text file (from Watt's Concordance) in Word, saving as RTF, then using a converter to create an XML file might work.

---Laurie

Laurie Ringer
Assistant Professor of English
Canadian University College
Lacombe, AB T4L 2E5
(phone) 403.782.3381, ext. 4085
(fax) 403.782.0735

Quoting Godfried Croenen <g.croenen(a)liverpool.ac.uk>:

> Hi Laurie,
>
> I am still using TACT 1.2 when I need a concordance and I am happy to answer TACT queries if I can.
>
> I was not aware of R.J.C. Watt's programme Concordance, although I had a look at the website and will try it out. But I can already see a number of difficult problems with my corpus, as I have often encoded page breaks or line breaks in the middle of words, which the programme apparently cannot handle.
>
> I have also used WordSmith tools and find it useful, although it is a different kind of programme, aimed mainly at corpus linguistics and hence not that good at formatting and referencing the text sections.
>
> Maybe you should also try out the TAPOR tool at <http://taporware.mcmaster.ca/>
>
> Best,
>
> Godfried
>
> --On 15 June 2005 13:25 -0600 Laurie Ringer <lringer(a)CAUC.CA> wrote:
>
>> I am producing a concordance of the English vernacular texts that scholarship allows as Wycliffite or Lollard in persuasion. I would like to add more texts, and am attempting to work out a few issues on which I wondered if anyone might have advice.
>>
>> For information I have recently been using R.J.C. Watt's programme Concordance (http://www.concordancesoftware.co.uk/); however, due to a significant problem with hyphenated words---Watt's Help file specifically states that it does not treat hyphenated words, which are divided between 2 lines, as single words---I am thinking of switching back to TACT or to another programme.
>>
>> It's been some years since I used TACT. Is anyone fluent in TACT and willing to field the odd question or two which Ian Lancashire's book Using TACT with Electronic Texts does not answer? Or, alternatively, can anyone recommend a better programme?
>>
>> Line numbering: Aside from keying in line numbers by hand (which I have been doing), is there a macro or application that can automate the line numbering process in large numbers of texts in Word or Word Pad?
>>
>> Page numbering: As above, is there a macro or application that can automate the page numbering process in large numbers of texts? NB: the end of the printed page in electronic format rarely corresponds with the end of a Word or Word Pad page.
>>
>> Many thanks for any suggestions anyone might be able to make.
>> ---Laurie
>>
>> Laurie Ringer
>> Assistant Professor of English
>> Canadian University College
>> Lacombe, AB T4L 2E5
>> (phone) 403.782.3381, ext. 4085
>> (fax) 403.782.0735
>>
>> _______________________________________________
>> Digital Medievalist Project
>> Homepage: http://www.digitalmedievalist.org
>> Journal (Spring 2005-): http://www.digitalmedievalist.org/journal.cfm
>> RSS (announcements) server: http://www.digitalmedievalist.org/rss/rss2.cfm
>> Wiki: http://sql.uleth.ca/dmorgwiki/index.php
>> Change membership options: http://listserv.uleth.ca/mailman/listinfo/dm-l
>> Submit RSS announcement: http://www.digitalmedievalist.org/newitem.cfm
>> Contact editorial Board: digitalmedievalist(a)uleth.ca
>> dm-l mailing list
>> dm-l(a)uleth.ca
>> http://listserv.uleth.ca/mailman/listinfo/dm-l
>
> ----------------------
> Dr. Godfried Croenen
> School of Modern Languages, French Section
> University of Liverpool
> Chatham Street
> Liverpool
> L69 7ZR
>
> Tel: +44 (0)151 794 2763
> Fax: +44 (0)151 794 2357
> e-mail: G.Croenen(a)Liverpool.ac.uk
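
The two preparation problems raised in this thread (words hyphenated across a line break, and keying in line numbers by hand) could be handled by a short script before the text is loaded into a concordancer. What follows is only a rough sketch in Python, not a feature of Concordance, TACT or Word. It assumes a plain-text file in which a trailing hyphen always marks a divided word, and the function names and the sample lines are made up for illustration.

```python
def join_hyphenated(lines):
    """Rejoin words divided by a hyphen at the end of a line.

    Assumption: a trailing "-" always marks a divided word. The second
    half of the word is moved up to the line where the word begins, so
    the number of lines (and hence the line numbering) is unchanged.
    """
    joined = []
    for raw in lines:
        line = raw.rstrip()
        if joined and joined[-1].endswith("-") and line.strip():
            # Split off the first token of this line and attach it
            # to the hyphenated fragment on the previous line.
            head, _, rest = line.lstrip().partition(" ")
            joined[-1] = joined[-1][:-1] + head
            line = rest
        joined.append(line)
    return joined


def number_lines(lines, start=1):
    """Prefix each line with its number and a tab."""
    return [f"{n}\t{text}" for n, text in enumerate(lines, start)]


if __name__ == "__main__":
    # Made-up sample lines, not taken from any edition.
    sample = ["the trewe prech-", "ing of the gospel"]
    for numbered in number_lines(join_hyphenated(sample)):
        print(numbered)
```

Moving the second half of the word up, rather than the first half down, keeps each word on the line where it begins, which is usually the line a concordance reference should cite. Page breaks encoded in the middle of words would need a similar but separate rule.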

Searching Latin texts and orthographical variants
by James R. Ginther 16 Jun '05

I am the project director of the Electronic Grosseteste, a research resource that provides access to electronic medieval Latin texts and an integrated bibliography. The textbase is composed of a variety of Latin texts (most of them under copyright but still searchable). Right now the search engine is pretty primitive, and one enhancement I would like to make is to account for orthographical variants in the texts. Some texts were classicized, while other editors followed either the orthography of a single manuscript or attempted to follow some sort of convention based generally on Latin texts in later medieval England (these are the facts, and this post is not about the joy of debating editorial practice).

Ideally, I would like to allow searches to include returns for classical and "medieval" spellings. For example, if a user queried "scientia" the engine would return matches for "scientia" and "sciencia". (Wildcards are permitted, btw.)

Now I work in Perl5, and so my initial thought was to create a set of hash tables that would map these variants, since hashes would allow for more than one variant per entity, and the engine would then perform a lookup for each query element. Now I suppose coding the "orthographical rules" into the engine is another option, but I'll be honest and admit that computational linguistics has never been my thing. And the beauty of hashes in Perl is that they are compiled very quickly and don't eat too much memory.

Now before I go and reinvent the wheel with these hash tables, does anyone know of an open-source method or resource that addresses this kind of problem? (I know that Brepols--pardon me, Brepolis...yeesh---has this all figured out but they don't play well with others, so that's a closed door.) My limited scouring of the web has yielded no joy, and so I seek the sage advice of this community.

Many thanks
Jim

--------------------
Dr James R. Ginther, PhD
Assoc. Professor of Medieval Theology & Director of Graduate Studies
Dept of Theological Studies
St Louis University
ginthej(a)slu.edu
---------------------------------
dept: http://theology.slu.edu/
research: http://www.grosseteste.com/
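
The lookup-table idea described in this post can be sketched roughly as follows. This is an illustration only, written in Python rather than Perl 5 (a Python dict plays the role of the Perl hash), and it is not the Electronic Grosseteste's search engine. The names VARIANT_GROUPS, build_variant_index and expand_query are hypothetical, and the only variant pair taken from the post is scientia/sciencia.

```python
from collections import defaultdict

# Each set lists spellings that should be treated as the same word.
# Only the scientia/sciencia pair comes from the post; a real table
# would be compiled from the textbase itself.
VARIANT_GROUPS = [
    {"scientia", "sciencia"},
]


def build_variant_index(groups):
    """Map every spelling to the full set of spellings in its group."""
    index = defaultdict(set)
    for group in groups:
        forms = {form.lower() for form in group}
        for form in forms:
            index[form].update(forms)
    return index


def expand_query(term, index):
    """Return all spellings to search for, given one query term.

    A term with no recorded variants is returned unchanged.
    """
    term = term.lower()
    return sorted(index.get(term, {term}))


if __name__ == "__main__":
    index = build_variant_index(VARIANT_GROUPS)
    print(expand_query("scientia", index))
```

Because every spelling in a group points to the whole group, a query in either the classical or the "medieval" spelling expands to the same list, which the engine could then search for term by term. Wildcard queries are left out of this sketch.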

Posting tips and sites
by Daniel Paul O'Donnell 10 Jun '05

I use subscriptions as an informal topic popularity meter: most postings on dm-l result in an average of 5 new subscriptions and 3 unsubscriptions. Postings on XSLT result in 5 unsubscriptions and 3 subscriptions ;).

Interestingly, posting the unicode website resulted in no loss and some gain. Perhaps a gauge of usefulness?

-dan

--
Daniel Paul O'Donnell
Associate Professor of English
Director, Digital Medievalist Project
University of Lethbridge
Lethbridge AB T1K 3M4
Vox: +1 403 381-2539
Fax: +1 403 382-7191
URL: http://people.uleth.ca/~daniel.odonnell/
Digital Medievalist Project: http://www.digitalmedievalist.org/

Re: Re: [dm-l] Letter database: languages, character sets, names etc
by Dorothy C. Porter 10 Jun '05

I find it useful - and I was unaware of this site. I usually use the Unicode charts, but that can be tedious (since there are now five charts for the Latin alphabet). Thanks, Dan, for a great bookmark!

Dot

-----Original Message-----
From: James Cummings <James.Cummings(a)computing-services.oxford.ac.uk>
To: Digital Medievalist Community mailing list <dm-l(a)uleth.ca>
Date: Fri, 10 Jun 2005 10:12:52 +0100
Subject: Re: [dm-l] Letter database: languages, character sets, names etc

Daniel Paul O'Donnell wrote:
> I'm not sure if members of this list would find this type of e-mail useful (please let me know if you do... or don't), but here goes:

I find it useful.

> A common problem in text encoding is locating the correct codes for "unusual letters". There are various utilities for doing this in windows, mac, and Linux. But here is a useful web-based utility. You can use it to look up character names and find their code point (though you do have to be fairly precise), and it will produce the correct number in hex and decimal formats. It will also tell you everything you ever wanted to know about characters required for encoding Estonian.
>
> http://www.eki.ee/letter/

Well, ok, actually I knew about this particular site. I've used that and of course there is the unicode site itself, especially the charts page. Also, most linux distributions contain a graphical character-map utility that is searchable.

One of the things out of unicode recently is their report on CharMapML (Character Mapping Markup Language):
http://www.unicode.org/reports/tr22/

Readers might also be interested in drafts of:

TEI P5 Draft Chapter 4: Language and Character Sets:
http://www.tei-c.org/P5/Guidelines/CH.html

and

TEI P5 Draft Chapter 25: Representation of non-standard characters and glyphs
http://www.tei-c.org/P5/Guidelines/WD.html

Just thought I'd add that in to Dan's comment.

-James
--
Dr James Cummings, Oxford Text Archive, University of Oxford
James dot Cummings at oucs dot ox dot ac dot uk

***************************************
Dorothy Carr Porter, Program Coordinator
Collaboratory for Research in Computing for Humanities
University of Kentucky
351 William T. Young Library
Lexington, KY 40506
dporter(a)uky.edu
859-257-9549
***************************************

Letter database: languages, character sets, names etc
by Daniel Paul O'Donnell 10 Jun '05

I'm not sure if members of this list would find this type of e-mail useful (please let me know if you do... or don't), but here goes:

A common problem in text encoding is locating the correct codes for "unusual letters". There are various utilities for doing this in windows, mac, and Linux. But here is a useful web-based utility. You can use it to look up character names and find their code point (though you do have to be fairly precise), and it will produce the correct number in hex and decimal formats. It will also tell you everything you ever wanted to know about characters required for encoding Estonian.

http://www.eki.ee/letter/

See also the Digital Medievalist Wiki entry for character encoding http://sql.uleth.ca/dmorgwiki/index.php/Fonts (to which I have just added information about this site).

-dan
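
The same kind of lookup (character name in, code point out in hex and decimal) can also be done offline. The sketch below uses Python's standard unicodedata module; it is an alternative to the web utility, not a description of how that site works, and the helper name describe is made up for the example. As with the site, the character name has to be given precisely.

```python
import unicodedata


def describe(name):
    """Look up a character by its Unicode name.

    Returns the character, its code point in U+XXXX (hex) notation,
    and the same code point as a decimal number. Raises KeyError if
    the name is not an exact Unicode character name.
    """
    char = unicodedata.lookup(name)
    code_point = ord(char)
    return char, f"U+{code_point:04X}", code_point


if __name__ == "__main__":
    for name in ("LATIN SMALL LETTER THORN", "LATIN SMALL LETTER ETH"):
        print(name, *describe(name))
    # The reverse direction: from a character to its name.
    print(unicodedata.name("þ"))
```

The decimal or hex value can then be written into an XML file as a numeric character reference, for example &#254; or &#xFE; for thorn.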

[Fwd: Fwd: Re: timelines]
by Daniel Paul O'Donnell 08 Jun '05

Forwarded from Medtext-l. Does anybody know an answer?

Re: [dm-l] [Fwd: Fwd: Re: timelines]
by Dorothy C. Porter 08 Jun '05

I sent this message to Medtext-l this morning, but it's of interest to this list, too. I've never used HEML, but it looks neat:

Take a look at the Historical Event Markup and Linking Project (http://www.heml.org/). The stated goal of HEML is "to define XML elements that expose and outline historical events asserted in documents across the web and to parse and display these elements in interesting and useful ways."

It's a markup system, not software, and you'd probably have to use some XML editing software to create your HEML documents, rather than any HEML-specific software, but it does appear to enable linking to digital objects using the <Evidence> element. HEML is designed to be combined with other markup languages, so if you already have your information in some form of XML (a TEI list, for example) you can add HEML markup on top of that using the heml: namespace. HEML also requires XSLT and/or SVG for viewing.

This may be more or less than what you need, but check out the example files. They're pretty cool.

Dot

-----Original Message-----
From: "Daniel Paul O'Donnell" <daniel.odonnell(a)uleth.ca>
To: dm-l(a)uleth.ca
Date: Wed, 08 Jun 2005 00:06:26 -0600
Subject: [dm-l] [Fwd: Fwd: Re: timelines]

Forwarded from Medtext-l. Does anybody know an answer?

***************************************
Dorothy Carr Porter, Program Coordinator
Collaboratory for Research in Computing for Humanities
University of Kentucky
351 William T. Young Library
Lexington, KY 40506
dporter(a)uky.edu
859-257-9549
***************************************

DM editorial email address down.
by Dan O'Donnell 07 Jun '05

Hello all,

The editorial e-mail address for the digitalmedievalist project and journal, digitalmedievalist(a)uleth.ca, has been hit by spammers who are e-mailing us at a rate (200+ messages a day) that suggests they may be trying a denial of service attack (though why they'd pick on us is beyond me).

The result is that we have very likely missed any legitimate e-mail sent over the last three weeks. Missed e-mails include e-mails sent directly to us or RSS announcements submitted via our on-line form.

We are trying to work out a way of reopening the address or finding another way for people to get in touch with us. In the meantime, correspondence for the digital medievalist project or journal should be addressed to me personally: daniel.odonnell the-funny-little-symbol-above-the-2-on-US-keyboards uleth.ca ;)

Sorry for any inconvenience.

-dan

--
Daniel Paul O'Donnell, PhD
Department of English
University of Lethbridge
Lethbridge Alberta T1K 3M4
Canada
Tel: +1 (403) 329-2377
Fax: +1 (403) 382-7191
e-mail: daniel.odonnell(a)uleth.ca
Web-Page: http://home.uleth.ca/~daniel.odonnell
The Electronic Caedmon's Hymn: http://home.uleth.ca/~caedmon
