I am producing a concordance of the English vernacular texts that scholarship allows as Wycliffite or Lollard in persuasion. I would like to add more texts, and am attempting to work out a few issues on which I wondered if anyone might have advice.
For information I have recently been using R.J.C. Watt's programme Concordance (http://www.concordancesoftware.co.uk/); however, due to a significant problem with hyphenated words---Watt's Help file specifically states that it does not treat hyphenated words, which are divided between 2 lines, as single words---I am thinking of switching back to TACT or to another programme.
It's been some years since I used TACT. Is anyone fluent in TACT and willing to field the odd question or two which Ian Lancashire's book Using TACT with Electronic Texts does not answer? Or, alternatively, can anyone recommend a better programme?
Line numbering: Aside from keying in line numbers by hand (which I have been doing), is there a macro or application that can automate the line numbering process in large numbers of texts in Word or Word Pad?
Page numbering: As above, is there a macro or application that can automate the page numbering process in large numbers of texts? NB: the end of the printed page in electronic format rarely corresponds with the end of a Word or Word Pad page.
Many thanks for any suggestions anyone might be able to make. ---Laurie
Laurie Ringer Assistant Professor of English Canadian University College Lacombe, AB T4L 2E5 (phone) 403.782.3381, ext. 4085 (fax) 403.782.0735
I've been using Watt's Concordance for months now with OE texts. For line numbers, I create a table in Word, estimating how many rows I will need, then paste the text into the table and voilà: I have a table that separates the text into lines.
Then I copy the table into Excel, leaving one column empty for the line numbers. I type 1, 2, 3 in the first three cells of that column, highlight those cells, put the pointer on the bottom right corner of the cell containing 3, and drag it all the way down to the end of the text. You will then have a text with line numbers. If you need the line numbers to carry a code such as <L 123>, then before typing 1, 2, 3 format the cells: click 'Custom', type <L #>, click OK, and then follow the steps above.
Abdullah
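For anyone who would rather script the numbering than go through Word and Excel, the same result can be sketched in a few lines of Python. This is only an illustration, not part of the procedure described above: the function name number_lines and the choice to skip blank lines when counting are assumptions, and the <L n> reference format is taken from the message above.

```python
def number_lines(text, start=1, template="<L {n}>"):
    """Prefix each non-empty line with a reference tag such as <L 12>.

    Assumption: blank lines are kept in place but do not receive a number.
    """
    numbered = []
    n = start
    for line in text.splitlines():
        if line.strip():
            numbered.append(f"{template.format(n=n)} {line}")
            n += 1
        else:
            numbered.append(line)
    return "\n".join(numbered)


if __name__ == "__main__":
    sample = "first line of text\nsecond line of text\n\nthird line of text"
    print(number_lines(sample))
```

To restart the numbering for each text, call the function once per file; to continue a running count across files, pass the next number as start.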
Digital Medievalist Project Homepage: http://www.digitalmedievalist.org Journal (Spring 2005-): http://www.digitalmedievalist.org/journal.cfm RSS (announcements) server: http://www.digitalmedievalist.org/rss/rss2.cfm Wiki: http://sql.uleth.ca/dmorgwiki/index.php Change membership options: http://listserv.uleth.ca/mailman/listinfo/dm-l Submit RSS announcement: http://www.digitalmedievalist.org/newitem.cfm Contact editorial Board: digitalmedievalist@uleth.ca dm-l mailing list dm-l@uleth.ca http://listserv.uleth.ca/mailman/listinfo/dm-l
I transcribed Hall's Chronicle (1550) into tables in Word but I only had to create the table once. When I finished a page, I copied it, pasted it below, then highlighted and deleted the text, leaving a blank table. A two-row, multi-column data table at the top of each transcribed page had details on the page, work notes, etc.
Later on, however, when I contemplated using Watt's Concordance, I realized that line numbers wouldn't be adequate by themselves, so I used some macros to add the page numbers. If I were doing running text with continuous line numbers, I'd get Excel to do it automatically, using the Edit / Fill / Series (Column, Linear, 1 to whatever) function, then copy and paste into a Word table.
I faithfully hyphenated as Hall's compositors did but I wish I hadn't, for it hinders searching for those words and would make a Watt-created concordance unreliable. I may spin off one transcription version with hyphenation and create another without broken words.
Cheers, Al Magary
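A second, unhyphenated version of a transcription could in principle be generated from the hyphenated one rather than keyed twice. The following Python sketch shows one way this might be done. It assumes that every hyphen at the end of a line marks a broken word (so a genuine compound that happens to break at its hyphen would be joined wrongly), and the function name join_broken_words is hypothetical.

```python
def join_broken_words(text):
    """Rejoin words that are split by a hyphen across a line break.

    The second half of the word is moved up to the end of the first line,
    so the number of lines (and hence any line numbering) is unchanged.
    Assumption: a hyphen at the end of a line always marks a broken word.
    """
    lines = text.splitlines()
    for i in range(len(lines) - 1):
        current = lines[i].rstrip()
        if current.endswith("-"):
            # Take the first whitespace-delimited token of the next line.
            head, _, rest = lines[i + 1].lstrip().partition(" ")
            lines[i] = current[:-1] + head
            lines[i + 1] = rest
    return "\n".join(lines)


if __name__ == "__main__":
    print(join_broken_words("the kyn-\nges grace"))
```

Because the word is completed on the line where it begins, a concordance reference will point to the line on which the word starts in the printed text.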
Laurie Ringer wrote:
It's been some years since I used TACT. Is anyone fluent in TACT and willing to field the odd question or two which Ian Lancashire's book Using TACT with Electronic Texts does not answer? Or, alternatively, can anyone recommend a better programme?
You may be interested in Wordsmith Tools http://www.lexically.net/wordsmith/
Or, if you want to do more complicated corpus linguistics analysis then Xaira http://www.oucs.ox.ac.uk/rts/xaira/
Line numbering: Aside from keying in line numbers by hand (which I have been doing), is there a macro or application that can automate the line numbering process in large numbers of texts in Word or Word Pad?
Since I tend to work with documents structurally marked up in XML, I'd wrap an XML root element around the text document and then pipe it through an XSLT stylesheet designed to wrap <l>line elements</l> around each line that contains text. Alternatively, this is fairly easy to do with a decent search-and-replace macro or Perl script.
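As a rough illustration of the scripted alternative, here is a minimal Python sketch (Python rather than XSLT or Perl, purely as an example). The <l> element comes from the suggestion above; the root element name "text", the n attribute carrying the line number, and the function name wrap_lines are assumptions made for the example.

```python
from xml.sax.saxutils import escape


def wrap_lines(text, root="text"):
    """Wrap each line that contains text in an <l> element inside a root element.

    Assumptions: the root element name and the n="..." numbering attribute
    are illustrative choices, not requirements of any particular tool.
    """
    parts = [f"<{root}>"]
    n = 0
    for line in text.splitlines():
        if line.strip():
            n += 1
            # escape() protects &, < and > so the output stays well-formed.
            parts.append(f'  <l n="{n}">{escape(line.strip())}</l>')
    parts.append(f"</{root}>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(wrap_lines("first line & more\n\nsecond line"))
```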
Page numbering: As above, is there a macro or application that can automate the page numbering process in large numbers of texts? NB: the end of the printed page in electronic format rarely corresponds with the end of a Word or Word Pad page.
Or alternatively, load the Word document into OpenOffice and use Sebastian's OO->TEI XML filter, which should preserve page numbers. The lines will come out as <p> paragraphs </p>, but that is a quick search and replace.
Many thanks for any suggestions anyone might be able to make.
Not sure if that is helpful. A lot of people who might answer this better are here at the ACH/ALLC conference in Victoria, BC at the moment. And the weather is much too nice to be checking email!
-James
Hi Laurie,
I am still using TACT 1.2 when I need a concordance and I am happy to answer TACT queries if I can.
I was not aware of R.J.C. Watt's programme Concordance, although I had a look at the website and will try it out. But I can see already a number of difficult problems with my corpus, as I have often encoded page breaks or line breaks in the middle of words, which the programme apparently cannot handle.
I have also used WordSmith Tools and find it useful, although it is a different kind of programme, mainly aimed at corpus linguistics and hence not that good at formatting and referencing sections of the texts.
Maybe you should also try out the TAPOR tool at http://taporware.mcmaster.ca/
Best,
Godfried
--On 15 June 2005 13:25 -0600 Laurie Ringer lringer@CAUC.CA wrote: [...]
---------------------- Dr. Godfried Croenen School of Modern Languages, French Section University of Liverpool Chatham Street Liverpool L69 7ZR
Tel: +44 (0)151 794 2763 Fax: +44 (0)151 794 2357 e-mail: G.Croenen@Liverpool.ac.uk
Compared to all of the other concordancing tools I think that Watt's is the simplest to use. Also, what's great about it is that it can handle characters such as <thorn> and <eth>. A nice feature in the program is that you can save the results as an html document, but the drawback is that you cannot save it in xml or any other format except as text. Are there any concordance programs that allow you to convert to xml?
Abdullah Alger
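To make the question concrete, this is roughly what a keyword-in-context listing saved as XML might look like when produced by a small Python script. It is a sketch only and does not reflect the export format of Concordance or any other programme mentioned in this thread: the element names (concordance, hit, left, kw, right), the function name concordance_xml, and the simple word-splitting rule are all assumptions.

```python
import re
import xml.etree.ElementTree as ET


def concordance_xml(lines, keyword, width=3):
    """Build a keyword-in-context listing and return it as an XML string.

    Assumptions: words are runs of letters/digits (Unicode-aware, so thorn
    and eth count as letters), matching is case-insensitive, and the
    element names are invented for this example.
    """
    root = ET.Element("concordance", {"keyword": keyword})
    for line_no, line in enumerate(lines, start=1):
        words = re.findall(r"\w+", line)
        for i, word in enumerate(words):
            if word.lower() == keyword.lower():
                hit = ET.SubElement(root, "hit", {"line": str(line_no)})
                ET.SubElement(hit, "left").text = " ".join(words[max(0, i - width):i])
                ET.SubElement(hit, "kw").text = word
                ET.SubElement(hit, "right").text = " ".join(words[i + 1:i + 1 + width])
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(concordance_xml(["þe kyng and þe quene"], "þe", width=1))
```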
Abdullah Alger wrote:
Compared to all of the other concordancing tools I think that Watt's is the simplest to use. Also, what's great about it is that it can handle characters such as <thorn> and <eth>. A nice feature in the program is that you can save the results as an html document, but the drawback is that you cannot save it in xml or any other format except as text. Are there any concordance programs that allow you to convert to xml?
Abdullah Alger
FWIW, Xaira (as mentioned here earlier by James) is a true Unicode system. It will operate on XML files properly. It will index and concordance texts with a minimum of XML markup (a tag at the beginning of the file and one at the end will suffice!) and it will also handle texts with more markup than anything else (e.g. ones, like our version of the OE Corpus, in which every word has an XML tag giving a lemma and a POS code for it). You can save results from it as XML files, or you can develop your own web application to interact with it at a low level. We've tested it on lots of different languages, and on small or very large corpora, the biggest so far being the Hungarian National Corpus, which contains about 600 million words of annotated texts...
And it is an open source system. See http://xaira.sf.net for more information.