Hi there,
Responding to Dieter's post below: first I'd like to thank you for your XDOM libraries, which I'm beginning to use for some of my Delphi projects. Great work!
I agree that scholars currently have little to gain from developing tools, from the point of view of their academic careers. However, there's another group of individuals like me, who are working in the academic context as programmers and in similar roles, who do this kind of work for a living. John Bradley would be another good example. Our work initially got going in the realm of language teaching support (Computer-Assisted Language Learning), giving birth to quite widely-used tools such as our Hot Potatoes programs; recently, we've morphed from a Language Centre into a Humanities Computing and Media Centre, and our work is focused more and more on HC, digital documents and encoding. Our experience has been that when we release tools which gain approval and acceptance, the university is generally pleased and appreciative; they've also helped us spin off some of our work commercially, to everyone's financial benefit. While individual departments and tenure committees may not (yet) give much weight to this kind of work, other parts of the university administration are more supportive.
The emergence of centres such as ours, which have relatively stable workforces (as opposed to the ad-hoc temporary hires associated with grant-supported projects), means that tools tend to be rewritten and updated steadily, which gives them more credibility (Hot Potatoes is now at version 6, for example, and has been out since 1997). This model of tool development, where a centre with long-term staff creates tools for the use of several projects and maintains them over time, is much more likely to be successful than the case where an academic working on a temporarily-funded project hires programmers to write something for a specific purpose, releases it, then moves on to the next piece of research, leaving the code to languish like an abandoned vehicle in a field (which is what "open-sourcing" a project often turns out to mean).
Cheers, Martin
At 09:37 AM 29/06/2005, you wrote:
Date: Fri, 24 Jun 2005 13:14:24 +0200
From: Dieter Köhler d.k@philo.de
Subject: [dm-l] Tools for humanities computing (WAS: Are markup languages obsolete?)
To: dm-l@uleth.ca
I often think that the free "tools" aspect has been underestimated in humanities computing. Of course, there are a couple of good examples, such as the TEI XSLT style sheets or some open source archival software. Nevertheless, the situation is not satisfactory, and I wonder what its causes are and by what means it could be improved. Briefly summarized, the main causes I can think of are the following:
- Most software development in the humanities takes place in an ad hoc fashion: people have specific problems and develop specific solutions.
- There is a lack of institutional support for developing tools for others. Tools are only by-products.
- If there is institutional support for developing tools for others, these tools need to be sold in order to refinance the work.
- It is not advisable for a scholar to try to build an academic career on developing tools for humanities computing.
- There exists no academic infrastructure *focused* on developing tools for the humanities, i.e. a specific society, journal and annual conference.
Since one of my main research interests is concerned with the development of tools for humanities computing, I would be very interested in the opinions of others on the above list of causes. Perhaps together we could find ways to improve the situation.
Dieter Köhler
Institute of Philosophy and Centre for Multimedia Studies University of Karlsruhe Germany
______________________________________
Martin Holmes
University of Victoria Humanities Computing and Media Centre
mholmes@uvic.ca
martin@mholmes.com
mholmes@halfbakedsoftware.com
http://www.mholmes.com
http://web.uvic.ca/hcmc/
http://www.halfbakedsoftware.com
I have been involved in a planning committee to prepare suggestions for the technology to be used for creating digital facsimiles of the unpublished manuscripts and typescripts of a scholar (not medieval, but from the 20th century). After reading some articles and searching the Web for technical descriptions of similar projects, I still have some questions, and I wonder whether someone on this list might be able to give me advice on the following issues:
Which equipment would you recommend for creating 600 dpi color images (TIFF)? Is it better to use a scanner or a digital camera for such a project? The great majority of the pages are on standard type paper (DIN A4 and DIN A5) and do not need special care when handling them.
Using the method you recommend, approximately how many pages can be digitized in one hour?
I appreciate your thoughts.
Dieter Köhler
For texts (books), I would say use a scanner. I know someone who did this with an anthology of Old English by Conybeare. The facsimile was very good. If you use a scanner, you can produce pages as fast as the scanner can scan them. It's up to you really! Also, I would suggest using Photoshop for your imaging program, because that's the best available.
Abdullah Alger
Digital Medievalist Project
Homepage: http://www.digitalmedievalist.org
Journal (Spring 2005-): http://www.digitalmedievalist.org/journal.cfm
RSS (announcements) server: http://www.digitalmedievalist.org/rss/rss2.cfm
Wiki: http://sql.uleth.ca/dmorgwiki/index.php
Change membership options: http://listserv.uleth.ca/mailman/listinfo/dm-l
Submit RSS announcement: http://www.digitalmedievalist.org/newitem.cfm
Contact editorial Board: digitalmedievalist@uleth.ca
dm-l mailing list: dm-l@uleth.ca http://listserv.uleth.ca/mailman/listinfo/dm-l
If the pages are loose, I can scan about 1200 pages/hr to 600 dpi colour TIFFs with a paper-fed Ricoh 2238 on a network. If you have lots of pages, this is the best way. Find a benefactor with a good business scanner and go nuts.
A standard desktop scanner: 20-30 pages per hour.
Photoshop is not necessary unless you are doing complex restoration. If it is just for cropping, go to Macromedia.com and download the 30-day trial of Fireworks. For basic editing functions and quality, there is no difference.
travis
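For planning purposes, the rates quoted above can be turned into a rough time and storage estimate. The Python sketch below is only an illustration: the 5,000-page collection size is a made-up figure, not one from this thread, the function names are hypothetical, and the file-size estimate assumes uncompressed 24-bit colour, which real TIFF files need not use.

```python
def scan_hours(pages: int, pages_per_hour: float) -> float:
    """Hours needed to scan `pages` at a steady rate (ignores setup and jams)."""
    return pages / pages_per_hour


def uncompressed_image_mb(width_mm: float, height_mm: float,
                          dpi: int = 600, bytes_per_pixel: int = 3) -> float:
    """Approximate size in megabytes (10**6 bytes) of one uncompressed page image."""
    width_px = round(width_mm / 25.4 * dpi)    # 25.4 mm per inch
    height_px = round(height_mm / 25.4 * dpi)
    return width_px * height_px * bytes_per_pixel / 1_000_000


if __name__ == "__main__":
    pages = 5000  # hypothetical collection size, for illustration only
    print(f"Sheet-fed at 1200 pages/hr: {scan_hours(pages, 1200):.1f} hours")
    print(f"Desktop at 25 pages/hr:     {scan_hours(pages, 25):.1f} hours")
    # DIN A4 is 210 mm x 297 mm
    print(f"One A4 page at 600 dpi:     {uncompressed_image_mb(210, 297):.0f} MB")
```

On these assumptions a single A4 page comes to roughly 100 MB before compression, so storage is worth budgeting for alongside scanning time.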
-----Original Message-----
From: dm-l-bounces@uleth.ca On Behalf Of Abdullah Alger
Sent: Wednesday, June 29, 2005 3:01 PM
To: dm-l@uleth.ca
Subject: Re: [dm-l] Equipment for creating digital facsimiles
On Wed, 2005-29-06 at 16:21 -0700, travis@lacuna.ca wrote:
If the pages are loose, I can scan about 1200 pages/hr to 600dpi Colour TIFFS with a paper-fed Ricoh 2238 on a network. If you have lots of pages, this is the best way. Find a benefactor with a good business scanner and go nuts.
What scanners would people recommend for bulk scanning like this? I have been in the process for several months (well, almost a year, off and on) of trying to build my own JSTOR: I've been scanning my collection of article photocopies to PDF with OCR. The actual process works well enough: the OCR is goodish (about as good as JSTOR, probably), and the PDFs are of high enough quality. The weak link is the automatic document feed on my HP 5590: it quite frequently (maybe once per 5-10 batches of documents) takes two or three sheets at a time. I keep it quite clean, BTW.
What kind of ADF (Auto Document Feed) scanner would people recommend for scanning and OCRing 1000 or so articles? While cost is obviously an issue, I'm going to have to hire somebody to babysit the current setup, so it may all balance out in the end. First prize to anybody who suggests a completely Linux-compatible solution, but I also have Windows XP available.
-d
For doing mass scanning, take a look at VueScan (http://www.hamrick.com). I have been using it for years, and it has never disappointed me; it is better than any other scanning interface I have tried. Photoshop is the tool of choice for image editing, but VueScan is much better for scanning. You can download a free trial to see if you agree.
David
Hi all,
I tend to agree with Martin Holmes's point. Apparently, however, he is speaking from a luxurious position. Wow, having your own Humanities Computing Centre, with stable funding and reliable job opportunities for researcher/programmers. Brilliant, I'm truly jealous! In the Netherlands (and I guess in other countries too) we have only a toehold on stable funding for the computational aspects of the humanities. Most programming, research and development in humanities computing is still ad hoc and, moreover, funded ad hoc. Imagine: lone researchers who somewhere along the way grasped the potential of information technology, cowboy-coding away. Do not even think of stable coding environments, shared languages, good programming practices and standards compliance! (Okay, I might be exaggerating somewhat for the sake of clarity :)
Of course, it is only by proving the added value of computational approaches that we will be able to gain firmer ground. But it would clearly help us a lot if building tools also benefited one's academic esteem.
Having said that, I totally disagree with Martin's last point. Open sourcing is not about abandoning your source code. It's about giving the humanities community insight into what you are doing, for the benefit of academic review and control. Research based on computation should always be reproducible and verifiable by peers. Peers should also be able to review the code to verify that any algorithm works exactly and correctly. That can only be accomplished by open sourcing your code bases.
y.s., Joris van Zundert