Institute of Classical Studies
Senate House, Malet Street, London WC1E 7HU
Friday July 27, 2018 at 16:30 in room 234
Patrick J. Burns (New York University)
Backoff Lemmatization for Ancient Greek with the Classical Language Toolkit
Automated lemmatization, or retrieval of dictionary headwords, is an active area of research in historical-language text analysis. In this talk, I describe the development of the Backoff Lemmatizer for Ancient Greek with the Classical Language Toolkit (CLTK), an open-source Python platform dedicated to developing natural language processing tools for historical languages. The Backoff Lemmatizer seeks to improve on existing tools by combining training-data-based and rules-based tagging as a lemmatization strategy. By way of conclusion, I discuss a current CLTK development strategy, namely the use of object-oriented architecture as an avenue to digital comparative philology.
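To illustrate the general idea of a backoff strategy that combines training-data-based and rule-based lemmatization, here is a minimal Python sketch. It is not the CLTK implementation or its API. All function names, the toy lookup table, and the single ending rule are hypothetical and chosen only for illustration. Each lemmatizer in the chain either returns a lemma or passes the token on to the next one.

```python
import re
import unicodedata
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# A lemmatizer returns a lemma, or None to hand the token to the next one.
Lemmatizer = Callable[[str], Optional[str]]


def nfc(text: str) -> str:
    """Normalize to NFC so visually identical Greek strings compare equal."""
    return unicodedata.normalize("NFC", text)


def dictionary_lemmatizer(table: Dict[str, str]) -> Lemmatizer:
    """Look up forms seen in (hypothetical) training data."""
    normalized = {nfc(form): nfc(lemma) for form, lemma in table.items()}
    return lambda token: normalized.get(nfc(token))


def rule_lemmatizer(rules: Sequence[Tuple[str, str]]) -> Lemmatizer:
    """Apply the first matching (pattern, replacement) rule, if any."""
    compiled = [(re.compile(nfc(p)), nfc(r)) for p, r in rules]

    def lemmatize(token: str) -> Optional[str]:
        token = nfc(token)
        for pattern, replacement in compiled:
            if pattern.search(token):
                return pattern.sub(replacement, token)
        return None

    return lemmatize


def identity_lemmatizer(token: str) -> Optional[str]:
    """Last resort: return the token itself as its lemma."""
    return nfc(token)


def backoff_lemmatize(
    tokens: Sequence[str], chain: Sequence[Lemmatizer]
) -> List[Tuple[str, Optional[str]]]:
    """Try each lemmatizer in order; keep the first non-None answer."""
    results = []
    for token in tokens:
        lemma = None
        for lemmatizer in chain:
            lemma = lemmatizer(token)
            if lemma is not None:
                break
        results.append((token, lemma))
    return results


# Toy data, assumed for illustration only.
TRAINED_FORMS = {"ἀνθρώπου": "ἄνθρωπος", "λόγον": "λόγος"}
ENDING_RULES = [("ου$", "ος")]  # crude second-declension genitive rule

CHAIN = [
    dictionary_lemmatizer(TRAINED_FORMS),
    rule_lemmatizer(ENDING_RULES),
    identity_lemmatizer,
]

if __name__ == "__main__":
    for token, lemma in backoff_lemmatize(["λόγον", "λόγου", "καί"], CHAIN):
        print(token, "->", lemma)
```

In this sketch the order of the chain encodes the trust placed in each method: attested forms are preferred, rules cover unseen forms, and the identity step guarantees that every token receives some answer.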
Livestream: https://youtu.be/o0neelgQlw8
Full abstract: http://digitalclassicist.org/wip/wip2018.html
ALL WELCOME
==
Dr Gabriel BODARD
Reader in Digital Classics

Institute of Classical Studies
University of London
Senate House
Malet Street
London WC1E 7HU
E: Gabriel.bodard@sas.ac.uk T: +44 (0)20 78628752