Dear all, Apologies for cross-posting.
Please find below the details of next week's CeRch seminar:
Thinking Big: escaping the Small Data fallacy in Historical Linguistics (Gard Jenset and Barbara McGillivray, University of Oxford/University of Bergen/Oxford University Press)
Tuesday, October 29th, 2013 from 6:15 PM to 7:30 PM (GMT) Anatomy Theatre and Museum, King's College London: http://www.kcl.ac.uk/cultural/atm/location.aspx
Attendance is free and open to all, but registration is requested: https://www.eventbrite.com/event/8348441413
The seminar will be followed by wine and nibbles. All the best, Valentina Asciutti
Abstract: Historical Linguistics studies the evolution of historical languages and earlier stages of living languages. By necessity, historical linguistics has traditionally been based on the analytical study of exemplars from written collections of texts. Given the technological constraints and the aims of historical comparative philology of the 19th and early 20th centuries, the reliance on qualitative assessment of a few exemplars was justified. Towards the end of the last century, formalized collections of texts known as corpora bloomed with the advent of computer technology. This made it feasible to automatically create very large corpora (> 100 million words), annotate them with various linguistic information, and efficiently search and systematically retrieve information from them. However, technological advances can only change a field if they find their place in an appropriate methodological framework. The qualitative methods of comparative philology (manually searching for exemplars) underutilize the information available in today's historical corpora, and contemporary historical linguistics is still largely based on the traditional methodology. Few things are more commonly taken for granted in historical linguistics than the assumption that the researcher should eyeball every piece of data relevant to her analysis. We disagree with this position, which we will henceforth refer to as the Small Data Fallacy. Instead, we believe that methods inspired by Big Data can and should influence Historical Linguistics, and that such a move would entail a qualitative leap forwards in Historical Linguistics research methods. We will discuss some of the benefits, challenges, and limitations of applying the Big Data framework to historical linguistics. We will also touch upon the impact this would have on the fundamental aims of Historical Linguistics in the 21st century.
Bios: Gard Jenset is a visiting scholar in the Faculty of Linguistics, Philology and Phonetics, University of Oxford, with research interests in historical corpus linguistics, corpora in ELT, statistics and quantitative research methods in linguistics, corpus methods for semantics and cognitive linguistics. Barbara McGillivray is a computational linguist, and works as a language engineer in the Dictionary Department of Oxford University Press. She holds a PhD in computational linguistics from the University of Pisa. Her research interests include: Language Technology for Cultural Heritage, Latin computational linguistics, quantitative historical linguistics, and computational lexicography.
NB This seminar will not be live-streamed.