Re: [dm-l] The time has come to make some <choice>s

29 Jun 2004


      On Tue, 29 Jun 2004, Martin Holmes wrote:
...
Hi there,
At 02:41 PM 29/06/2004, you wrote:
...
Well, there could be a third area of discussion: What is the status of an
electronic text, and what does the borderline between "text" and "markup"
really mean (for this case)? The problem here could be that, if you
believe in the ontological distinction between "text" and "markup", you seem
to double the portion of "text" in question. But this is a question of no
practical relevance and only leads to philosophical sophistry, which should
perhaps be left to an even more specialised debate (and my
forthcoming PhD-thesis ;-)) ...
This is a fascinating topic. I'd argue that markup and its content are just
data; "texts" are generated from markup using specific transformations for
specific audiences or purposes. Given this:
<corr>Martin</corr><sic>Marnit</sic>
one "text" might show "Martin" with a mouseover popup indicating the
misspelling in the original source, and another might show "Marnit" with a
mouseover explaining the assumed correct form. The differences embody
editorial approaches and purposes, and it's these that give birth to texts.
The markup merely strives after completeness and transparency.
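As a rough illustration of the point above, here is a minimal Python sketch that generates both "texts" from the same data. The wrapping <p> element, the "Signed, " context, the function name render() and the wording of the popup labels are all assumptions made for the example, and the HTML title attribute merely stands in for a mouseover popup. It handles a single corr/sic pair only.

```python
import xml.etree.ElementTree as ET
from html import escape

# The fragment from the message, wrapped in a hypothetical <p> so it parses.
SOURCE = "<p>Signed, <corr>Martin</corr><sic>Marnit</sic>.</p>"

def render(xml_text: str, show: str) -> str:
    """Show one reading ('corr' or 'sic'); put the other in a title attribute."""
    hide = "sic" if show == "corr" else "corr"
    # Hypothetical label wording, chosen only for this example.
    label = "original reads" if show == "corr" else "probably intended"
    root = ET.fromstring(xml_text)
    hidden = root.find(hide)  # simplification: assumes exactly one pair
    parts = [escape(root.text or "")]
    for child in root:
        if child.tag == show:
            title = escape(f"{label}: {hidden.text}", quote=True)
            parts.append(f'<span title="{title}">{escape(child.text or "")}</span>')
        # The hidden reading is skipped, but text following either element is kept.
        parts.append(escape(child.tail or ""))
    return "".join(parts)

print(render(SOURCE, "corr"))
print(render(SOURCE, "sic"))
```

The data is identical in both calls; only the argument, standing in for the editorial purpose, changes which "text" comes out.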
Many feel that once you strip away the markup,
you should be left with a bare version of 'the text'.  They argue that 
all such alternatives should be stored in attributes (even with the
aforementioned problems) in order to separate the interpretation from 
the text.  But the problems with this are legion.  Aside from the
obvious need for markup inside these interpretative readings, the 
choice of markup itself is, of course, an interpretation of structure 
that they are imposing on the text. (That way lies the overlapping
hierarchy discussion again...)  Moreover, the same process of
stripping away the markup to reveal 'the text' is still possible in a
simple way; it is just that the act of 'stripping' in this case doesn't
mean 'remove the tags' but instead 'process them so that "the text"
is the result'.  Whether 'the text' has corrections made,
abbreviations expanded, spelling regularised, or any of the other
possible applications of this <choice> type of encoding is a
decision that is made at the point of processing.  But I'm sure
you know all this and I'm preaching to the converted. ;-)
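To make the difference between the two kinds of 'stripping' concrete, here is a small Python sketch. It is only an illustration under assumptions: the <p> wrapper, the sample sentence, the nesting of <sic> and <corr> directly inside <choice>, and the function names strip_tags() and resolve() are all invented for the example and are not taken from any particular schema or toolkit.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample: a <choice> holding the original and corrected readings.
SOURCE = "<p>Yours, <choice><sic>Marnit</sic><corr>Martin</corr></choice>.</p>"

def strip_tags(xml_text: str) -> str:
    """Naive stripping: delete the tags and keep every piece of content."""
    return re.sub(r"<[^>]+>", "", xml_text)

def resolve(xml_text: str, prefer: str) -> str:
    """Stripping as processing: keep only the preferred child of each <choice>."""
    def text_of(elem: ET.Element) -> str:
        parts = [elem.text or ""]
        for child in elem:
            if child.tag == "choice":
                chosen = child.find(prefer)
                if chosen is not None:
                    parts.append(text_of(chosen))
            else:
                parts.append(text_of(child))
            parts.append(child.tail or "")
        return "".join(parts)
    return text_of(ET.fromstring(xml_text))

print(strip_tags(SOURCE))        # both readings run together
print(resolve(SOURCE, "corr"))   # the corrected text
print(resolve(SOURCE, "sic"))    # the text as it stands in the source
```

Removing the tags doubles the word, whereas processing them yields one 'text' or the other, and the value of prefer is the decision made at the point of processing.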
-James
---
Dr James Cummings, Oxford Text Archive, University of Oxford
James dot Cummings at ota dot ahds dot ac dot uk
