[dm-l] Character reference conversions: help?

Roberto Rosselli Del Turco rosselli at ling.unipi.it
Tue Oct 5 22:29:10 MDT 2004

On Tue, 05-10-2004 at 12:53 -0600, Daniel O'Donnell wrote:
> Hello all,
> 	I need some advice on converting Unicode character references. 
> Currently, I am encoding character references in what I believe is UCS-4 
> format (Universal Character Set). This means they look like this in my 
> source files:
> &#4240;
> I want to import xhtml documents into Open Office, which seems to need 
> UTF-8 encoding (I don't know what UTF stands for). Does anybody know of 
> a filter that might do the conversions for me? Or have advice on using 
> open office (Windows version) with UCS-4 encoding?

Can't you just copy and paste your documents from
Mozilla/Firefox/whatever into OOo? I know, this looks too simple to be
true ... but I just tried[1] and it works!


[1] I took an XHTML file, inserted some random decimal character
references, loaded it in Epiphany (based on Mozilla's engine), copied the
text and pasted it into a Unicode text editor: I ended up with Unicode
characters.
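
If copy and paste is not practical for a large number of files, a short
script is another option. What follows is only a minimal sketch in Python,
not a tested filter: the function names and file paths are made up for
illustration. It assumes the source files contain numeric character
references such as &#254; or &#xFE;, and it leaves references to markup
characters (<, >, &, quotes) untouched so that the XHTML stays well-formed.
A numeric reference names a Unicode code point, so the result can simply be
written out as UTF-8.

```python
import re

# Matches decimal (&#254;) and hexadecimal (&#xFE;) character references.
_NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")

# Characters that must stay escaped, or the markup would break.
_KEEP_ESCAPED = set("<>&\"'")


def _replace(match):
    hex_digits, dec_digits = match.groups()
    code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
    # Surrogate code points cannot be written as UTF-8: leave them alone.
    if 0xD800 <= code_point <= 0xDFFF:
        return match.group(0)
    try:
        char = chr(code_point)
    except (ValueError, OverflowError):
        # Not a valid Unicode code point: keep the reference as written.
        return match.group(0)
    if char in _KEEP_ESCAPED:
        return match.group(0)
    return char


def decode_numeric_refs(text):
    """Replace numeric character references with the characters they name."""
    return _NUMERIC_REF.sub(_replace, text)


def convert_file(src_path, dst_path, src_encoding="ascii"):
    """Read a file, decode its references, and write the result as UTF-8.

    Both paths are hypothetical; src_encoding is an assumption about how
    the source files were saved.
    """
    with open(src_path, encoding=src_encoding) as src:
        content = src.read()
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(decode_numeric_refs(content))
```

For example, decode_numeric_refs("&#222;&#230;t") gives the three
characters "Þæt", while "&#38;" is left as it is. If the XHTML file
carries an encoding declaration, that would need to say UTF-8 as well.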

Roberto Rosselli Del Turco      roberto.rossellidelturco at unito.it
Dipartimento di Scienze         rosselli at ling.unipi.it
del Linguaggio                  Then spoke the thunder  DA
Universita' di Torino           Datta: what have we given?  (TSE)
  Hige sceal the heardra,     heorte the cenre,
  mod sceal the mare,       the ure maegen litlath.  (Maldon 312-3)
