[dm-l] Character reference conversions: help?
mmcgilli at ucalgary.ca
Tue Oct 5 14:46:45 MDT 2004
That's what occurred to me immediately. It should work with any
Unicode-based browser. UTF-8 is the more compact 8-bit Unicode
Transformation Format: it encodes each Unicode character in one to
four octets, using a single octet, identical to its ASCII encoding, for
the most common Latin characters, and so saves space.
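
To make the size difference concrete, here is a small Python sketch that counts octets per character in UTF-8 and in the fixed-width 32-bit form (UTF-32, which corresponds to UCS-4). The sample characters and the helper name octet_counts are my own choices for illustration, not anything from the original question.

```python
# Sample characters chosen for illustration (an assumption, not from the thread):
# ASCII "a", thorn, the euro sign, and a Gothic letter outside the BMP.
samples = ["a", "\u00fe", "\u20ac", "\U00010330"]

def octet_counts(ch):
    """Return (UTF-8 length, UTF-32 length) in octets for one character."""
    # "utf-32-be" is used so that no byte-order mark is counted.
    return len(ch.encode("utf-8")), len(ch.encode("utf-32-be"))

for ch in samples:
    utf8_len, utf32_len = octet_counts(ch)
    print(f"U+{ord(ch):04X}: {utf8_len} octet(s) in UTF-8, {utf32_len} in UTF-32")
```

Every character costs four octets in the 32-bit form, while UTF-8 spends one octet on ASCII and more only where needed.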
Roberto Rosselli Del Turco wrote:
>Digital Medievalist Journal (Inaugural Issue Fall 2004). Call for papers: http://www.digitalmedievalist.org/cfp.htm
>On Tue, 05-10-2004 at 12:53 -0600, Daniel O'Donnell wrote:
>> I need some advice on converting Unicode character references.
>>Currently, I am encoding character references in what I believe is UCS-4
>>format (Universal Character Set). This means they look like this in my
>>I want to import xhtml documents into Open Office, which seems to need
>>UTF-8 encoding (I don't know what UTF stands for). Does anybody know of
>>a filter that might do the conversions for me? Or have advice on using
>>open office (Windows version) with UCS-4 encoding?
>Can't you just copy and paste your documents from
>Mozilla/Firefox/whatever into OOo? I know, this looks too simple to be
>true ... but I just tried and it works!
> I took an XHTML file, inserted some random decimal character
>references, loaded it in Epiphany (based on Mozilla's engine), copied the
>text and pasted it into a Unicode text editor: I ended up with Unicode
>characters.
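
If copying and pasting is not practical for many files, the same conversion can be scripted. What follows is only a rough sketch in Python, assuming the references are ordinary numeric character references (decimal such as &#254; or hexadecimal such as &#xFE;). These name a Unicode code point regardless of the file's own encoding, so the job is to replace each one with the character itself and then save the file as UTF-8. The function name expand_numeric_refs is hypothetical, and references to markup characters (<, >, &, quotes) are deliberately left alone so the XHTML stays well-formed.

```python
import re

# Matches decimal (&#254;) and hexadecimal (&#xFE;) numeric character references.
NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")

# Characters that must stay escaped for the markup to remain well-formed.
MARKUP_CHARS = {"<", ">", "&", '"', "'"}

def expand_numeric_refs(text):
    """Replace numeric character references with the characters they name."""
    def replace(match):
        hex_digits, dec_digits = match.groups()
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        # Leave out-of-range values and surrogates untouched.
        if code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
            return match.group(0)
        ch = chr(code_point)
        return match.group(0) if ch in MARKUP_CHARS else ch
    return NUMERIC_REF.sub(replace, text)

sample = "<p>&#254;&#xE6;t w&#230;s &#38; god cyning</p>"
print(expand_numeric_refs(sample))
```

To produce a UTF-8 file, the result would be written with open(path, "w", encoding="utf-8"), and any encoding declaration in the XHTML would need to say UTF-8 as well. Named entities such as &amp; are not touched by this sketch.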