[dm-l] Character reference conversions: help?

Peter Baker psb6m at virginia.edu
Tue Oct 5 16:53:08 MDT 2004


If you process the file with an XSLT script that looks like this:

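<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- The doctype-public value is cut off in the archived message
       ("-//W3C//DTD XHTML 1.0 ..."); add the identifiers for your DTD. -->
  <xsl:output method="xml" encoding="UTF-8"/>
  <xsl:template match="/">
    <xsl:copy-of select="."/>
  </xsl:template>
</xsl:stylesheet>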

then your character references should all get converted to UTF-8 automagically.
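
If you don't have an XSLT processor handy, the same parse-and-reserialize idea can be sketched in Python. This is not the stylesheet above, just a minimal illustration using only the standard library. The sample fragment and the variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: a small fragment containing a numeric character
# reference like the one in the original question.
source = "<p>Sample: &#x1090;</p>"

# Parsing resolves the reference to the actual character U+1090.
root = ET.fromstring(source)

# Serializing with encoding="utf-8" writes that character out as UTF-8
# bytes instead of as a character reference.
utf8_bytes = ET.tostring(root, encoding="utf-8")

print(utf8_bytes)
```

In UTF-8, U+1090 is the three bytes E1 82 90, so the reference should no longer appear in the output. A real XHTML file with a DOCTYPE and named entities may need a fuller parser than this sketch assumes.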


Daniel O'Donnell wrote:

> Digital Medievalist Journal (Inaugural Issue Fall 2004). Call for 
> papers: http://www.digitalmedievalist.org/cfp.htm
> ----------------
> Hello all,
>     I need some advice on converting Unicode character references. 
> Currently, I am encoding character references in what I believe is UCS-4 
> format (Universal Character Set). This means they look like this in my 
> source files:
> &#x1090;
> I want to import XHTML documents into Open Office, which seems to need 
> UTF-8 encoding (I don't know what UTF stands for). Does anybody know 
> of a filter that might do the conversions for me? Or have advice on 
> using Open Office (Windows version) with UCS-4 encoding?
> -dan
