[dm-l] Character reference conversions: help?

Peter Baker psb6m at virginia.edu
Tue Oct 5 16:53:08 MDT 2004


Dan,

If you process the file with an XSLT script that looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version = '1.0'
     xmlns="http://www.w3.org/1999/xhtml"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 
<xsl:output method="xml" doctype-public="-//W3C//DTD XHTML 1.0 
Transitional//EN"            
doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"/>
 
<xsl:template match="/">
  <xsl:copy-of select="."/>
</xsl:template>
 
 
</xsl:stylesheet>

then your entities should all get converted to UTF-8 automagically.

Peter

Daniel O'Donnell wrote:

> Digital Medievalist Journal (Inaugural Issue Fall 2004). Call for 
> papers: http://www.digitalmedievalist.org/cfp.htm
> ----------------
> Hello all,
>     I need some advice on converting Unicode character references. 
> Currently, am encoding character references in what I believe is UCS-4 
> format (Universal Character Set). This means they look like this in my 
> source files:
>
> &#x1090;
>
> I want to import xhtml documents into Open Office, which seems to need 
> UTF-8 encoding (I don't know what UTF stands for). Does anybody know 
> of a filter that might do the conversions for me? Or have advice on 
> using open office (Windows version) with UCS-4 encoding?
>
> -dan





More information about the dm-l mailing list