Dan,
If you process the file with an XSLT script that looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns="http://www.w3.org/1999/xhtml" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN" doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"/>
  <xsl:template match="/">
    <xsl:copy-of select="."/>
  </xsl:template>
</xsl:stylesheet>
then your entities should all get converted to UTF-8 automagically.
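If you would rather not run an XSLT processor, a small script can do roughly the same job. What follows is only a minimal sketch in Python, not a tested filter: the names expand_ncrs and convert_file are made up for illustration, and it assumes the source file is already readable as UTF-8 (or plain ASCII) and that only numeric references such as &#4240; need expanding. References to markup characters (<, >, &, and the quote marks) are deliberately left alone so the output stays well-formed.

```python
import re

# Numeric character references: decimal (&#4240;) or hexadecimal (&#x1090;).
NCR = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")

# Characters that must stay escaped so the markup remains well-formed.
KEEP_ESCAPED = {"<", ">", "&", '"', "'"}


def expand_ncrs(text):
    """Replace numeric character references with the characters they name."""
    def replace(match):
        hex_digits, dec_digits = match.groups()
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        # Leave out-of-range values and surrogates untouched: they cannot
        # be written out as UTF-8.
        if code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
            return match.group(0)
        char = chr(code_point)
        return match.group(0) if char in KEEP_ESCAPED else char
    return NCR.sub(replace, text)


def convert_file(src_path, dst_path, src_encoding="utf-8"):
    """Read src_path, expand references, and write dst_path as UTF-8."""
    with open(src_path, encoding=src_encoding) as src:
        text = src.read()
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(expand_ncrs(text))
```

Calling convert_file("in.html", "out.html") (both file names hypothetical) would write a UTF-8 copy. Unlike the stylesheet, this sketch does not touch the XML declaration or any charset declared in the document, so check that those say UTF-8 as well.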
Peter
Daniel O'Donnell wrote:
Digital Medievalist Journal (Inaugural Issue Fall 2004). Call for papers: http://www.digitalmedievalist.org/cfp.htm
Hello all, I need some advice on converting Unicode character references. Currently, I am encoding character references in what I believe is UCS-4 format (Universal Character Set). This means they look like this in my source files:
&#x1090;
I want to import XHTML documents into OpenOffice, which seems to need UTF-8 encoding (I don't know what UTF stands for). Does anybody know of a filter that might do the conversions for me? Or does anyone have advice on using OpenOffice (Windows version) with UCS-4 encoding?
-dan