[dm-l] Character reference conversions: help?
Peter Baker
psb6m at virginia.edu
Tue Oct 5 16:53:08 MDT 2004
Dan,
If you process the file with an XSLT script that looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns="http://www.w3.org/1999/xhtml"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Write the result as UTF-8, so parsed character references come out as literal characters -->
  <xsl:output method="xml" encoding="UTF-8"
      doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN"
      doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"/>
  <!-- Copy the whole input document through unchanged -->
  <xsl:template match="/">
    <xsl:copy-of select="."/>
  </xsl:template>
</xsl:stylesheet>
then your character references should all get converted to UTF-8 automagically.
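
If no XSLT processor is handy, roughly the same conversion can be sketched as a small Python filter. This is a different technique from the stylesheet: a plain regular-expression pass, not an XML parse. It assumes the source file is ASCII or UTF-8 and that the references are numeric (decimal like &#4240; or hexadecimal like &#x1090;, both chosen here only as illustrations). The names expand_char_refs and convert_file are hypothetical, and the sketch does not touch named entities or the encoding declaration at the top of the file.

```python
import re

# Matches decimal (&#4240;) and hexadecimal (&#x1090;) numeric character references.
_CHAR_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")

# Code points left escaped so the markup stays well-formed: & < > " '
_KEEP_ESCAPED = {0x26, 0x3C, 0x3E, 0x22, 0x27}


def expand_char_refs(text: str) -> str:
    """Replace numeric character references with the characters they name."""

    def _replace(match: "re.Match[str]") -> str:
        hex_digits, dec_digits = match.groups()
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        # Leave markup-significant characters, surrogates, and
        # out-of-range values exactly as they were written.
        if (
            code_point in _KEEP_ESCAPED
            or 0xD800 <= code_point <= 0xDFFF
            or code_point > 0x10FFFF
        ):
            return match.group(0)
        return chr(code_point)

    return _CHAR_REF.sub(_replace, text)


def convert_file(src_path: str, dst_path: str) -> None:
    """Read src_path (assumed ASCII/UTF-8) and write dst_path as UTF-8."""
    with open(src_path, encoding="utf-8") as src:
        text = src.read()
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(expand_char_refs(text))
```

Calling convert_file("in.html", "out.html") (file names made up) would write a UTF-8 copy with the references expanded. The stylesheet route is the safer of the two, since a real XML parser also copes with CDATA sections and comments, which this regex pass does not distinguish.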
Peter
Daniel O'Donnell wrote:
> Digital Medievalist Journal (Inaugural Issue Fall 2004). Call for
> papers: http://www.digitalmedievalist.org/cfp.htm
> ----------------
> Hello all,
> I need some advice on converting Unicode character references.
> Currently, I am encoding character references in what I believe is UCS-4
> format (Universal Character Set). This means they look like this in my
> source files:
>
> ႐
>
> I want to import xhtml documents into Open Office, which seems to need
> UTF-8 encoding (I don't know what UTF stands for). Does anybody know
> of a filter that might do the conversions for me? Or have advice on
> using open office (Windows version) with UCS-4 encoding?
>
> -dan