"Stefan Berglund" <[EMAIL PROTECTED]> writes:
> I've tried to create entities in the DTD like: <!ENTITY aring "å"> and
> then it works better - getNodeValue on "å" then gets me the right
> character.. But using the DOMPrint code on a document as that gives me
> "&ring;"...
It's kind of irritating to have to do that. Given that 8859-1 is a
standard encoding, the parsers should recognize å if your
encoding is set properly. I had a lot of trouble with 'è',
'é', etc. when converting Roget's thesaurus to XML. I finally
had to declare them in the DTD, as you did.
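As a rough illustration of the DTD workaround, here is a minimal sketch using Python's standard-library parser rather than Xerces. The document below is hypothetical (only the entity names and the "nevée" entry come from this thread), and it assumes the entities are declared in an internal DTD subset by numeric character reference.

```python
import xml.etree.ElementTree as ET

# Hypothetical test document. The internal DTD subset declares the
# entities via numeric character references (229 = a-ring, 233 = e-acute),
# so the parser can expand &aring; and &eacute; wherever they appear.
DOC = """<!DOCTYPE thesaurus [
  <!ENTITY aring "&#229;">
  <!ENTITY eacute "&#233;">
]>
<thesaurus>
  <synonym name="nev&eacute;e"/>
  <word>&aring;</word>
</thesaurus>"""


def parse_values(doc):
    """Return the expanded attribute value and element text."""
    root = ET.fromstring(doc)
    return root.find("synonym").get("name"), root.find("word").text


# Both values come back as ordinary Unicode strings with the
# entities already expanded.
name, word = parse_values(DOC)
```

With the declarations in place, `name` and `word` hold the accented characters themselves. Without them, the parser rejects the document because the entities are undefined.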
Strange that DOMPrint screws it up. I tested a small piece of my
roget.xml against SAXPrint, SAX2Print, and DOMPrint. DOMPrint got it
wrong, the other two got it correct.
$ SAXPrint -v=auto /tmp/tst.xml
<?xml version="1.0" encoding="LATIN1"?>
<thesaurus>
<section title="THESAURUS OF ENGLISH WORDS AND PHRASES">
<major id="major_383" name="Cold">
<minor part_of_speech="NOUN">
<synonym-group>
<related-synonym>
<synonym name="nevée"></synonym>
<synonym name="serac" comments="obs3"></synonym>
</related-synonym>
</synonym-group>
</minor>
</major>
</section>
</thesaurus>
$ DOMPrint -v=auto /tmp/tst.xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE thesaurus SYSTEM "roget.dtd">
<thesaurus>
<section title="THESAURUS OF ENGLISH WORDS AND PHRASES">
<major id="major_383" name="Cold">
<minor part_of_speech="NOUN">
<synonym-group>
<related-synonym>
<synonym name="nevée"/>
<synonym name="serac" comments="obs3"/>
</related-synonym>
</synonym-group>
</minor>
</major>
</section>
</thesaurus>
$ SAX2Print -v=auto !$
SAX2Print -v=auto /tmp/tst.xml
<?xml version="1.0" encoding="LATIN1"?>
<thesaurus>
<section title="THESAURUS OF ENGLISH WORDS AND PHRASES">
<major id="major_383" name="Cold">
<minor part_of_speech="NOUN">
<synonym-group>
<related-synonym>
<synonym name="nevée"></synonym>
<synonym name="serac" comments="obs3"></synonym>
</related-synonym>
</synonym-group>
</minor>
</major>
</section>
</thesaurus>
Looks like DOMPrint is buggered.
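
One possible reading of the outputs above, offered as an assumption rather than a confirmed diagnosis: DOMPrint declares encoding="UTF-8" while the SAX samples declare LATIN1, so the same character is written as different bytes. The Python sketch below only shows how those bytes would look on a terminal that decodes Latin-1. It does not call Xerces, and the Latin-1 terminal is an assumption.

```python
# Assumption: the terminal interprets every byte as Latin-1 (ISO-8859-1).
WORD = "nev\u00e9e"  # "nevée", the entry from the test file

latin1_bytes = WORD.encode("latin-1")  # e-acute becomes one byte, 0xE9
utf8_bytes = WORD.encode("utf-8")      # e-acute becomes two bytes, 0xC3 0xA9


def as_seen_on_latin1_terminal(raw):
    """Decode raw output bytes the way a Latin-1 terminal would."""
    return raw.decode("latin-1")


# Latin-1 output round-trips to the original word.
seen_latin1 = as_seen_on_latin1_terminal(latin1_bytes)

# UTF-8 output shows two characters where the e-acute should be.
seen_utf8 = as_seen_on_latin1_terminal(utf8_bytes)
```

If that is what is going on, the UTF-8 output is still well-formed XML and only looks wrong on a terminal expecting Latin-1.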
jas.