Hi,

I'm parsing documents with the Xerces DOMParser, modifying some nodes, and
then want to write the documents back to disk. At the moment, there doesn't
seem to be a working solution for this problem. Leaving my DOM processing
aside, the simple question is whether there is a standard way to parse a
document into memory via DOMParser and stream it out again so that input and
output are identical.

1. Serializing with Xerces 1.0.2's XMLSerializer doesn't work
I serialize the DOM document like this:

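import java.io.PrintWriter;
import org.apache.xerces.parsers.DOMParser;
import org.apache.xml.serialize.Method;
import org.apache.xml.serialize.OutputFormat;
import org.apache.xml.serialize.Serializer;
import org.apache.xml.serialize.SerializerFactory;
import org.w3c.dom.Document;

// Parse the input into a DOM tree
DOMParser parser = new DOMParser();
parser.parse(input);
Document d = parser.getDocument();
// Write the tree back out as XML
PrintWriter writer = new PrintWriter(.....);
OutputFormat format = new OutputFormat();
format.setMethod(Method.XML);
format.setOmitXMLDeclaration(false);
format.setPreserveSpace(true);
format.setVersion("1.0");
Serializer serializer =
    SerializerFactory.getSerializerFactory(Method.XML).makeSerializer(writer, format);
serializer.asDOMSerializer().serialize(d);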

After serializing, the file does not contain a space between the public and
the system identifier. I don't know if this is the only problem, but the
resulting file doesn't parse and is not identical to the input.
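
For illustration, here is a minimal sketch of how the two identifiers would
need to be written out, assuming they belong to the DOCTYPE declaration (the
post does not say so explicitly). The class DoctypeWriter and the method
formatDoctype are hypothetical names, not part of Xerces; the sketch only uses
the standard org.w3c.dom.DocumentType interface, ignores any internal subset,
and assumes the identifiers contain no double quotes.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.DocumentType;

// Hypothetical helper, not part of Xerces: formats a DOCTYPE declaration
// with the whitespace that XML requires between the two identifiers.
public class DoctypeWriter {

    static String formatDoctype(DocumentType dt) {
        StringBuilder sb = new StringBuilder("<!DOCTYPE ").append(dt.getName());
        String pub = dt.getPublicId();
        String sys = dt.getSystemId();
        if (pub != null) {
            sb.append(" PUBLIC \"").append(pub).append('"');
            if (sys != null) {
                // The space before the system identifier is the one
                // reported missing in the serialized file.
                sb.append(" \"").append(sys).append('"');
            }
        } else if (sys != null) {
            sb.append(" SYSTEM \"").append(sys).append('"');
        }
        // Assumption: the internal subset, if any, is ignored here.
        return sb.append('>').toString();
    }

    public static void main(String[] args) throws Exception {
        DOMImplementation impl = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().getDOMImplementation();
        // Example identifiers only; any public/system pair would do.
        DocumentType dt = impl.createDocumentType(
                "html",
                "-//W3C//DTD XHTML 1.0 Strict//EN",
                "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd");
        System.out.println(formatDoctype(dt));
    }
}
```

A serializer that wrote the declaration this way would at least produce a
file that parses again; whether the rest of the output matches the input is a
separate question.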

2. When using Xalan 0.19.5, you run into major entity problems
My file contains entity references to the standard XHTML entity sets (e.g.
&auml;), which are declared in a separate file. I don't want to convert these
references to Unicode but want to leave them as they are. I tried several
stylesheets with several encodings, but wasn't able to produce proper
output.
Here is a sample XSLT stylesheet:

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- I also tried several other encodings -->
  <xsl:output method="xml" encoding="UTF-8"/>
  <xsl:template match="*|@*|comment()|processing-instruction()|text()">
    <xsl:copy>
      <xsl:apply-templates select="*|@*|comment()|processing-instruction()|text()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>

As you can see, I just do a straight copy-over.
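
As background, the XSLT data model has no node type for entity references, so
a copy-over stylesheet never sees them. At the DOM level the picture is
different: a parser can be asked to keep EntityReference nodes in the tree.
The sketch below assumes a JAXP-capable parser and uses the standard
DocumentBuilderFactory.setExpandEntityReferences switch rather than the
Xerces 1.0.2 DOMParser API from the first snippet. The class KeepEntityRefs,
its methods, and the inline sample document are hypothetical, and a
serializer would still have to write such nodes back out as references.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch: parse while keeping entity references as
// EntityReference nodes instead of replacing them with their text.
public class KeepEntityRefs {

    static Document parseKeepingEntityRefs(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Ask the parser not to expand entity references in the DOM tree.
        factory.setExpandEntityReferences(false);
        return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Returns true if parent has a direct EntityReference child with this name.
    static boolean hasEntityRef(Node parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ENTITY_REFERENCE_NODE
                    && name.equals(n.getNodeName())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the entity is declared inline here to keep the sample
        // self-contained; the post declares the entities in a separate file.
        String xml = "<!DOCTYPE p [<!ENTITY auml \"&#228;\">]><p>M&auml;rz</p>";
        Document d = parseKeepingEntityRefs(xml);
        System.out.println(hasEntityRef(d.getDocumentElement(), "auml"));
    }
}
```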

Has anybody run into the same problem before or does anybody have an idea
how to solve this without writing a specialized DOM-Serializer?

Armin