Hello,
I had the same problem (writing parts of my document back to a file) some time
ago using xml4c.
I solved it by converting the DOMString to a plain char*.
This function served my purposes for sources encoded in ISO Latin-1 and UTF-8.
I'm not sure whether it helps anybody else, but feel free to use it.
--------------------------------------------------------
// Converts a DOMString to a NUL-terminated char string. German umlauts and
// the sharp s are transliterated to two ASCII characters; every other
// character is cut down to its low byte, so the input is assumed to stay
// within the ISO Latin-1 range (0-255).
// 'out' must have room for at least 2 * in.length() + 1 chars.
void DOMString2CharP(DOMString in, char* out)
{
    int l = in.length(), j = 0;
    XMLCh c;
    for (int i = 0; i < l; i++)
    {
        c = in.charAt(i);
        switch (c)
        {
            case 196: out[j++] = 'A'; out[j++] = 'e'; // Ä = 0xC4 = 196
                break;
            case 228: out[j++] = 'a'; out[j++] = 'e'; // ä = 0xE4 = 228
                break;
            case 214: out[j++] = 'O'; out[j++] = 'e'; // Ö = 0xD6 = 214
                break;
            case 246: out[j++] = 'o'; out[j++] = 'e'; // ö = 0xF6 = 246
                break;
            case 220: out[j++] = 'U'; out[j++] = 'e'; // Ü = 0xDC = 220
                break;
            case 252: out[j++] = 'u'; out[j++] = 'e'; // ü = 0xFC = 252
                break;
            case 223: out[j++] = 's'; out[j++] = 's'; // ß = 0xDF = 223
                break;
            default: out[j++] = (char) c; // just use the low byte
        }
    }
    out[j] = '\0';
}
--------------------------------------------------------
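For anyone who wants to try the idea without xml4c at hand, here is a minimal
sketch of the same transliteration in standard C++. It is not the function
above: std::u16string stands in for DOMString, the name transliterateLatin1 is
made up for this example, and returning a std::string avoids having to size
the output buffer by hand. Characters above 0xFF are replaced by '?' here,
which is my assumption rather than what the original function does.

```cpp
#include <string>

// Hypothetical stand-in for DOMString2CharP: std::u16string replaces
// DOMString so that the sketch compiles without xml4c.
std::string transliterateLatin1(const std::u16string& in)
{
    std::string out;
    out.reserve(in.size() * 2);  // worst case: every character becomes two

    for (char16_t c : in)
    {
        switch (c)
        {
            case 0x00C4: out += "Ae"; break;  // Ä
            case 0x00E4: out += "ae"; break;  // ä
            case 0x00D6: out += "Oe"; break;  // Ö
            case 0x00F6: out += "oe"; break;  // ö
            case 0x00DC: out += "Ue"; break;  // Ü
            case 0x00FC: out += "ue"; break;  // ü
            case 0x00DF: out += "ss"; break;  // ß
            default:
                // Keep Latin-1 code points as a single byte. Anything above
                // 0xFF does not fit into one byte, so substitute '?'
                // (an assumption of this sketch).
                out += (c <= 0xFF) ? static_cast<char>(c) : '?';
        }
    }
    return out;
}
```

Calling transliterateLatin1(u"Gr\u00FC\u00DFe") returns "Gruesse". Note that
this loses information, so it only helps when an ASCII approximation of the
text is acceptable.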
Armin Pfarr wrote:
> Hi,
>
> I'm parsing documents with the Xerces DOMParser, modifying some nodes, and then
> want to write these documents back to disk. At the moment, there doesn't seem
> to be a working solution for this problem. If you leave out my
> DOM-processing, the simple question is, whether there is a standard way to
> parse a Document into memory via DOMParser and stream it out again so that
> both input and output are identical.
>
> 1. Serializing with Xerces 1.0.2's XMLSerializer doesn't work
> When trying to serialize the DOM-Document with
>
> DOMParser parser = new DOMParser();
> parser.parse(input);
> Document document = parser.getDocument();
> PrintWriter writer = new PrintWriter(.....);
> OutputFormat format = new OutputFormat();
> format.setMethod(Method.XML);
> format.setOmitXMLDeclaration(false);
> format.setPreserveSpace(true);
> format.setVersion("1.0");
> Serializer serializer =
> SerializerFactory.getSerializerFactory(Method.XML).makeSerializer(writer,
> format);
> serializer.asDOMSerializer().serialize(document);
>
> After serializing, the file does not contain a space between the public and
> the system identifier. I don't know if this is the only problem, but the
> resulting file doesn't parse and is not identical to the input.
>
> 2. When using Xalan 0.19.5, you run into major entity-problems
> My file contains entity references to the standard XHTML entity sets (e.g.
> &auml;) which are declared in a separate file. I don't want to convert these
> references to Unicode but want to leave them as they are. I tried several
> stylesheets with several encodings, but wasn't able to produce proper
> output.
> Here is a sample XSLT-stylesheet
>
> <xsl:stylesheet version="1.0"
> xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>
> <xsl:output method="xml" encoding="UTF-8"/> <!-- I also tried several other
> encodings -->
> <xsl:template match="*|@*|comment()|processing-instruction()|text()">
> <xsl:copy>
> <xsl:apply-templates
> select="*|@*|comment()|processing-instruction()|text()"/>
> </xsl:copy>
> </xsl:template>
> </xsl:stylesheet>
>
> As you can see, I just do a straight copy-over.
>
> Has anybody run into the same problem before or does anybody have an idea
> how to solve this without writing a specialized DOM-Serializer?
>
> Armin
--
___________________________________________________________________________
ProSTEP GmbH Phone: +49-6151-9287381
Thomas Conradi Fax: +49-6151-9287381
Julius-Reiber Str. 15 Email: [EMAIL PROTECTED]
D-64293 Darmstadt
___________________________________________________________________________