Hello,
I should probably start with a disclaimer: as of a few days ago I did not
know any details about XML, nor had I even heard of Xerces. I have, however,
been able to integrate Xerces-C++ into my application very quickly and get
some basic XML functionality working using the DOM API.
I obtained unexpected results when setting FormatPrettyPrint to serialize a
document that was created from scratch within my application. A quick
examination of the DOMWriter implementation and a search through the
archives of the xerces-c-dev list confirmed that this was not user error
and that it was functioning as designed. So this evening I modified DOMWriter
to format its output "Pretty".
The fact that I was able to get basic XML working within my application,
and even add some functionality, in a couple of days is a testament to
everyone who has worked on this project - I was very impressed at how
easy it was to write code based on the provided samples and even edit the
source. Everything is very well organized and documented.
My implementation of PrettyPrint seems to work with some random XML files I
was able to find. But I will not begin to suggest it is a complete or
working solution since my knowledge of XML is minimal.
I came up with a few rules, added to DOMWriterImpl::processNode(), which
seem to do the trick when PrettyPrint is enabled:
1) All text nodes that contain ONLY whitespace are ignored.
2) Each tag begins on a new line, indented a variable amount based
on its level. The level of an element is how many generations removed
from the root element it is.
3) The closing tag of an Element node is printed on the same line as
the opening tag if no newlines have been output as a result of any
children. Otherwise the closing tag is printed on a new line, indented to
the same level as the opening tag.
4) An empty line is printed just before the tag for each child
of the root node.
Currently the indentation is hard-coded to two spaces per level. This
should be user-configurable in a final implementation.
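To make the four rules concrete, here is a minimal sketch of the logic. It is
not the actual change to DOMWriterImpl::processNode(): the Node struct and
writeNode() are hypothetical stand-ins so the example compiles without Xerces,
attributes and character escaping are left out, and writing a childless
element as <name/> is an assumption of this sketch. The indent width is passed
as a parameter rather than hard-coded.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical minimal node type for illustration; NOT the Xerces DOMNode API.
struct Node {
    enum class Kind { Element, Text };
    Kind kind;
    std::string value;            // tag name for Element, character data for Text
    std::vector<Node> children;   // always empty for Text nodes
};

// True if the string is empty or holds only XML whitespace (space, tab, CR, LF).
bool isWhitespaceOnly(const std::string& s) {
    return s.find_first_not_of(" \t\r\n") == std::string::npos;
}

// Appends the serialized form of `node` to `out`.
// Returns true if the node was written as an element (which, except for the
// very first tag, starts a new line), false if it was a text node.
bool writeNode(const Node& node, std::size_t level,
               std::size_t indentWidth, std::string& out) {
    if (node.kind == Node::Kind::Text) {
        if (!isWhitespaceOnly(node.value)) {
            out += node.value;    // rule 1: whitespace-only text is skipped
        }
        return false;
    }

    const std::string indent(level * indentWidth, ' ');
    if (level == 1) {
        out += '\n';              // rule 4: blank line before each child of the root
    }
    if (!out.empty()) {
        out += '\n';              // rule 2: every tag starts on a new line
    }
    out += indent + "<" + node.value;

    if (node.children.empty()) {  // assumption: childless elements use the short form
        out += "/>";
        return true;
    }
    out += ">";

    bool brokeLine = false;
    for (const Node& child : node.children) {
        brokeLine = writeNode(child, level + 1, indentWidth, out) || brokeLine;
    }
    if (brokeLine) {
        out += "\n" + indent;     // rule 3: closing tag on its own line
    }
    out += "</" + node.value + ">";
    return true;
}
```

Calling writeNode(root, 0, 2, out) on a tree built from these Node values
appends the formatted document to out with two spaces per level; a different
indentWidth changes only the indentation.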
Now my concern is that rule #1 may not fly. I do not know enough about XML
to know if that will incorrectly ignore some valid data. From all the XML
samples I could find, the only time that a text node contained only
whitespace was when it was in between an element's close tag and the next
element's open tag, thus providing a readable format. I decided that it is
best to ignore all existing formatting when FormatPrettyPrint is enabled,
as any attempt to combine the two would be too complex and would produce
unpredictable output.
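One case where rule #1 could plausibly drop real data is mixed content, for
example <p><b>Hello</b> <i>world</i></p>, where the single space between the
two child elements is a whitespace-only text node. The sketch below is only an
illustration of that concern: Piece and visibleText() are hypothetical names,
and the vector stands in for the children of one element.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for one child of an element: either a bare text node
// or the text carried by a child element.
struct Piece {
    bool isTextNode;
    std::string text;
};

// True if the string is empty or holds only XML whitespace (space, tab, CR, LF).
bool isXmlWhitespaceOnly(const std::string& s) {
    return s.find_first_not_of(" \t\r\n") == std::string::npos;
}

// Concatenates the character data a reader would see. When
// dropWhitespaceOnlyText is true, whitespace-only text nodes are skipped,
// which is what rule #1 does.
std::string visibleText(const std::vector<Piece>& pieces,
                        bool dropWhitespaceOnlyText) {
    std::string result;
    for (const Piece& p : pieces) {
        if (dropWhitespaceOnlyText && p.isTextNode && isXmlWhitespaceOnly(p.text)) {
            continue;
        }
        result += p.text;
    }
    return result;
}
```

With the three pieces "Hello", " " and "world", the text reads "Hello world"
when nothing is dropped and "Helloworld" once rule #1 is applied, so a final
implementation may need a way to leave such elements alone.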
Rules 2, 3, and 4 are just my own preference in what I think looks good,
and they were very easy to implement.
I do not know if anyone was working on this, but the following thread seemed
to indicate that no one was, as the only more recent discussions were people
reporting that FormatPrettyPrint produced unexpected results.
http://marc.theaimsgroup.com/?l=xerces-c-dev&m=102760381301304&w=2
I would like to hear any comments on the above. I would also welcome some
sample XML to run through DOMWriter to see how it is handled with
FormatPrettyPrint on. I am more than willing to share these changes and to
fix any oversights in them.
-Kevin King
Sample output using the "personal.xml" file provided in the samples. I
removed 3 of the users for a briefer sample:
"domprint.exe -wfpp=on personal.xml"
<?xml version="1.0" encoding="iso-8859-1" standalone="no" ?>
<!DOCTYPE personnel>
<!-- @version: -->
<personnel>
  <person id="Big.Boss">
    <name>
      <family>Mr Boss</family>
      <given>Big</given>
    </name>
    <email>[EMAIL PROTECTED]</email>
    <link subordinates="one.worker two.worker"/>
  </person>
  <person id="one.worker">
    <name>
      <family>Worker</family>
      <given>One</given>
    </name>
    <email>[EMAIL PROTECTED]</email>
    <link manager="Big.Boss"/>
  </person>
  <person id="two.worker">
    <name>
      <family>Worker</family>
      <given>Two</given>
    </name>
    <email>[EMAIL PROTECTED]</email>
    <link manager="Big.Boss"/>
  </person>
</personnel>
