See http://nagoya.apache.org/bugzilla/show_bug.cgi?id=13840 for another
implementation.

-----Original Message-----
From: Kevin King [mailto:[EMAIL PROTECTED]]
Sent: Thursday, November 21, 2002 2:12 AM
To: [EMAIL PROTECTED]
Subject: FormatPrettyPrint implementation


Hello,

I should probably first issue the disclaimer that as of a few days ago I
did not know any details about XML, nor had I even heard of Xerces.  I have,
however, been able to very quickly integrate Xerces-C++ into my application
and get some basic XML functionality working using the DOM API.

I obtained unexpected results when setting FormatPrettyPrint to serialize a
document that was created from scratch within my application.  A quick
examination of the DOMWriter implementation and a search through the
archives of the xerces-c-dev list confirmed that there was no user error
and that it was functioning as designed.  So this evening I modified
DOMWriter to format its output "Pretty".

The fact that I was able to get basic XML working within my application,
and even add some functionality, in a couple of days is a testament
to everyone who has worked on this project.  I was very impressed at how
easy it was to write code based on the provided samples and even edit the
source.  Everything is very well organized and documented.

My implementation of PrettyPrint seems to work with some random XML files I 
was able to find.  But I will not begin to suggest it is a complete or 
working solution since my knowledge of XML is minimal.

I came up with a few rules, added to DOMWriterImpl::processNode(), which
seem to do the trick when PrettyPrint is enabled:

         1) All text nodes that contain ONLY whitespace are ignored

         2) Each tag begins on a new line, indented according to its level.
A node's level is the number of generations it is removed from the root
element.

         3) The closing tag of an Element node is printed on the same line as
its opening tag if printing its children produced no newline.  Otherwise
the closing tag is printed on a new line, indented to the same level as the
opening tag.

         4) A blank line is printed just before the tag of each child of the
root node.


Currently the indent is hard-coded to two spaces per level.  This should
be user-configurable in a final implementation.
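
To make the four rules concrete, here is a minimal, self-contained sketch of
them with the indent width passed in as a parameter.  It is not the modified
DOMWriterImpl::processNode(): the Node struct and the functions writeNode,
prettyPrint and renderExample are hypothetical stand-ins invented for this
illustration, and attributes, escaping, comments and the XML declaration are
left out.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a DOM node; this is NOT the Xerces-C++ DOMNode API.
struct Node {
    enum Kind { Element, Text };
    Kind kind;
    std::string value;            // tag name (Element) or character data (Text)
    std::vector<Node> children;   // always empty for Text nodes
};

// True if the string holds only XML whitespace (space, tab, CR, LF).
static bool isWhitespaceOnly(const std::string& s) {
    return s.find_first_not_of(" \t\r\n") == std::string::npos;
}

// Writes `node` to `out`.  `brokeLine` is set when this node starts a new
// line, so the parent knows where to put its closing tag (rule 3).
static void writeNode(const Node& node, std::ostream& out, int level,
                      int indentWidth, bool& brokeLine) {
    if (node.kind == Node::Text) {
        if (!isWhitespaceOnly(node.value)) out << node.value;   // rule 1
        return;
    }
    const std::string indent(static_cast<std::size_t>(level * indentWidth), ' ');
    if (level > 0) {
        out << '\n';                      // rule 2: each tag starts a new line
        if (level == 1) out << '\n';      // rule 4: blank line before root's children
        out << indent;
        brokeLine = true;
    }
    out << '<' << node.value << '>';
    bool childBrokeLine = false;
    for (const Node& child : node.children)
        writeNode(child, out, level + 1, indentWidth, childBrokeLine);
    if (childBrokeLine) out << '\n' << indent;   // rule 3: close on its own line
    out << "</" << node.value << '>';
}

// Serializes the tree rooted at `root` using `indentWidth` spaces per level.
std::string prettyPrint(const Node& root, int indentWidth) {
    assert(indentWidth >= 0);
    std::ostringstream out;
    bool unused = false;
    writeNode(root, out, 0, indentWidth, unused);
    return out.str();
}

// Builds a tiny tree that still carries its original whitespace-only text
// nodes, then prints it with two spaces per level.
std::string renderExample() {
    Node name{Node::Element, "name", {Node{Node::Text, "Big", {}}}};
    Node person{Node::Element, "person",
                {Node{Node::Text, "\n    ", {}}, name, Node{Node::Text, "\n  ", {}}}};
    Node root{Node::Element, "personnel",
              {Node{Node::Text, "\n  ", {}}, person, Node{Node::Text, "\n", {}}}};
    return prettyPrint(root, 2);
}
```

In this sketch the <name> element keeps its closing tag on the same line
because its only child is text, while <person> and <personnel> close on a
line of their own.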

Now my concern is that rule #1 may not fly.  I do not know enough about XML
to know whether it will incorrectly discard some valid data.  In all the XML
samples I could find, the only time a text node contained only
whitespace was when it sat between one element's close tag and the next
element's open tag, where it served only to make the file readable.  I
decided that it is best to ignore all existing formatting when
FormatPrettyPrint is enabled, as any attempt to combine the two would be
too complex and would produce unpredictable output.
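
One case where rule #1 would change the data is an element whose entire
content is whitespace, such as <sep> </sep>.  The sketch below contrasts
rule #1 as stated with a narrower variant that only drops whitespace sitting
next to element siblings.  The narrower variant is an assumption offered for
discussion, not something in my changes, and the Node struct and function
names are hypothetical rather than Xerces-C++ API.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a DOM node; this is NOT the Xerces-C++ API.
struct Node {
    bool isText;
    std::string value;            // tag name, or character data when isText
    std::vector<Node> children;
};

// True if the string holds only XML whitespace (space, tab, CR, LF).
bool isWhitespaceOnly(const std::string& s) {
    return s.find_first_not_of(" \t\r\n") == std::string::npos;
}

// Rule #1 as stated: every whitespace-only text node is dropped.
bool dropsUnderRule1(const Node& child) {
    return child.isText && isWhitespaceOnly(child.value);
}

// Narrower variant (an assumption for discussion): drop the node only when
// its parent also has at least one element child, i.e. the whitespace is
// layout between tags rather than the element's whole content.
bool dropsUnderNarrowRule(const Node& parent, const Node& child) {
    if (!dropsUnderRule1(child)) return false;
    for (const Node& sibling : parent.children)
        if (!sibling.isText) return true;
    return false;
}
```

Under rule #1 the single space inside <sep> </sep> is dropped; under the
narrower variant it is kept, while the whitespace between sibling tags is
still dropped.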

Rules 2, 3, and 4 are just my own preference in what I think looks good, 
and they were very easy to implement.

I do not know if anyone was working on this but the following thread seemed 
to indicate it was not, as the only more recent discussions were people 
indicating that FormatPrettyPrint produced unexpected results.
         http://marc.theaimsgroup.com/?l=xerces-c-dev&m=102760381301304&w=2


I would like to hear any comments on the above.  I would also welcome
some sample XML to run through DOMWriter to see whether it handles it
with FormatPrettyPrint on.  I am more than willing to share these
changes and to fix any oversights in them.

-Kevin King


Sample output using the "personal.xml" file provided in the samples.  I 
removed 3 of the users for a briefer sample:
         "domprint.exe -wfpp=on personal.xml"

<?xml version="1.0" encoding="iso-8859-1" standalone="no" ?>
<!DOCTYPE personnel>
<!-- @version: -->


<personnel>
   <person id="Big.Boss">
     <name>
       <family>Mr Boss</family>
       <given>Big</given>
     </name>
     <email>[EMAIL PROTECTED]</email>
     <link subordinates="one.worker two.worker"/>
   </person>

   <person id="one.worker">
     <name>
       <family>Worker</family>
       <given>One</given>
     </name>
     <email>[EMAIL PROTECTED]</email>
     <link manager="Big.Boss"/>
   </person>

   <person id="two.worker">
     <name>
       <family>Worker</family>
       <given>Two</given>
     </name>
     <email>[EMAIL PROTECTED]</email>
     <link manager="Big.Boss"/>
   </person>

</personnel>



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
