Hi, I have not heard from the person who originally responded to me about this issue for a couple of weeks. Does anybody on this list happen to have any insight into it?
Thanks,
David

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Tuesday, July 22, 2003 2:56 PM
To: [EMAIL PROTECTED]
Subject: DO NOT REPLY [Bug 21415] - bug in XercesDOMParser treatment of whitespace and empty elements

DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=21415>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=21415

bug in XercesDOMParser treatment of whitespace and empty elements

------- Additional Comments From [EMAIL PROTECTED] 2003-07-22 19:55 -------
PeiYong,

Thanks for the reply. I have embedded my comments inside a copy of your e-mail on lines that do not begin with > below.

> 1. First I modify the sample MemParse,
>    . add the line
>        #include <xercesc/parsers/XercesDOMParser.hpp>
>    . change the line
>        SAXParser* parser = new SAXParser; to
>        XercesDOMParser *parser = new XercesDOMParser;
>    . comment out
>        parser->setValidationScheme(valScheme); and
>        parser->setValidationScheme(valScheme);
>    . change the line
>        static const char* gXMLInMemBuf = "<outer> <a></a><b>\n</b> </outer>";
>
>    and result the sample, the result is something like this
>        <outer> <a></a><b>
>        </b> </outer>

Looking at MemParse.cpp reveals that the output it gives comes from the following code:

    cout << "\nFinished parsing the memory buffer containing the following "
         << "XML statements:\n\n"
         << gXMLInMemBuf

(line 338 in the original MemParse.cpp)

...it's just outputting the exact value that we set earlier (basically whatever you specified in the last step of "1."). So it appears that this particular test doesn't exercise the functionality in question.

> 2. Second, I put your string into a file, and feed it to the sample DOMPrint
>    and i have the same result as #1.

Sorry, I forgot to specify that "pretty print" should be turned on.
So I guess a good test here would be:

1) Create a file named test.xml with the following:
"
<outer> <a></a> <b> </b> </outer>
"
(in other words: "<outer> <a></a> <b>\n</b> </outer>")

2) Run the following at a command prompt:
     domprint -wfpp=on test.xml
   and redirect the result into test_2.xml (e.g. "domprint -wfpp=on test.xml >test_2.xml")

3) Run the following at a command prompt:
     domprint -wfpp=on test_2.xml
   and redirect the result into test_3.xml.

You should see the following:

---test.xml--->
"
<outer> <a></a> <b> </b> </outer>
"
---test_2.xml--->
"
<outer> <a/> <b></b> </outer>
"
---test_3.xml--->
"
<outer> <a/> <b/> </outer>
"

Note that test_3 merely took the output of test_2 and passed it through the same code again.

I have not had time to figure out how Xerces works and where the code for this stuff is located, but it's as if, somewhere in the Xerces code, something is taking a first pass and changing occurrences of "<b></b>" into "<b/>", and *then* it's dropping whitespace. So now we have a situation where "<b>\n</b>" was changed into "<b></b>", but it never gets hit with the change to "<b/>" because that would've happened during a step that already occurred earlier. If that is the case, then perhaps the step that changes "<b></b>" into "<b/>" just needs to be moved to someplace after the step that removes the whitespace (and anything else that could introduce that condition)?

> 3. Last, in fact, the pair <b></b> is quivalent to <b/>, and there is **no
>    space** in between <b></b>.

Sorry, could you clarify this last thing for me? I agree that given "<b></b>" and "<b/>" Xerces gives you a consistent "<b/>" output for both of them, but the issue is *when there's whitespace*. In the example I give, "<b>\n</b>" and "<b/>" are *not* being treated as the same thing. Also, what exactly are you referring to when you say "in between <b></b>"? I apologize if I'm just misreading or something...
Thanks again,
David

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]