DO NOT REPLY [Bug 16968] - XercesDOMParser generates whitespace text nodes

bugzilla Tue, 11 Feb 2003 10:38:30 -0800

DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=16968>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.


http://nagoya.apache.org/bugzilla/show_bug.cgi?id=16968

XercesDOMParser generates whitespace text nodes





------- Additional Comments From [EMAIL PROTECTED]  2003-02-11 18:40 -------
You're right, text nodes don't have a tag name. Tag names go with elements; text
nodes are not elements.

The only way you can tell Xerces whether whitespace is significant is as
Gareth has suggested. A general-purpose XML processing library should not
arbitrarily remove whitespace, as it may change the meaning of some documents.
Without a description of the document's structure (that is, a DTD) and explicit
instructions, any mechanism will in fact be arbitrary.

If you don't want to write a DTD, you'll need to traverse the tree and remove
any whitespace you don't want, based on your own logic that determines whether
or not it's needed. Recognize that in the absence of a DTD, any cleaning up you
do is an ad-hoc courtesy that may or may not reflect the document's original
intent. This may be reasonably safe if documents never leave your control, but
it could get you into trouble if you find you must exchange documents with
others at some point. Formalizing the document structure in a DTD helps avoid
trouble by spelling out your intent.
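
The traversal described above can be sketched roughly as follows. This is an
illustration only: it uses Python's standard xml.dom.minidom as a stand-in DOM,
not the Xerces-C++ API, and the function name strip_whitespace_nodes, the keep
hook, and the sample document are all hypothetical. The same walk carries over
to any W3C-style DOM.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

XML_WHITESPACE = " \t\r\n"  # the four characters XML treats as whitespace


def strip_whitespace_nodes(node, keep=lambda text_node: False):
    """Recursively remove whitespace-only text nodes below `node`.

    `keep` is a hypothetical hook for your own logic: return True for a
    whitespace-only node that should survive.
    """
    # Iterate over a copy, because removing children mutates childNodes.
    for child in list(node.childNodes):
        if child.nodeType == Node.TEXT_NODE:
            if child.data.strip(XML_WHITESPACE) == "" and not keep(child):
                node.removeChild(child)
        elif child.nodeType == Node.ELEMENT_NODE:
            strip_whitespace_nodes(child, keep)


# Hypothetical pretty-printed input.
SAMPLE = "<order>\n  <item>widget</item>\n  <note> keep me </note>\n</order>"

doc = minidom.parseString(SAMPLE)
root = doc.documentElement
before = len(root.childNodes)
strip_whitespace_nodes(root)
after = len(root.childNodes)
print(before, after)
```

Only text nodes made up entirely of whitespace are dropped; the padded text
inside <note> is left alone. Whether that rule is right for a given document is
exactly the ad-hoc judgment the paragraph above warns about.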

A final, completely reasonable alternative is to do no special whitespace
processing. If some source supplies you with pretty-printed documents, that
source must recognize that it has changed the document content.
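
To make the point about pretty-printing concrete, here is a small sketch, again
using Python's xml.dom.minidom purely as an illustrative DOM rather than
Xerces-C++. The two sample documents and the helper child_kinds are
hypothetical. Without a DTD, the parser reports the indentation as ordinary
text nodes among the root element's children.

```python
from xml.dom import minidom

# The same logical data, once compact and once pretty-printed.
compact = minidom.parseString("<list><entry>a</entry><entry>b</entry></list>")
pretty = minidom.parseString(
    "<list>\n  <entry>a</entry>\n  <entry>b</entry>\n</list>"
)


def child_kinds(doc):
    """Return the nodeName of each child of the document element."""
    return [child.nodeName for child in doc.documentElement.childNodes]


compact_kinds = child_kinds(compact)
pretty_kinds = child_kinds(pretty)
print(compact_kinds)  # ['entry', 'entry']
print(pretty_kinds)   # ['#text', 'entry', '#text', 'entry', '#text']
```

The text nodes show up under the generic name #text rather than a tag name,
which matches the earlier remark that tag names belong to elements only.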

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
