A simple and effective solution (that works for XHTML):

1. If your element contains text content (i.e. #PCDATA), then all content inside the element is considered information. The parser will not ignore it, and the serializer will not add extra spaces.
2. If your element contains element content only (i.e. no #PCDATA), then white space inside it is not considered information: the parser should ignore it, and the serializer may add extra white space to pretty-print.

This scheme works well for XML and XHTML; <P> is defined in XHTML to have #PCDATA content for that reason.

arkin

> My instinct would be to agree but remember our HTML heritage: I'm not sure
> that it's possible to define what white-space is "ignorable" in a general
> way. In data-oriented applications of XML fierce normalization of space is
> probably desirable but in applications that are dealing more with marked-up
> text (I guess that XML is trying to support these too) you've got to
> preserve more white space. Consider
>   <B>bold</B> <I>italic</I>
> That's going to map to three DOM nodes:
>   <element name='B' value='bold'/>
>   <text value=' '/>
>   <element name='I' value='italic'/>
> Although the second node is a "white-space text-node" and so is, in general,
> a prime candidate for pruning it can't be pruned in this markup-centric
> [yeeuch!] application.
>
> -- jP --
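
For what it's worth, here is a rough sketch of the two rules from the top of this message, using Python's xml.dom.minidom. minidom does not validate, so the set of element-content-only names (ELEMENT_ONLY) and the element names table, row and p are assumptions made up for illustration; a validating parser would take that information from the DTD. The helper prune_ignorable is likewise hypothetical.

```python
from xml.dom import Node, minidom

# Hypothetical content-model table: elements declared with element content
# only (no #PCDATA). Hard-coded here because minidom does not read a DTD.
ELEMENT_ONLY = {"table", "row"}

XML_WHITESPACE = " \t\r\n"


def prune_ignorable(node, element_only=ELEMENT_ONLY):
    """Drop white-space-only text nodes, but only inside elements whose
    content model is element-only. Mixed-content elements are left alone."""
    for child in list(node.childNodes):
        if child.nodeType == Node.ELEMENT_NODE:
            prune_ignorable(child, element_only)
        elif (
            child.nodeType == Node.TEXT_NODE
            and node.nodeType == Node.ELEMENT_NODE
            and node.tagName in element_only
            and child.data.strip(XML_WHITESPACE) == ""
        ):
            node.removeChild(child)
    return node


doc = minidom.parseString(
    "<table>\n  <row>\n    <p><b>bold</b> <i>italic</i></p>\n  </row>\n</table>"
)
prune_ignorable(doc.documentElement)
result = doc.documentElement.toxml()
print(result)
```

The indentation inside table and row is dropped, while the single space between the b and i elements survives because p is treated as having #PCDATA content, which is the case the quoted example worries about.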