RE: Problem with DOMParser

Arnold, Curt 30 Apr 2001 16:11:10 -0000

> My understanding is that white space outside of tags
> is not significant and between tags IS significant.


That is true only if you know that you are in element content.

> 
> So if you had a pretty-printed XML doc, the CRLF and
> tags would be ignorable, but the stuff between tags
> can't be thrown away:
> 
> <x>
>    <y> Can't toss the leading or trailing blanks </y>
> <!-- tabs and newlines can be ignored -->
> </x>
> 
> If M$ is assuming that whitespace in element content
> is ignorable, I'd have to disagree. - MOD

Sorry, I wasn't trying to describe the algorithm that MSXML
uses to decide when whitespace is ignorable in the absence
of a Document Type Definition. My point was only that it
uses a heuristic, and although it guesses well, it is still
a guess.

The documentation for
javax.xml.parsers.DocumentBuilderFactory.setIgnoringElementContentWhitespace
describes the behavior for JAXP-compatible parsers in more detail:

setIgnoringElementContentWhitespace

public void setIgnoringElementContentWhitespace(boolean whitespace)

Specifies that the parsers created by this factory must eliminate
whitespace in element content (sometimes known loosely as 'ignorable
whitespace') when parsing XML documents (see XML Rec 2.10). Note that
only whitespace which is directly contained within element content that
has an element only content model (see XML Rec 3.2.1) will be eliminated. 
Due to reliance on the content model this setting requires
the parser to be in validating mode. By default the value of this is set
to false.

Parameters:
whitespace - true if the parser created must eliminate whitespace in the
element content when parsing XML documents; false otherwise.
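
For what it's worth, here is a minimal sketch of how that setting might be
exercised. The class name WhitespaceDemo, its helper methods, and the inline
DTD are my own assumptions for illustration; the DTD simply gives <x> an
element-only content model and <y> a #PCDATA model so that the parser knows
which whitespace is in element content.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class WhitespaceDemo {
    // Hypothetical sample: the pretty-printed document from the thread,
    // plus an internal DTD subset declaring the content models.
    static final String XML =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE x [\n"
        + "  <!ELEMENT x (y)>\n"
        + "  <!ELEMENT y (#PCDATA)>\n"
        + "]>\n"
        + "<x>\n"
        + "   <y> Can't toss the leading or trailing blanks </y>\n"
        + "</x>";

    // Parse the sample with the flag on or off and return the root element.
    static Element parseRoot(boolean ignoreWhitespace) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // The Javadoc says the setting relies on the content model,
            // so the parser is put in validating mode.
            factory.setValidating(true);
            factory.setIgnoringElementContentWhitespace(ignoreWhitespace);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(XML)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // Number of direct child nodes of <x> (text nodes included).
    static int childCount(boolean ignoreWhitespace) {
        return parseRoot(ignoreWhitespace).getChildNodes().getLength();
    }

    // Text content of <y>, which has a #PCDATA model and is never trimmed.
    static String yText(boolean ignoreWhitespace) {
        return parseRoot(ignoreWhitespace)
                .getElementsByTagName("y").item(0).getTextContent();
    }

    public static void main(String[] args) {
        System.out.println("children of <x>, flag off: " + childCount(false));
        System.out.println("children of <x>, flag on:  " + childCount(true));
        System.out.println("text of <y>, flag on: [" + yText(true) + "]");
    }
}
```

With the flag off, <x> should have three children (whitespace text, <y>,
whitespace text); with it on, only <y> should remain. In both cases the
leading and trailing blanks inside <y> are kept, since <y> is not element
content.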

