On Tue, Apr 23, 2002 at 12:22:31PM -0400, Tito Burgos wrote:
> I would like to know if there is a way to remove all "ignorable whitespace"
> from an XML document when you parse it to create a document object?
>
> I thought I was close by using
> DocumentBuilderFactory.setIgnoringElementContentWhitespace(true) however,
> this requires you to also set DocumentBuilderFactory.setValidating(true).
> When validating is turned on it expects to validate against a DTD, I'm not
> using DTD's I just want to eliminate CR's and other unnecessary whitespace.
>
> For example,
> turn this original xml file:
> <root>
> <elem1>somevalue</elem1>
> <elem2>some other value</elem2>
> </root>
>
> to this document object:
> <root><elem1>somevalue</elem1><elem2>some other value</elem2></root>

I think the problem here is that the parser cannot tell which whitespace is
ignorable unless it is validating. Without a DTD declaring the content model
of <root>, it has no way of knowing that <root> holds only elements. The
whitespace you want to suppress would be significant if <root> had mixed
content.

For the parser to treat it as ignorable, it would have to assume that there
is no mixed content anywhere in your document. That assumption holds in a lot
of cases, but some people do use mixed content, so the parser cannot make it.

I wonder how easy it would be to write a filter that assumes there is no
mixed content and strips whitespace on that basis.

David

--
David Sheldon, Client Services
DecisionSoft Ltd.
Telephone: +44-1865-203192
http://www.decisionsoft.com
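
For what it is worth, a filter of the kind wondered about above could be sketched as a pass over the DOM after a non-validating parse. This is only a rough illustration, not a JAXP feature: the class name WhitespaceStripper and its two methods are made up for this example, and the code relies on the stated assumption that no element in the document has mixed content.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of JAXP or the DOM API.
final class WhitespaceStripper {

    // Recursively removes text nodes that consist only of whitespace.
    // ASSUMPTION: the document has no mixed content, so every
    // whitespace-only text node is layout and safe to drop.
    // Note: String.trim() strips all characters <= U+0020, which is a
    // slightly wider set than XML's own whitespace characters.
    static void strip(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            // Fetch the sibling first, because removal detaches the child.
            Node next = child.getNextSibling();
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                node.removeChild(child);
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                strip(child);
            }
            child = next;
        }
    }

    // Parses the XML without validation, strips whitespace-only text
    // nodes, and serializes the result without an XML declaration.
    static String stripToString(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            strip(doc.getDocumentElement());
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

Calling WhitespaceStripper.strip(doc.getDocumentElement()) on a freshly parsed document should leave <root> with just its two element children. The cost of the assumption shows up as soon as mixed content appears: in <p>a <b>b</b> <i>c</i></p> the single space between </b> and <i> is a whitespace-only text node, so it would be removed as well.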
