Dear Tim,
> Failing that, after the Document has been created, strip out
> all the Text Nodes, but I wrote a method to strip out Text
> Nodes, but it didn't work (if you call the method twice it
> finds Text Nodes each time), and I see in the archives that
What did you try? I did a quick experiment with:
    NodeIterator iterator = ((DocumentTraversal) doc)
        .createNodeIterator(doc, NodeFilter.SHOW_TEXT, null, false);
    Node nextNode = iterator.nextNode();
    while (nextNode != null) {
        // we know it's a text-node
        Text textNode = (Text) nextNode;
        // advance before a possible removal of the current node
        nextNode = iterator.nextNode();
        System.out.println("textNode.getNodeValue() = " +
            textNode.getNodeValue());
        if ((textNode.getNodeValue() == null) ||
                textNode.getNodeValue().matches("^\\s*$")) {
            System.out.println("removed");
            textNode.getParentNode().removeChild(textNode);
        }
    }
    iterator.detach();
And that seemed to do it. Not sure if this is legal (changing the DOM
during iteration?) and whether this would also filter out empty
attributes, but you could use it as a basis?
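
If changing the DOM during iteration is a worry, one way to sidestep the question is to collect the whitespace-only Text nodes first and remove them once the iterator is finished. A minimal self-contained sketch of that variant follows. The class name WhitespaceStripper, the parse helper and the sample XML are made up for illustration, and the cast to DocumentTraversal assumes the parser's Document implementation supports DOM Traversal (the one bundled with the JDK does, as far as I know):

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
class WhitespaceStripper {

    // Parse an XML string with the default (non-validating) parser.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Remove every Text node that is null or whitespace-only.
    // Returns the number of nodes removed.
    static int stripWhitespaceText(Document doc) {
        // Assumption: the Document implementation supports DOM Traversal.
        NodeIterator iterator = ((DocumentTraversal) doc)
                .createNodeIterator(doc, NodeFilter.SHOW_TEXT, null, false);

        // Pass 1: only collect candidates, the tree is not touched yet.
        List<Node> toRemove = new ArrayList<>();
        for (Node n = iterator.nextNode(); n != null; n = iterator.nextNode()) {
            String value = n.getNodeValue();
            if (value == null || value.matches("\\s*")) {
                toRemove.add(n);
            }
        }
        iterator.detach();

        // Pass 2: remove the collected nodes now that iteration is over.
        for (Node n : toRemove) {
            n.getParentNode().removeChild(n);
        }
        return toRemove.size();
    }

    public static void main(String[] args) {
        Document doc = parse("<a>\n  <b>x</b>\n  <c/>\n</a>");
        System.out.println("first pass removed:  " + stripWhitespaceText(doc));
        System.out.println("second pass removed: " + stripWhitespaceText(doc));
    }
}
```

Calling stripWhitespaceText twice on the same Document is a quick check for the symptom you described: if the first pass worked, the second pass should find nothing left to remove.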
It seems to be smarter to do this than to have your preparser working
on the stream.
Kind regards,
--Sander.