hi all,
I've been looking into the DTD expansion issue with Xindice/Xerces, and I
have a few questions about the org.apache.xindice.xml.dom.DOMParser class.
1. Why does this class use a SAXParser? Not that it's a huge deal, but it
seems kinda strange for a class called DOMParser to hand the actual parsing
to a SAXParser object. Is this a common implementation strategy?
2. I've found a fix that keeps comments inside a DTD from ending up in the
output. It involves changing three methods in the DOMParser class:
    // assumes a new boolean field on DOMParser, e.g.
    // private boolean inDTD = false;

    public void startDTD(String name, String publicId, String systemId)
            throws SAXException {
        // entering the DTD: stop collecting comments
        this.inDTD = true;
    }

    public void endDTD() throws SAXException {
        // leaving the DTD: collect comments again
        this.inDTD = false;
    }

    public void comment(char[] ch, int start, int length)
            throws SAXException {
        if (!this.inDTD) {
            String s = new String(ch, start, length);
            context.appendChild(doc.createComment(s));
        }
    }
This keeps comments from being appended to the DOM tree while the parser is
inside a DTD.
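
For anyone who wants to try the idea outside of Xindice, here is a minimal standalone sketch of the same inDTD guard using only the JDK's SAX parser. The class name DtdCommentFilter and the collectComments helper are made up for this example, and it collects the comments into a list instead of appending them to a DOM tree, so treat it as an illustration rather than the actual DOMParser code.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical demo class, not part of Xindice.
// DefaultHandler2 supplies no-op versions of the LexicalHandler callbacks.
class DtdCommentFilter extends DefaultHandler2 {
    private boolean inDTD = false;
    final List<String> comments = new ArrayList<>();

    @Override
    public void startDTD(String name, String publicId, String systemId) {
        inDTD = true;   // comments reported from here on belong to the DTD
    }

    @Override
    public void endDTD() {
        inDTD = false;  // back in the document proper
    }

    @Override
    public void comment(char[] ch, int start, int length) {
        if (!inDTD) {
            comments.add(new String(ch, start, length));
        }
    }

    // Parses the given XML text and returns the comments found outside the DTD.
    static List<String> collectComments(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        DtdCommentFilter handler = new DtdCommentFilter();
        // Lexical events (comments, DTD boundaries) are only delivered
        // if the handler is registered through this standard SAX2 property.
        parser.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.comments;
    }
}
```

Calling DtdCommentFilter.collectComments(...) on a document that has one comment in its internal DTD subset and one in the body should return only the body comment.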
However, I don't think that this actually solves the underlying problem.
Here's how I understand the goal... imagine this pseudo-code:
    String xmlPre = readFromFS("/tmp/my.xml");
    // {insert,get}Document from org.xmldatabases.xmlrpc.RPCOperations
    String id = insertDocument("/db/foo", "bar", xmlPre);
    String xmlPost = getDocument("/db/foo", "bar");
    // xmlPost should be exactly the same as xmlPre
The problem here is that a DOCTYPE declaration gets parsed and resolved by
the SAX parser (I think). So the DOCTYPE declaration that was in xmlPre will
*not* be in xmlPost, because it disappeared during entity resolution when
the insertDocument code called:
    Document doc = DOMParser.toDocument( content );
That toDocument call invokes the SAX parser, which (as far as I can tell)
processes the DTD and expands its entities, but never adds the DOCTYPE
declaration itself to the DOM tree.
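
One possible direction, sketched below under some assumptions: the SAX startDTD callback already receives the DOCTYPE name, public id and system id, so a handler could record them and build a DOM DocumentType node instead of dropping them. The class DoctypeCapture and the method captureDoctype are hypothetical names for this example, not Xindice code. The resolveEntity override only exists so the demo never tries to fetch an external DTD. Note that this does not recover an internal subset; that would need the DeclHandler events as well.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.DocumentType;
import org.xml.sax.InputSource;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical demo class, not part of Xindice.
class DoctypeCapture extends DefaultHandler2 {
    private boolean sawDoctype = false;
    private String name;
    private String publicId;
    private String systemId;

    @Override
    public void startDTD(String name, String publicId, String systemId) {
        // Remember the identifiers instead of silently discarding them.
        this.sawDoctype = true;
        this.name = name;
        this.publicId = publicId;
        this.systemId = systemId;
    }

    @Override
    public InputSource resolveEntity(String name, String publicId,
                                     String baseURI, String systemId) {
        // Demo only: substitute an empty DTD so nothing is read from disk
        // or the network while parsing.
        return new InputSource(new StringReader(""));
    }

    // Returns a DocumentType node for the DOCTYPE in the given XML text,
    // or null if the document has no DOCTYPE declaration.
    static DocumentType captureDoctype(String xml) throws Exception {
        DoctypeCapture handler = new DoctypeCapture();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        parser.parse(new InputSource(new StringReader(xml)), handler);
        if (!handler.sawDoctype) {
            return null;
        }
        DOMImplementation impl = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().getDOMImplementation();
        return impl.createDocumentType(handler.name, handler.publicId, handler.systemId);
    }
}
```

The returned node could then be handed to DOMImplementation.createDocument when the tree is built, so a serializer at least has the information it needs to write the DOCTYPE line back out. Whether that fits the way DOMParser builds its documents is something I have not checked.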
Does this make sense to anyone else? Or am I off my rocker...
thanks
dave