hi all,
        i've been looking into the DTD expansion issue with xindice/xerces.  i have
a few questions about the org.apache.xindice.xml.dom.DOMParser class.
1. why does this class use a SAXParser?  not that it's a huge deal, but it
just seems kinda strange to implement a class called DOMParser which
actually uses a SAXParser object to handle the parsing....  is this a common
implementation strategy?

2. i've discovered a fix that will prevent comments inside a DTD from being
added to the document.  It involves changing 3 methods in the DOMParser class
(and adding a boolean inDTD field for them to share).

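   // assumes a new field on DOMParser: private boolean inDTD = false;

   public void startDTD(String name, String publicId, String systemId) throws SAXException {
      // everything reported until endDTD() belongs to the DTD
      this.inDTD = true;
   }

   public void endDTD() throws SAXException {
      this.inDTD = false;
   }

   public void comment(char[] ch, int start, int length) throws SAXException {
      // only keep comments that belong to the document itself
      if (!this.inDTD) {
          String s = new String(ch, start, length);
          context.appendChild(doc.createComment(s));
      }
   }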

this will prevent comments from being appended to the DOM tree when the
parser is parsing a dtd.
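
to show the idea outside of xindice, here is a minimal standalone sketch of the
same inDTD flag, assuming the JDK's built-in SAX parser and the standard
org.xml.sax.ext.DefaultHandler2 class.  DtdCommentFilter and
commentsOutsideDtd are made-up names for illustration, and it collects the
comments into a list instead of appending them to a DOM tree:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical helper, not part of xindice: keeps only comments outside the DTD.
class DtdCommentFilter extends DefaultHandler2 {
    private boolean inDTD = false;
    final List<String> comments = new ArrayList<>();

    @Override
    public void startDTD(String name, String publicId, String systemId) {
        inDTD = true;   // comments reported from here on belong to the DTD
    }

    @Override
    public void endDTD() {
        inDTD = false;  // back in the document proper
    }

    @Override
    public void comment(char[] ch, int start, int length) {
        if (!inDTD) {
            comments.add(new String(ch, start, length));
        }
    }

    static List<String> commentsOutsideDtd(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        DtdCommentFilter handler = new DtdCommentFilter();
        // comment/startDTD/endDTD are LexicalHandler events, so the handler
        // has to be registered through the standard SAX2 property
        parser.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.comments;
    }
}
```

given a document with one comment in its internal DTD subset and one in the
body, commentsOutsideDtd should return only the body comment.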

However, I don't think that this actually solves the underlying problem.
Here's how I understand the goal... imagine this pseudo-code:
  String xmlPre = readFromFS("/tmp/my.xml");
  // {insert,get}Document from org.xmldatabases.xmlrpc.RPCOperations
  String id = insertDocument("/db/foo", "bar", xmlPre);
  String xmlPost = getDocument("/db/foo", "bar");

  // xmlPost should be exactly the same as xmlPre

The problem here is that anything like a DOCTYPE declaration will be read and
resolved by the SAX parser (i think), not stored.  So the DOCTYPE declaration
that was in xmlPre will *not* be in xmlPost, because it was dropped during
entity resolution when the insertDocument code called:
        Document doc = DOMParser.toDocument( content );
That 'toDocument' call actually invokes the SAX parser, which resolves the
doctype and the entities it declares, but (as far as i can tell) never adds a
doctype node to the resulting Document.
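
one possible direction, sketched here under the same assumptions (JDK SAX
parser, DefaultHandler2), is to record what startDTD reports instead of
throwing it away, so the declaration could be written back out later.
DoctypeRecorder and toDeclaration are hypothetical names, and this sketch does
not reproduce the internal subset, only the name and external identifiers:

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical helper, not part of xindice: remembers the DOCTYPE seen while parsing.
class DoctypeRecorder extends DefaultHandler2 {
    String name;      // root element name from the DOCTYPE, or null if none seen
    String publicId;  // null when no public identifier was declared
    String systemId;  // null when no system identifier was declared

    @Override
    public void startDTD(String name, String publicId, String systemId) {
        this.name = name;
        this.publicId = publicId;
        this.systemId = systemId;
    }

    // Rebuilds a DOCTYPE declaration from the recorded pieces (internal subset omitted).
    String toDeclaration() {
        if (name == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder("<!DOCTYPE ").append(name);
        if (publicId != null) {
            sb.append(" PUBLIC \"").append(publicId).append("\" \"").append(systemId).append('"');
        } else if (systemId != null) {
            sb.append(" SYSTEM \"").append(systemId).append('"');
        }
        return sb.append('>').toString();
    }

    static DoctypeRecorder record(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        DoctypeRecorder handler = new DoctypeRecorder();
        // startDTD is a LexicalHandler event, registered via the SAX2 property
        parser.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler;
    }
}
```

whether DOMParser should keep this on the Document (e.g. as a doctype node) or
somewhere else is a separate question; the sketch only shows that the
information is available during the parse.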

does this make sense to anyone else? or am i off my rocker....

thanks
dave
