I think that's correct. Non-validating parsers are allowed to process external entities; they're just not required to. As the XML spec (http://www.w3.org/TR/2004/REC-xml-20040204/#safe-behavior) makes clear, the behavior of different non-validating parser implementations is less predictable than that of validating parsers, precisely because they have fewer requirements to meet. Xerces reads external entities, which allows it to detect well-formedness errors that it might otherwise miss and to supply the same document information to the application that a validating parser would. In other words, Xerces' behavior is by design, and intended to be useful.
Presumably the DTD is in the documents you're processing because the author(s) believed its contents to be useful. If you can convince the author(s) that it's not useful, perhaps you can get it removed from the document(s). Otherwise, if you're certain it doesn't have anything you need, I think you'd have to write an entity resolver to avoid processing the (external subset of the) DTD.

> -----Original Message-----
> From: Michael Fuller [mailto:[EMAIL PROTECTED]
> Sent: Friday, January 28, 2005 3:20 AM
> To: xerces-c-dev@xml.apache.org
> Subject: Re: How can I ignore DTD in an XML file
>
> On Thu, Jan 27, 2005 at 10:05:15AM +0000, Gareth Reakes wrote:
> > XML Parsers still have to resolve the DTD for entities, even if you
> > don't want to do validation.
>
> Why? If I want to validate using a W3C XML Schema, why do I have to
> read the DTD? Are you suggesting that Xerces will read the external
> DTD and use any entity definitions it finds when it is not doing
> DTD-based validation?
>
> > Cheers,
> > Gareth
>
> Michael
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]