Steven Knight <[EMAIL PROTECTED]> wrote:
When I try to process an individual fragment without the declarations, I of course get "Undeclared entity" fatal errors for these entities, which terminates parsing of the fragment.
Yep. I think this comes from expat itself, so it might not be avoidable without using or subclassing a different parser, e.g. xmlproc.
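For anyone wanting to reproduce this, here is a minimal sketch that feeds a fragment straight to expat through the standard library's xml.parsers.expat module. The fragment text and the parse_fragment helper are made up for illustration; only the expat calls themselves are real.

```python
import xml.parsers.expat as expat

# Hypothetical fragment: well-formed apart from two entities nobody declared.
FRAGMENT = "<para>Copyright&nbsp;&copy; 2004</para>"


def parse_fragment(text):
    """Return None if expat accepts the text, else expat's error message."""
    parser = expat.ParserCreate()
    try:
        parser.Parse(text, True)  # True marks this as the final chunk
    except expat.ExpatError as err:
        # err.code is expat's numeric error code; ErrorString maps it to text.
        return expat.ErrorString(err.code)
    return None


error = parse_fragment(FRAGMENT)
print(error)
```

The error is raised at the first undeclared reference and parsing stops there, so nothing after it in the fragment reaches the handlers.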
If you aren't tied to SAX, the pxdom parser will cope with undeclared entities. (If you use a DOM Level 3 ErrorHandler it'll receive a DOMError 'pxdom-unbound-entity' with severity WARNING.)
Having to declare a DTD just to be able to perform some specific preprocessing on an otherwise well-formed fragment of XML feels way too heavyweight to me.
For me too - pxdom's lenient behaviour was originally intended to allow PXTL to pass entities through without having to define them in a separate doctype for 'target doctype plus PXTL'.
Other Python DOM tools don't really support the idea of keeping hold of unexpanded entity references, so they can't do anything but complain if they get an undeclared one.
However, note that in certain circumstances (in short: when the parser can know that there are no unprocessed DTD declarations) an undeclared entity is a well-formedness error rather than a validity error. So technically your document might not be well-formed, which would be a Bad Thing.
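To make that distinction concrete, here is a rough sketch of how expat itself appears to treat the two cases, as far as I understand its behaviour. With no doctype at all it knows there is nothing left unread and fails hard. With a doctype naming an external subset it has not read, it hands the reference to SkippedEntityHandler instead. The file name unread.dtd and the undeclared_entities helper are invented for the example, and nothing is fetched, because parameter entity parsing is off by default.

```python
import xml.parsers.expat as expat


def undeclared_entities(text):
    """Parse text and return the entity names expat reports as skipped."""
    skipped = []

    def on_skipped(name, is_parameter_entity):
        skipped.append(name)

    parser = expat.ParserCreate()
    parser.SkippedEntityHandler = on_skipped
    parser.Parse(text, True)
    return skipped


# Hypothetical documents; "unread.dtd" is never opened.
WITH_EXTERNAL_SUBSET = '<!DOCTYPE para SYSTEM "unread.dtd"><para>a&nbsp;b</para>'
NO_DOCTYPE = "<para>a&nbsp;b</para>"

skipped_names = undeclared_entities(WITH_EXTERNAL_SUBSET)
print(skipped_names)

try:
    undeclared_entities(NO_DOCTYPE)
    fatal = None
except expat.ExpatError as err:
    fatal = expat.ErrorString(err.code)
print(fatal)
```

So the fragment-without-declarations case is exactly the one where the parser is entitled to call the document not well-formed.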
In summary, entities suck, make everything harder, and should have been left out of XML completely. </controversy>
-- Andrew Clover mailto:[EMAIL PROTECTED] http://www.doxdesk.com/
_______________________________________________ XML-SIG maillist - [EMAIL PROTECTED] http://mail.python.org/mailman/listinfo/xml-sig