Thanks for the excellent explanation, Glenn. Clearly the issue of what to do when *parsing* an xml document which has standalone=yes is not trivial, and the results will be parser-dependent, and parser-configuration-dependent. It is very useful to know this. And very interesting to know that "validation=false" in Xerces really just suppresses error reporting, rather than making Xerces a non-validating parser. That explains a few things...
However, the issue I face is *serialization* of XML. Basically, I need to read XML generated by customer A, transform it using XSLT and serialize it into a format as specified by customer B. Now, if customer B says that they want standalone=yes and a doctype with a public or system ID, there is currently *no way* to generate this output using the XMLSerializer class. If there is a public or system ID, then any standalone attribute I specify in the OutputFormat is ignored. What I would need to do is post-process the output (sed??) to add the standalone attribute!

What I was wondering when I initially raised this issue was whether there was something in the XML spec which meant this combination was invalid XML. What I understand from your explanation is that specifying standalone='yes' is generally a pretty bad/dangerous idea when there is an external DTD, but it is not actually forbidden by the XML standard. Therefore, I think that it should be possible. And I would like to note that in my situation, there is no guarantee that the XML I generate from Xerces is going to be parsed by Xerces...

This isn't a major or urgent issue for me; no customer has yet asked for this combination. I was working on improving serialization in my application, and spotted the standalone issue as a possible future problem for me rather than a current one.

To summarize: I think the bug raised is actually valid. While inadvisable, it is _legal_ for an XML document to have standalone=yes and a public or system ID in the DOCTYPE, and therefore the XMLSerializer class should not prevent the user (me) from generating such output if they really want it.

Questions: Would changing this behaviour (automatic suppression of standalone=yes) break any existing code out there? If not, then I would be happy to submit a patch. It will be pretty trivial [the kind of patch I like]. I would certainly include a note in the javadoc for method setStandalone advising against using it.
Or we could just close the bug & forget about it; I'm not deeply emotionally attached to this issue ;-)

Other notes: Glenn's response raises the issue of whether some kind of Xerces configuration item (settable feature) is needed to tell Xerces to skip reading of external entities when standalone=yes, but to read them when standalone=no. I like this idea. This is the behaviour I (naively) expected to occur by default, but having it as an optional feature is reasonable, with the default behaviour instead being the "safest" option of processing external entities anyway.

Regards,

Simon
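
P.S. For anyone who hits the same thing before the serializer is changed: the post-processing workaround mentioned above does not strictly need sed. Below is a rough sketch in plain Java, nothing more. The class and method names (StandaloneFixup, forceStandaloneYes) are made up for illustration and are not part of Xerces, and the sketch assumes the serialized document is available as a String that begins with an XML declaration. It only rewrites the declaration; it does nothing to check that standalone="yes" is actually safe for the document.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Xerces): patches the XML declaration
// of an already-serialized document so that it says standalone="yes".
final class StandaloneFixup {

    // An XML declaration at the very start of the text; group 1 holds
    // the pseudo-attributes (version, encoding, possibly standalone).
    private static final Pattern XML_DECL =
            Pattern.compile("^<\\?xml(\\s[^?]*?)\\s*\\?>");

    // An existing standalone pseudo-attribute, single- or double-quoted.
    private static final Pattern STANDALONE =
            Pattern.compile("\\sstandalone\\s*=\\s*(\"[^\"]*\"|'[^']*')");

    private StandaloneFixup() {
    }

    static String forceStandaloneYes(String xml) {
        Matcher m = XML_DECL.matcher(xml);
        if (!m.find()) {
            // No declaration to patch: return the input unchanged.
            return xml;
        }
        // Drop any existing standalone value, then append the new one.
        // standalone has to come last in the declaration, after
        // version and encoding, so appending keeps the order legal.
        String attrs = STANDALONE.matcher(m.group(1)).replaceAll("");
        String decl = "<?xml" + attrs + " standalone=\"yes\"?>";
        // Everything after the declaration (DOCTYPE, root element)
        // is copied through untouched.
        return decl + xml.substring(m.end());
    }
}
```

You would call StandaloneFixup.forceStandaloneYes(...) on whatever the XMLSerializer wrote (for example the contents of a StringWriter) before sending it on to customer B. A regex over the declaration is crude, but the declaration is the only part touched, so the DOCTYPE with its public or system ID passes through as serialized.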
