Daniel Veillard wrote:

>On Tue, Aug 30, 2005 at 10:07:10AM +0200, Ken Beesley wrote:
>
>>The existing (hidden) Mac parser that parses XML specifications
>>of input methods (into a low-level binary format) already
>>handles  and other control characters now legal in XML 1.1
>>So this hidden Mac parser is XML 1.1-capable, at least as far as
>>control characters are concerned.
>
>  The real problem is that "parser" is from your initial description
>not an XML-1.0 parser nor an XML-1.1 parser. Send some flames to Apple
>for breaking a standard that everybody else tried to conform to. Then
>work around that broken piece in their stack if you want but as always
>for conformance problems workarounds it's just lost time in the long term.

First, I'd like to thank experts like Daniel Veillard, Uche Ogbuji and
others who have responded to my XML 1.1 messages. I very much appreciate
your volunteer work in creating and maintaining tools for XML processing.

Yes, as I pointed out in an earlier message, this Apple behavior is
formally a no-no. It is of course the official duty of a respectable
XML parser to refuse to parse a document marked version="1.0" if it
contains character references like  that are legal only in XML 1.1.

Apple is at fault here, but it should be understood that it's their own
private HIDDEN parser, used for exactly one specific application: this
hidden parser translates OS-X-input-method-defining XML files, defined
by a DTD documented in
http://developer.apple.com/technotes/tn2002/tn2056.html,
into an even less human-friendly binary format that OS X really uses
internally. This hidden parser has only one purpose in life; it's a dog
that knows only one trick. This OS X input-method application naturally
"needs" to refer to XML 1.1 characters, and Apple has apparently wired
XML 1.1 assumptions into this hidden, one-trick parser. Their sin would
be wiped away if they simply required that the input files be marked
properly as version="1.1".

But, again, that's not my "real problem". I need and want to validate
and parse XML 1.1 documents containing character references that are
legal only in XML 1.1. I'm willing and eager to mark the files properly
as version="1.1". I don't want to force XML 1.1 on anyone, but it's
_exactly_ what I need for my application. There must be some other
people out there with the same needs, in particular the people who went
out of their way to write the XML 1.1 recommendation.

The "real problem", or real nuisance, for me is that so few of the open,
general-purpose XML tools can handle XML 1.1 at all. Even if I mark my
XML files properly as version="1.1", the tools can't handle them because
they're limited to XML 1.0.

Here's what I've found so far:

The Jing validating parser, for Relax NG schemas, seems to validate
XML 1.0 vs. XML 1.1 correctly. Nice.
http://www.thaiopensource.com/relaxng/jing.html

pxdom (http://www.doxdesk.com/software/py/pxdom.html) is a pure Python
implementation of DOM, not dependent on Expat, and claims to handle
XML 1.0 and XML 1.1.

PyLTXML, from the Univ. of Edinburgh, also claims to handle XML 1.0 and
XML 1.1. (http://www.ltg.ed.ac.uk/software/xml/)

With pxdom or PyLTXML (still to be tested) it would appear that I can
do what I need to do, using real XML 1.1, and not have to resort to any
workarounds. I'd _prefer_ to use pulldom or perhaps Ogbuji's very
attractive binderytools.pushbind().

If I were half as dedicated to XML 1.1 as Veillard and Ogbuji are to
XML in general, I'd roll up my sleeves and contribute to the development
rather than just begging. :)

Thanks again to all those working on XML tools,

Ken

_______________________________________________
XML-SIG maillist  -  XML-SIG@python.org
http://mail.python.org/mailman/listinfo/xml-sig
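
For readers following the thread, the XML 1.0 versus XML 1.1 distinction
discussed above comes down to which code points a character reference may
name. The following is a minimal Python sketch of that check, assuming the
Char productions of the two recommendations as I understand them. The
function name char_ref_is_legal and the two range tables are hypothetical
names invented for illustration; they are not part of Jing, pxdom, PyLTXML,
or any other tool mentioned here.

```python
# Illustrative sketch only. The ranges follow the "Char" production of each
# recommendation; a character reference must name a code point matching Char.
# All names below are hypothetical, not taken from any XML library.

# XML 1.0: tab, LF, CR, then everything from U+0020 up (minus surrogates,
# U+FFFE and U+FFFF).
XML10_CHAR_RANGES = [
    (0x9, 0xA),
    (0xD, 0xD),
    (0x20, 0xD7FF),
    (0xE000, 0xFFFD),
    (0x10000, 0x10FFFF),
]

# XML 1.1: everything from U+0001 up (minus surrogates, U+FFFE and U+FFFF).
# U+0000 remains illegal in both versions.
XML11_CHAR_RANGES = [
    (0x1, 0xD7FF),
    (0xE000, 0xFFFD),
    (0x10000, 0x10FFFF),
]


def char_ref_is_legal(code_point, version):
    """Return True if a reference to code_point is allowed in this version.

    version must be the string "1.0" or "1.1", as it would appear in the
    XML declaration.
    """
    if version == "1.0":
        ranges = XML10_CHAR_RANGES
    elif version == "1.1":
        ranges = XML11_CHAR_RANGES
    else:
        raise ValueError("unsupported XML version: %r" % (version,))
    return any(low <= code_point <= high for low, high in ranges)
```

Under these ranges, a reference to U+0001 is accepted for version "1.1"
and refused for version "1.0", while a reference to U+0000 is refused
under both. A conforming parser would apply the table selected by the
version in the XML declaration, which is why marking the file properly
matters.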