Uche Ogbuji wrote:

>------------------------------
>
>Message: 2
>Date: Thu, 01 Sep 2005 11:59:09 -0600
>From: Uche Ogbuji <[EMAIL PROTECTED]>
>Subject: Re: [XML-SIG] Corrected list of packages handling XML 1.1
>To: Walter Dörwald <[EMAIL PROTECTED]>
>Cc: xml-sig@python.org, Ken Beesley <[EMAIL PROTECTED]>
>Message-ID: <[EMAIL PROTECTED]>
>Content-Type: text/plain; charset=ISO-8859-15
>
>On Thu, 2005-09-01 at 12:50 +0200, Walter Dörwald wrote:
>
>>Ken Beesley wrote:
>>
>>>My apologies to Fredrik Lundh of Pythonware for the omission of
>>>ElementTree+sgmlop in my recent listing of Python-XML packages that
>>>handle XML 1.1. The list (that I'm aware of) currently includes:
>>>
>>>1. pxdom by Andrew Clover
>>>   (http://www.doxdesk.com/software/py/pxdom.html,
>>>   http://www.doxdesk.com/file/software/py/pxdom.py)
>>>2. pyLTXML from the Univ. of Edinburgh
>>>   (http://www.ltg.ed.ac.uk/software/xml,
>>>   http://www.ltg.ed.ac.uk/software/gpl_xml.html,
>>>   http://www.ltg.ed.ac.uk/software/xml/xmldoc/xmldoc.html)
>>>3. elementtree library from Pythonware
>>>   (http://effbot.org/zone/element.htm,
>>>   http://effbot.org/zone/element-index.htm)
>>>
>>>If I've forgotten anyone, please help me complete the list.
>>>
>> > [...]
>>
>>XIST (http://www.livinglogic.de/Python/xist) handles XML 1.1 charrefs
>>when a parser is used that does it. (XIST uses sgmlop by default, so it
>>works by default). When serializing XML those charrefs are always
>>supported. See the following snippet:
>>
>> >>> from ll.xist import parsers, presenters
>> >>> from ll.xist.ns import html
>> >>> e = parsers.parseString("<body>this is a backspace: &#8;</body>")
>> >>> print e.asrepr(presenters.CodePresenter())
>>ll.xist.xsc.Frag(
>>   ll.xist.ns.html.body(
>>      'this is a backspace: \x08'
>>   )
>>)
>> >>> print e.asBytes()
>><body>this is a backspace: &#8;</body>
>>
>
>This conversation is really becoming surreal. People, please, it's very
>simple: supporting the range of character references defined in XML 1.1
>is not, repeat *NOT*, the same thing as being an XML 1.1 parser.
>
>If I have software that parses "<a>b</a>" that does not mean I have an
>XML 1.0 parser. If that software also accepts "<a>b</c>", then it is
>obviously not such.
>
>Any software that accepts "<body>this is a backspace: &#8;</body>"
>is neither a compliant XML 1.0 parser nor a compliant XML 1.1 parser.
>All XML 1.1 documents *must have an XML declaration* according to the
>strict stipulation of the spec. If an XML 1.1 parser encounters a
>document without an XML declaration, it *must* assume that it is an XML
>1.0 document, at which point it would *have to* stop with a fatal error
>when it encounters &#8;. Period. There is no negotiation here.
>
>Therefore, as far as I can tell, neither the ET/sgmlop trick nor XIST
>is an XML 1.1 parser. I cannot speak for LTXML or pxdom, but knowing
>the authors, I would guess that they are indeed compliant XML 1.1
>parsers.
>

What Mr. Ogbuji states about "being an XML 1.1 parser" and "being a compliant XML 1.0 parser [or] a compliant XML 1.1 parser" is of course correct. However, with respect, I believe that he misses the point of the list and the claims made for it.

I posted a list of packages "handling XML 1.1", and Walter Dörwald helpfully added XIST as a package that "handles XML 1.1 charrefs when a parser [like sgmlop] is used that does it". Neither of us claimed that all the listed packages (and especially not the ones using an underlying sgmlop parser) were "XML 1.1 parsers".

Perhaps my terminology is confusing, but what I meant by "handling XML 1.1" is this:

"Handle XML 1.1" = able to process a valid XML 1.1 document without throwing up and quitting.

Sgmlop (http://effbot.org/zone/sgmlop-index.htm) is admittedly non-validating and tolerant:

"The *sgmlop* parser is tolerant, and happily accepts XML-like data that are not well-formed. If you need strictness, use another parser."

In my own work, I do in fact use a second parser, separating the validation from the processing:

1. I prepare XML documents containing some control characters that are valid only in XML 1.1. I always mark the file <?xml version="1.1"?>

2. I then validate the documents using a Relax NG schema and the Jing validating parser, which knows the difference between XML-1.0-valid and XML-1.1-valid.

3. I then need to "handle" or "process" my already-known-to-be-XML-1.1-valid documents, to map them non-trivially into a different XML 1.1 language.

Although ElementTree+sgmlop and XIST+sgmlop cannot be "compliant XML 1.1 parsers", their ability to "handle" an already-known-to-be-XML-1.1-valid document is valuable to me, and perhaps to others who want to work with XML 1.1 documents.

****** That was the point of posting the list of "packages handling XML 1.1".

If there's a better term than "handle XML 1.1", then please inform me, and I'll try to use it.

Ken
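
For anyone who wants the rule at issue spelled out, here is a minimal Python sketch of it. It is an illustration only, not code from sgmlop, XIST, Jing, or any other package mentioned above, and the names declared_version, is_legal_char and illegal_charrefs are hypothetical. It assumes the document is already decoded text, and it scans with regular expressions rather than really parsing, so it also sees character references inside comments and CDATA sections. The character ranges are those of the Char production in the XML 1.0 and XML 1.1 specifications.

```python
import re

# Matches an XML declaration at the very start of the text and captures the
# version number. Simplified: a real parser also handles encoding detection.
_XML_DECL = re.compile(r'''<\?xml\s+version\s*=\s*(["'])(1\.[0-9]+)\1''')

# Matches decimal (&#8;) and hexadecimal (&#x8;) character references.
_CHARREF = re.compile(r'&#(?:x([0-9a-fA-F]+)|([0-9]+));')


def declared_version(document):
    """Return the declared XML version, or "1.0" if there is no declaration."""
    match = _XML_DECL.match(document)
    return match.group(2) if match else "1.0"


def is_legal_char(codepoint, version):
    """Check a code point against the Char production of the given version."""
    # Excluded in both versions: surrogates, U+FFFE, U+FFFF, and anything
    # beyond the Unicode range.
    if 0xD800 <= codepoint <= 0xDFFF or codepoint in (0xFFFE, 0xFFFF):
        return False
    if codepoint > 0x10FFFF:
        return False
    if version == "1.1":
        # XML 1.1 allows every remaining code point except U+0000.
        return codepoint >= 0x1
    # XML 1.0 allows only tab, LF and CR below U+0020.
    return codepoint in (0x9, 0xA, 0xD) or codepoint >= 0x20


def illegal_charrefs(document):
    """List the character references that are illegal for the declared version."""
    version = declared_version(document)
    bad = []
    for match in _CHARREF.finditer(document):
        hex_digits, dec_digits = match.groups()
        codepoint = int(hex_digits, 16) if hex_digits else int(dec_digits)
        if not is_legal_char(codepoint, version):
            bad.append(match.group(0))
    return bad
```

Under these assumptions, illegal_charrefs reports the backspace reference for the undeclared document (which must be read as XML 1.0) and reports nothing once the same text is marked <?xml version="1.1"?>. A check like this is no substitute for a real validating parser such as Jing in step 2 of my workflow.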