Re: Some random ideas around (broken) XML

Julian Reschke Wed, 18 Nov 2009 06:16:44 -0800

Karl Dubost wrote:

...
On Tue, 10 Nov 2009 22:47:52 GMT
In XML - Dive Into Python 3
At http://diveintopython3.org/xml.html#xml-custom-parser


Some people (myself included) believe that it was
a mistake for the inventors of XML to mandate
draconian error handling. Don’t get me wrong; I
can certainly see the allure of simplifying the
error handling rules. But in practice, the concept
of “wellformedness” is trickier than it sounds,
especially for XML documents (like Atom feeds)
that are published on the web and served over
HTTP. Despite the maturity of XML, which
standardized on draconian error handling in 1997,
surveys continually show a significant fraction of
Atom feeds on the web are plagued with
wellformedness errors.


Universal Feed Parser
http://www.feedparser.org/
...

The Universal Feed Parser is part of the problem. As far as I recall, the author was proposing non-draconian Atom parsing even before the Atom spec was done.

So what I'd like to see is data about the *current* state of *Atom* feeds, not RSS. My understanding (see also Sam's comment) is that there are several popular consumers getting away with using proper XML parsers (except for the RFC3023 issue), which would indicate that the *actual* percentage of broken content is smaller than some people think it is.
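
For illustration only, here is a minimal sketch of what "using a proper XML parser" means for a consumer, and how the share of broken feeds in a sample could be counted. It assumes Python's standard xml.etree.ElementTree module; the two sample documents and the helper names is_well_formed and broken_fraction are made up for this example and do not come from any real survey or feed consumer.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents, invented for this sketch.
WELL_FORMED = (
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    "<title>Example</title></feed>"
)
# The <title> element is never closed, so this is not well-formed XML.
BROKEN = (
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    "<title>Example</feed>"
)


def is_well_formed(document: str) -> bool:
    """Return True if a strict XML parser accepts the document."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        # Draconian handling: any wellformedness error rejects the document.
        return False
    return True


def broken_fraction(documents: list[str]) -> float:
    """Fraction of documents that a strict parser rejects."""
    if not documents:
        return 0.0
    broken = sum(1 for doc in documents if not is_well_formed(doc))
    return broken / len(documents)
```

Running broken_fraction over a sample of fetched Atom documents would give the kind of number asked for above. Note that this sketch only checks the document text; it does not look at the HTTP Content-Type charset, so it says nothing about the RFC3023 issue.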


BR, Julian
