Re: [ANNOUNCEMENT]: Xerces-J 2.6.0 now available

Arnaud Le Hors Sat, 22 Nov 2003 13:38:25 -0800

Oh I see, you're one of those who just don't want to hear about any new version of XML. But as you say yourself, the idea that there is only one way to parse XML is already an illusion. The list of possible behaviors for a compliant XML 1.0 parser is already large enough that one cannot blindly rely on interoperability just because one uses XML. You have to deal with a whole range of variations, from a minimally conformant non-validating parser to a fully conformant validating parser, with or without namespace support, with or without XML Schema validation, with or without XInclude support, and the list goes on... So the reality is that adding XML 1.1 to the mix doesn't make much difference. And ignoring the need for evolution is absurd.
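To make the point about variations concrete, here is a minimal sketch, assuming the parser bundled with the JDK behind JAXP's DocumentBuilderFactory. The class name ParserVariations and the sample namespace URI are hypothetical, chosen only for illustration. It parses the same document twice, once with namespace support and once without, and reports what each configuration sees as the namespace of the root element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper: same document, two parser configurations.
final class ParserVariations {

    // Returns the namespace URI the DOM reports for the root element,
    // or null if the parser did not do namespace processing.
    static String rootNamespace(String xml, boolean namespaceAware) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(namespaceAware);
            Element root = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return root.getNamespaceURI();
        } catch (Exception e) {
            // Configuration, I/O, and well-formedness errors all end up here.
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<p:doc xmlns:p=\"urn:example\"/>";
        System.out.println("namespace-aware:     " + rootNamespace(xml, true));
        System.out.println("not namespace-aware: " + rootNamespace(xml, false));
    }
}
```

Validation, XML Schema, and XInclude are further switches of the same kind (setValidating and setXIncludeAware on the same factory), so two "compliant" setups can already disagree about one XML 1.0 document.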

Believe me, I'm the first to wish they had gotten XML 1.0 right in the first place, but they didn't. And I don't blame them, that's life. Just be glad that those of us who fought to make XML 1.1 open enough that it won't need to be revised again every time a new version of Unicode comes out prevailed.

What I think we need is to define some kind of profiles that narrow the number of possible behaviors down to just a few. Something that is named, that implementers can advertise as something they support, and that users can look for. Similar to what you get with the Java platform labels J2EE, J2SE, J2ME, etc.

I was discussing this with Tim Berners-Lee this week and he agrees that it would be good. However, he thought that it should be a new version of XML. That is something I'd rather avoid, but I have to admit that it'd be good to be able to label your document somehow so that the processor could tell whether it can process it as you want it to be processed. This also brings us back to the processing model issue, which has been in the air for a while. I don't see these problems being solved any time soon, unfortunately.

So, in the meantime, the best we can do is to work on making 1.1 omnipresent as quickly as possible so that we can put the burden of transitioning behind us.

--
Arnaud Le Hors - IBM, XML Standards Strategy Group / W3C AC Rep.

Elliotte Rusty Harold wrote:

At 10:03 AM -0800 11/22/03, Arnaud Le Hors wrote:

I still don't understand how you think you increase interoperability by limiting a system in what it takes as input, but I guess it's ok...

There's no big secret here. It's *exactly* how XML increases interoperability by requiring draconian well-formedness checking. As the number of maybe-right/maybe-wrong formats increases, interoperability decreases. The XML philosophy is to fail as early and as fast as possible, and that works.

When I see XML 1.1, I want the parser to fail immediately and at the first opportunity. I don't want to let the document propagate through the system until it finally runs up against some component that can't handle it. I certainly don't want to accept a million documents needlessly labelled as XML 1.1, only to have the system fail unexpectedly in production when one of those documents uses a NEL or a Linear B tag name or something equally pointless. I want to debug it as near the source as possible, not a thousand miles and six months away.

Failing fast on any variation from a single syntax is one of the pillars of XML's interoperability. It's sad to see this lesson has not been learned or the principle adopted by so many purveyors of XML tools. In fact, I'd venture to say that most non-parser XML tools are far too accepting of bad data.

In any case, I have now added unit tests to my code to verify that version="1.1" (and 1.2, and 2.0) generate a fatal error; so I should notice if this bug slips in again in the future, either in Xerces or a different parser. I'm not surprised that all other parsers haven't been noticing this. I've certainly found a lot of bugs in them. Xerces 2.6 currently has one conformance bug that affects me <http://nagoya.apache.org/bugzilla/show_bug.cgi?id=24124>. The next best parser, the latest Oracle beta, has about ten. The remaining parsers written in Java all number in the dozens.
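A check of this kind could be sketched as follows. This is not the actual test code referred to above, only a minimal illustration using the StAX API that ships with the JDK; the class name Xml10Gate and its method names are hypothetical. It looks at the version in the XML declaration before anything else touches the document, and treats a refusal by the parser itself the same as a rejected version.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical gate: accept only documents that declare version 1.0
// or that carry no XML declaration at all.
final class Xml10Gate {

    static boolean isXml10(String xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            try {
                // getVersion() is null when no version was declared.
                String version = reader.getVersion();
                return version == null || "1.0".equals(version);
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            // The parser refused the declaration (or the start of the
            // document) on its own, which also counts as a rejection.
            return false;
        }
    }

    // Fail-fast variant: throw at the point of entry instead of
    // letting the document travel further into the system.
    static void requireXml10(String xml) {
        if (!isXml10(xml)) {
            throw new IllegalArgumentException("document is not XML 1.0");
        }
    }

    public static void main(String[] args) {
        String[] versions = {"1.0", "1.1", "1.2", "2.0"};
        for (String v : versions) {
            String xml = "<?xml version=\"" + v + "\"?><doc/>";
            System.out.println(v + " accepted: " + isXml10(xml));
        }
    }
}
```

Note that this only inspects the declaration; well-formedness of the rest of the document is still left to the real parse that follows.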

Sadly, XML's goal of being simple enough to be implemented by the desperate Perl hacker was not met. If, after six years and the application of huge amounts of effort and brain power, we still don't have one parser that gets the basic spec right, then there's something wrong with the spec. Sadly, most of these problems are not addressed by XML 1.1. :-(
