Neil and Elena raise very good points that should be addressed before this specification goes any further. But I wonder if those points are symptoms of a deeper problem.
Firstly, StAX looks too much like XMLPULL, which I've had reservations about in the past. But it bears repeating...
I think that the API is too big. All of the methods (and there are a lot of them) are placed on a single interface. It reminds me a lot of the DOM Node interface, where which methods are meaningful depends on the node type. A proper layering of interfaces could help this situation a lot.
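As a rough illustration of what such layering could look like, here is a minimal sketch. Every name in it (LayeredEvents, Event, StartElement, Characters, summarize) is hypothetical and is not taken from StAX or XMLPULL; the only point is that element-specific methods sit on an element-specific interface rather than on one catch-all interface.

```java
import java.util.Arrays;
import java.util.List;

public class LayeredEvents {
    // Base layer: only what every event has in common.
    interface Event {
        String describe();
    }

    // Element-only methods live here, not on the base interface.
    interface StartElement extends Event {
        String name();
        String attributeValue(String attrName); // null if the attribute is absent
    }

    // Text-only methods live here.
    interface Characters extends Event {
        String text();
    }

    // Hypothetical factory for a start-element event with a single attribute.
    static StartElement startElement(final String name, final String attrName, final String attrValue) {
        return new StartElement() {
            public String name() { return name; }
            public String attributeValue(String n) { return attrName.equals(n) ? attrValue : null; }
            public String describe() { return "start:" + name; }
        };
    }

    // Hypothetical factory for a character-data event.
    static Characters characters(final String text) {
        return new Characters() {
            public String text() { return text; }
            public String describe() { return "text:" + text; }
        };
    }

    // The caller narrows to the layer it needs; calling text() on an
    // element event is a compile-time error, not a runtime surprise.
    static String summarize(List<Event> events) {
        StringBuilder sb = new StringBuilder();
        for (Event e : events) {
            if (e instanceof StartElement) {
                StartElement s = (StartElement) e;
                sb.append(s.name()).append("[id=").append(s.attributeValue("id")).append("]:");
            } else if (e instanceof Characters) {
                sb.append(((Characters) e).text());
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.<Event>asList(
                startElement("doc", "id", "42"), characters("hello"));
        System.out.println(summarize(events));
    }
}
```

The trade-off is an instanceof check (or a visitor) at the call site in exchange for interfaces that only expose methods valid for that kind of event.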
But looking deeper, I wonder if an entirely different approach is needed. I like the idea behind pull-parsing APIs but StAX and XMLPULL aren't the kind of API I want to use to write applications. Don't get me wrong -- it works. But so does XML Schema...
In addition, I have concerns about implementing StAX in Xerces. We designed XNI as a streaming model similar to SAX. This enables the model to be extremely flexible in the way that parser configurations are assembled. Pull APIs turn this model on its head, which causes problems.
The major problem is that the mismatch of APIs requires us to buffer document events moving through the pipeline. This degrades performance. Some of this can be regained by adjusting the character buffering code, though.
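To show where the buffering comes from, here is a minimal sketch of a push-to-pull adapter. It is not XNI or Xerces code: PushHandler, BufferingAdapter and pushDocument are hypothetical stand-ins, and events are reduced to strings. The assumption is a single-threaded push source that cannot be paused, so every event it emits has to be stored until the application asks for it.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class PushToPullBuffer {
    // Hypothetical push-style callback, loosely in the spirit of SAX handlers.
    interface PushHandler {
        void startElement(String name);
        void characters(String text);
        void endElement(String name);
    }

    // Adapter: receives pushed events and holds them until a caller pulls.
    static final class BufferingAdapter implements PushHandler {
        private final Deque<String> buffer = new ArrayDeque<String>();
        private int highWaterMark = 0;

        public void startElement(String name) { enqueue("START " + name); }
        public void characters(String text) { enqueue("TEXT " + text); }
        public void endElement(String name) { enqueue("END " + name); }

        private void enqueue(String event) {
            buffer.addLast(event);
            highWaterMark = Math.max(highWaterMark, buffer.size());
        }

        // Pull side of the adapter.
        boolean hasNext() { return !buffer.isEmpty(); }
        String next() { return buffer.removeFirst(); } // throws if the buffer is empty

        // Largest number of events held at once: the cost of the mismatch.
        int highWaterMark() { return highWaterMark; }
    }

    // Stand-in for a push pipeline: it drives the handler to completion.
    static void pushDocument(PushHandler handler) {
        handler.startElement("doc");
        handler.characters("hello");
        handler.endElement("doc");
    }

    public static void main(String[] args) {
        BufferingAdapter adapter = new BufferingAdapter();
        pushDocument(adapter);            // the whole document lands in the buffer
        while (adapter.hasNext()) {
            System.out.println(adapter.next());
        }
        System.out.println("peak buffered events: " + adapter.highWaterMark());
    }
}
```

In this sketch the peak buffer size equals the number of events in the document. Running the push side on its own thread with a bounded queue would cap the memory, but at the price of thread hand-offs for every event.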
In order to implement a pull-parsing API efficiently, we would need to change the framework completely and re-implement all of the components. But doing this makes it hard to support the flexibility and modularity that we take for granted with the current XNI.
I think it would be wise to have a detailed discussion about the impact of StAX on Xerces before we commit to implementing the specification.
What do other people have to say? What is good about StAX? What is bad? All comments welcome.
-- Andy Clark * [EMAIL PROTECTED]