Hi all,
Elena and I were just looking over the proposed final draft of JSR 173,
which is apparently up for final approval ballot voting in little more
than a week. A few things jumped out rather forcefully at us, and we
were wondering whether anyone else had had similar reactions.
Firstly, although we can--with some creativity--imagine applications
that would use the event portion of the spec, it seems that there is a
very natural layering which the spec does not exploit. That is, the much
more broadly applicable streaming portion does not depend on the events
portion in any logical way, but is inextricably intermingled with it. It
seems to us that this is tantamount to making DOM Level 3 Core depend on
DOM Events, or having the base SAX interfaces depend on the SAX
extensions. We are concerned that this dependence on events will lead
fewer people to implement the spec than would be the case if there were
a basic "core" supporting streaming, with the events described in a
separate, optional module.
It also seems to us that the overall quality of the spec is somewhat
dubious. It's easy to find methods that are underspecified, javadocs
that are not well written, and areas that don't appear to have gained
from the lessons learned by other XML parsing APIs. Some examples are:
* Resetting: It is not possible to reset XMLStreamReader. For performance
reasons, an application should be allowed to reset the reader instead of
being required to create a new reader on each parse.
* javax.xml.stream.isCoalescing: This feature should be optional -- for
performance reasons, not every implementation will want to support it.
In addition, this kind of feature might expose new security holes in
implementations.
* EntityResolver: The SAX, DOM L3, and XNI entity resolvers specify that
publicId can be passed to the application. However, StAX chooses to pass
only the uri, and it is not clear how it would be possible to resolve
DTD external entities referenced by publicId or to support XML Catalogs.
It is also unclear how XML Schema resolution could be supported by this
API.
* Surprisingly, there is no way to skip element children. This
requirement has been known for a long time and it feels odd that this is
not part of the API.
* Validation: while other specifications like JAXP specify the schema
type against which validation should occur, StAX fails to specify what
should happen when isValidating (XMLInputFactory) is set to true.
* Character events: the spec fails to clarify at what point character
events should be reported -- is it when the parser is positioned at the
character data, or when the parser has already read the character data?
If the former, then it is not clear how the isWhitespace method can
work, given that the parser will have to read the characters to
determine whether they are white space. If the latter, then it is not
clear why the spec has "int getTextCharacters(...)", since it seems that
this method does not add any functionality but forces a parser to copy
the characters into a user-specified array.
* isWhitespace: the spec fails to specify whether "white space" means
so-called "ignorable whitespace" or white space as defined by XML
1.0/1.1.
* getText* methods: The table "Valid methods for each state" states that
the getText* methods are valid for the CHARACTERS, CDATA, SPACE, and
COMMENT events. However, the descriptions of the text methods are
inconsistent -- sometimes a description mentions only a subset of these
events, which makes one wonder which part of the specification has the
typo. At other times, the description doesn't mention any events at all.
* Namespaces: the javadocs are silent on what happens when namespaces
are turned off.
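
To make the point about skipping element children concrete, here is a
rough sketch of the depth-counting loop an application ends up writing
by hand when the API offers no skip operation. It assumes the
javax.xml.stream API in the form that later shipped with the JDK, which
may differ in detail from the proposed final draft discussed here. The
class SkipChildrenSketch and the methods skipElement and
namesAfterSkipping are hypothetical names used only for illustration;
they are not part of the spec.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class; nothing in it is defined by the spec.
final class SkipChildrenSketch {

    // Advances the reader from a START_ELEMENT to its matching
    // END_ELEMENT, discarding every event in between.
    static void skipElement(XMLStreamReader reader) throws XMLStreamException {
        if (reader.getEventType() != XMLStreamConstants.START_ELEMENT) {
            throw new IllegalStateException("reader must be on START_ELEMENT");
        }
        int depth = 1; // the element we are currently positioned on
        while (depth > 0) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                depth++;
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                depth--;
            }
        }
    }

    // Collects the local names of all elements, except that any element
    // named skipName is skipped together with its whole subtree.
    static List<String> namesAfterSkipping(String xml, String skipName) {
        List<String> names = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        String name = reader.getLocalName();
                        if (name.equals(skipName)) {
                            skipElement(reader);
                        } else {
                            names.add(name);
                        }
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            // Wrapped only to keep the demo method free of checked exceptions.
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        // prints [a, e]
        System.out.println(
                namesAfterSkipping("<a><b><c/><d>text</d></b><e/></a>", "b"));
    }
}
```

Note that even with this workaround the parser still has to tokenize
and report every event inside the skipped subtree; a skip operation in
the API itself would give implementations the chance to avoid that work.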
It's obvious there are a lot of use cases for a pull-parsing API; we are
just curious to see whether anyone else out there is of the view that
what's currently being proposed here might not be well-structured or
mature enough to meet the need...
Cheers,
Neil, Elena
Neil Graham
XML Parser Development
IBM Toronto Lab
Phone: 905-413-3519, T/L 969-3519
E-mail: [EMAIL PROTECTED]