Daniel Shane wrote:
> Is the pull parsing in Xerces validating also? If so I think it would 
> make Xerces the first validating pull parsing parser available :)

The standard parser configuration supports DTD and XML
Schema validation and can be used as a pull parser.
However...

The Xerces2 concept of "pull parsing" is a little different
than what is commonly referred to as pull-parsing. In the
Xerces view, it's a way of allowing the application to parse
a document in "meaningful chunks". Basically, any Xerces2
parser configuration that implements the
XMLPullParserConfiguration interface can break at reasonable
places during the parse. However, what is "reasonable" is
left up to the implementing parser configuration.

In Xerces2 pull-parsing terms, the parsed information is
still delivered to the application via the normal SAX-like
callbacks. This is different from the recent pull-parsing
efforts such as XPP[1], which are more similar to lex: the
application drives the parser and requests information from
it, instead of having the information pushed to registered
handlers. Make sense?
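
To make the contrast concrete, here is a minimal sketch in
Java. It is an illustration only: it uses the JDK's SAX API
for the push style and the JDK's StAX API (javax.xml.stream)
as a stand-in for an application-driven pull API. It uses
neither XPP nor the Xerces2 XNI interfaces, and the class
and method names are made up for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of Xerces2 or XPP.
class PushVersusPull {

    // Push style: the parser owns the loop and calls the handler.
    static List<String> pushNames(String xml) throws Exception {
        List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                names.add(qName);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return names;
    }

    // Pull style: the application owns the loop and asks for the next event.
    static List<String> pullNames(String xml) throws Exception {
        List<String> names = new ArrayList<>();
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                names.add(reader.getLocalName());
            }
        }
        reader.close();
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<doc><a>hi</a><b/></doc>";
        System.out.println("push: " + pushNames(xml));
        System.out.println("pull: " + pullNames(xml));
    }
}
```

Both methods collect the same element names; the difference
is only in who controls the loop.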

However, I've recently written a buffered pull parser
configuration for Xerces2 that guarantees that one and only
one callback is sent for each call to "parse". It does this
by buffering events as they are received from the underlying
parser configuration. There are two nice things about this:
1) the application knows that only a single event will be
delivered at a time and can act appropriately; and 2) any
XNI parser configuration that does *not* implement the
XMLPullParserConfiguration interface can be used as if it
*did* implement the interface. (In the latter case the
entire document is buffered, but this is fine when
dealing with small to medium-sized documents.)
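
The buffering idea can be sketched roughly as follows. This
is not the actual buffered configuration: it assumes a plain
JDK SAX parser as the underlying push source, buffers the
whole document on the first call (the "does not implement
the pull interface" case), and records only start/end element
events. The names BufferedPullSketch and EventHandler are
hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Queue;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch; does not use the Xerces2 XNI interfaces.
class BufferedPullSketch {

    // Hypothetical stand-in for the application's callback handler.
    interface EventHandler {
        void event(String description);
    }

    private final Queue<String> buffer = new ArrayDeque<>();
    private final String xml;
    private final EventHandler handler;
    private boolean buffered = false;

    BufferedPullSketch(String xml, EventHandler handler) {
        this.xml = xml;
        this.handler = handler;
    }

    // Delivers at most one buffered event per call.
    // Returns true while more events remain to be delivered.
    boolean parse() throws Exception {
        if (!buffered) {
            fillBuffer();
            buffered = true;
        }
        String next = buffer.poll();
        if (next != null) {
            handler.event(next);
        }
        return !buffer.isEmpty();
    }

    // Runs the underlying push parser to completion, queueing its events.
    private void fillBuffer() throws Exception {
        DefaultHandler collector = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                buffer.add("start:" + qName);
            }
            @Override
            public void endElement(String uri, String localName, String qName) {
                buffer.add("end:" + qName);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), collector);
    }

    public static void main(String[] args) throws Exception {
        BufferedPullSketch parser =
                new BufferedPullSketch("<doc><a/></doc>", System.out::println);
        boolean more;
        do {
            more = parser.parse();   // exactly one callback per call here
        } while (more);
    }
}
```

The application drives the loop, yet each event still arrives
through a callback, one per call to parse().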

<aside>
Incidentally, I have noticed on my own system that there
is no significant performance reduction over the standard
parser configuration when the events are buffered. This
seems to hold even when the document is content-heavy
(e.g. ot.xml).
</aside>

I intend to release my buffered parser configuration as
part of the CyberNeko Tools for XNI[2] package within the
next couple of weeks. With it, I plan to include a true
pull-parsing API. Granted, this API will just be for
experimentation but it will show what can be done using
the XNI framework.

But I'm getting away from the point...

I mentioned XPP earlier. I believe it includes an
implementation that is driven by Xerces2. However, I
don't know if it buffers the underlying events properly.
But if that API is useful for you, then that's another
way to get validation using a pull-parsing API.

[1] http://www.xmlpull.org/
[2] http://www.apache.org/~andyc/neko/

-- 
Andy Clark * [EMAIL PROTECTED]

