Re: [Discuss] Pull Parsing, JSR-173, and Xerces

Jeremy Carroll Tue, 07 Oct 2003 02:56:07 -0700

I am afraid I do not know anything about JSR-173, but I am a user of the Xerces pull parser, and I like the very simple parse-some API offered.

I offer this brief system description for my RDF/XML parser (which is the one used by the W3C RDF Validator).

My code consists of two parsers (an XML parser and an RDF parser) that conceptually act as coroutines. I initially coded them as two threads, one per coroutine, with the XML parser based around the SAX interface. The relevant SAX events gave rise to a sequence of events (defined by me) which were the input to the second parser.

Using the Xerces pull parser, I have inverted the XML parser, and made it a subroutine to the RDF parser. This needed minimal code changes, and leaves a clear conceptual coroutine design, all running in a single thread.
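
As a rough illustration of the inversion described above, here is a minimal sketch in plain Java. It does not use the Xerces API: IncrementalSource, EventSink, PulledEvents and ListSource are hypothetical names, and parseSome() merely stands in for a parse-some style call that does a bounded amount of work and reports whether more input remains. Events are plain strings to keep the sketch short.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a parse-some style parser: each call does a
// bounded amount of work, pushes the resulting events to the sink, and
// returns false once the input is exhausted.
interface IncrementalSource {
    boolean parseSome(EventSink sink);
}

// Receives the user-defined events produced by the XML side.
interface EventSink {
    void accept(String event);
}

// Pull-side adapter: the consumer (the RDF parser in the description above)
// calls next(), and the buffer is refilled on demand in the same thread.
final class PulledEvents implements EventSink {
    private final Deque<String> buffer = new ArrayDeque<>();
    private final IncrementalSource source;
    private boolean more = true;

    PulledEvents(IncrementalSource source) {
        this.source = source;
    }

    @Override
    public void accept(String event) {
        buffer.addLast(event);
    }

    // Returns the next event, or null once the input is exhausted.
    String next() {
        while (buffer.isEmpty() && more) {
            more = source.parseSome(this);
        }
        return buffer.pollFirst();
    }
}

// Toy source that emits a fixed list of events, at most two per call.
final class ListSource implements IncrementalSource {
    private final Iterator<String> it;

    ListSource(List<String> events) {
        this.it = events.iterator();
    }

    @Override
    public boolean parseSome(EventSink sink) {
        for (int i = 0; i < 2 && it.hasNext(); i++) {
            sink.accept(it.next());
        }
        return it.hasNext();
    }
}

final class PullDemo {
    // Drains every event through the pull interface, preserving order.
    static List<String> drain(PulledEvents in) {
        List<String> out = new ArrayList<>();
        for (String e = in.next(); e != null; e = in.next()) {
            out.add(e);
        }
        return out;
    }
}
```

The consumer keeps its own control flow and simply asks for the next event, which is what lets the two conceptual coroutines share one thread.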

Advantages of the Xerces design are:
- the events being pulled are defined by the user rather than by the standard

Disadvantages of the Xerces design:
- the user has to manage the event buffer

A particular issue in my implementation is that I turn attribute-value pairs into events, and they are placed in the buffer subject to order constraints defined by me - in particular the rdf: attributes come before other attributes. Clearly this would not be appropriate for most users.
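
One way such an ordering constraint might be expressed is sketched below. Attr and AttrOrdering are hypothetical names, not taken from the parser discussed here, and the only rule modelled is "rdf: attributes first"; any finer ordering is left out. The sketch relies on List.sort being stable, so attributes keep their original relative order within each group.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical attribute record; not the parser's actual data structure.
final class Attr {
    // The RDF syntax namespace, to which the rdf: prefix is conventionally bound.
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    final String namespace;
    final String localName;
    final String value;

    Attr(String namespace, String localName, String value) {
        this.namespace = namespace;
        this.localName = localName;
        this.value = value;
    }

    boolean isRdf() {
        return RDF_NS.equals(namespace);
    }
}

final class AttrOrdering {
    // Returns a copy in which rdf: attributes precede all others.
    // The sort key is false for rdf: attributes and true otherwise, and
    // false sorts before true; the stable sort preserves document order
    // within each of the two groups.
    static List<Attr> rdfFirst(List<Attr> attrs) {
        List<Attr> ordered = new ArrayList<>(attrs);
        ordered.sort(Comparator.comparing((Attr a) -> !a.isRdf()));
        return ordered;
    }
}
```

Each attribute in the reordered list would then be turned into an event and appended to the buffer in that order.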

A second issue is error handling - I turn all errors into error events and place them in the event buffer. This means they are reported at the appropriate point in the second round of processing.
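
The error handling described above could look roughly like the following sketch. It uses the standard org.xml.sax.ErrorHandler interface; ParseEvent and BufferingErrorHandler are hypothetical names, and treating warnings and fatal errors the same way as ordinary errors is an assumption made for brevity.

```java
import java.util.Deque;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

// Hypothetical event type shared by ordinary events and error events.
final class ParseEvent {
    enum Kind { START, END, ERROR }

    final Kind kind;
    final String detail;

    ParseEvent(Kind kind, String detail) {
        this.kind = kind;
        this.detail = detail;
    }
}

// Instead of reporting or throwing immediately, every problem becomes an
// ERROR event appended to the same buffer as the ordinary events, so the
// consumer meets it at the matching point in the event stream.
final class BufferingErrorHandler implements ErrorHandler {
    private final Deque<ParseEvent> buffer;

    BufferingErrorHandler(Deque<ParseEvent> buffer) {
        this.buffer = buffer;
    }

    @Override
    public void warning(SAXParseException e) {
        enqueue(e);
    }

    @Override
    public void error(SAXParseException e) {
        enqueue(e);
    }

    @Override
    public void fatalError(SAXParseException e) {
        enqueue(e);
    }

    private void enqueue(SAXParseException e) {
        buffer.addLast(new ParseEvent(ParseEvent.Kind.ERROR, e.getMessage()));
    }
}
```

Because the error sits in the queue between the events that surrounded it, the second parser can report it in context rather than at the moment the XML layer detected it.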

Hope this helps.

My ideal is that a pull parsing standard should be close to the current Xerces code.

Jeremy

