PROGRESS: I've got a simple XNI (re)validation pipeline running, based on Elena's DOM revalidator. The basic architecture is:
1) Instantiate an XNI parser configuration. I've swiped her DOMValidationConfiguration.
As I've noted elsewhere, I'd really like to see a more generic build-your-own-pipeline
configuration supplied as an off-the-shelf convenience class...
2) Set the configuration's validation features
3) Instantiate the validator (an explicit instance of ...xerces.impl.xs.XMLSchemaValidator)
4) Tell the configuration to add the validator as a component.
5) Give the validator the appropriate base URI.
6) Set the validator as the documentHandler for my XNI serializer (which implements DocumentScanner)
7) Set my XNI de-serializer as the documentHandler for the validator, and as the errorHandler for the configuration.
8) Tell the configuration to reset() itself (resets the components too, including the validator)
9) Tell my serializer to start sending XNI events, via scanDocument(true)
Architecturally, I don't much like the fact that so much has to be done explicitly to the validator and serializer. I'd rather hand those to the configuration and thereafter talk only to the configuration, as if it were any standard parser configuration (or composite Component?). Either I've missed a feature or three, or there's room for some API polishing here. Or both.
But this definitely seems to be working, as far as it goes.
QUESTION: Actually, in my case what I want to do is validate against a set of schemas specified by the application, _NOT_ by directives in the XNI stream. (I may actually be forced to filter xsi: directives out of the stream going to the validator, unless there's some way to tell XMLSchemaValidator to ignore them.)
As I understand it, this should be possible by pre-constructing a GrammarPool and passing that into the configuration before we do the reset(). But I haven't yet been able to make this work.
What I'm currently doing is:
Create a ...impl.xs.XMLSchemaLoader
Create a ...util.XMLGrammarPoolImpl
Tell the Loader to read from the specified public/system/base-URI location
(seems to be working)
Add the resulting Grammar to the GrammarPool
Then, between steps 5 and 6 above:
configuration.setProperty(GRAMMAR_POOL, myGrammarPool)
This doesn't seem to be working. I don't know whether this is because the schema loader and/or grammar pool need to share more data with the rest of the configuration, or because the data stream does contain xsi:schemaLocation directives and they're overwriting the grammar I tried to load, or because I've made some other foolish mistake... any insights into what I'm doing wrong would be welcome.
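One way to rule out the xsi:schemaLocation hypothesis is to strip the location hints before the stream reaches the validator. The following is only a minimal sketch of that idea, and it is written against plain SAX (XMLFilterImpl) rather than XNI so that it runs on a stock JDK; an XNI filter would do the same job in its own startElement callback. The class name XsiHintFilter and the helper survivingAttributes are hypothetical names chosen for illustration. It removes only xsi:schemaLocation and xsi:noNamespaceSchemaLocation, leaving xsi:type and xsi:nil alone.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical SAX filter: drops schema-location hints, passes everything else through.
class XsiHintFilter extends XMLFilterImpl {
    static final String XSI_NS = "http://www.w3.org/2001/XMLSchema-instance";

    XsiHintFilter(XMLReader parent) {
        super(parent);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        AttributesImpl kept = new AttributesImpl();
        for (int i = 0; i < atts.getLength(); i++) {
            String name = atts.getLocalName(i);
            boolean isHint = XSI_NS.equals(atts.getURI(i))
                    && ("schemaLocation".equals(name)
                        || "noNamespaceSchemaLocation".equals(name));
            if (!isHint) {
                kept.addAttribute(atts.getURI(i), name, atts.getQName(i),
                                  atts.getType(i), atts.getValue(i));
            }
        }
        // Forward the element downstream with the hints removed.
        super.startElement(uri, localName, qName, kept);
    }

    // Illustration helper: parse through the filter and list the attribute
    // qNames that the downstream handler actually sees.
    static List<String> survivingAttributes(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XsiHintFilter filter = new XsiHintFilter(factory.newSAXParser().getXMLReader());
        final List<String> seen = new ArrayList<>();
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String u, String l, String q, Attributes a) {
                for (int i = 0; i < a.getLength(); i++) {
                    seen.add(a.getQName(i));
                }
            }
        });
        filter.parse(new InputSource(new StringReader(xml)));
        return seen;
    }
}
```

If validation starts using the preloaded grammar once the hints are filtered out, the in-stream directives were the culprit; if not, the problem lies in how the pool is wired into the configuration.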
Note that my grammar loading code is currently running LONG before the configuration gets built, because I may want to use it across multiple validations. I might be able to move more of the pipeline construction back to that point, if it's necessary.
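For comparison, the "load the grammar early, reuse it for many validations" pattern can be sketched with the standard JAXP validation API (javax.xml.validation). This is a swapped-in analog, not the XNI pipeline described above: a Schema object plays roughly the role of the preloaded grammar, and each validation run gets a fresh Validator. The class name PreloadedSchemaCheck, the inline one-element schema, and the method names are all made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper: compile a schema once, then validate documents against it.
class PreloadedSchemaCheck {
    // Toy schema used only for this sketch: a single <count> element holding an int.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='count' type='xs:int'/>"
        + "</xs:schema>";

    // Done once, long before any document arrives.
    static Schema load(String xsd) throws SAXException {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        return factory.newSchema(new StreamSource(new StringReader(xsd)));
    }

    // Done per document: a new Validator bound to the already-compiled schema.
    static boolean isValid(Schema schema, String xml) throws IOException {
        Validator validator = schema.newValidator();
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // With no ErrorHandler set, validation errors surface as SAXException.
            return false;
        }
    }
}
```

The split between load and isValid mirrors the split between early grammar loading and later pipeline construction; only the cheap per-document object is created at validation time.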
That brings up another question: Are Grammar Pools reentrant, at least once they've been locked? I presume Configurations as a whole are _not_, given that they've got several kinds of stored data. I may get several requests to validate different documents at once, and I need to know how paranoid I should be to prevent cross-talk between them.
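As a point of reference only, the JAXP validation API draws this line explicitly: its documentation describes Schema as thread safe and Validator as neither thread safe nor re-entrant. Whether XMLGrammarPoolImpl and the XNI configurations behave the same way is exactly the open question here, so the sketch below says nothing about them. It just illustrates the "share the immutable grammar, give each request its own mutable pipeline" pattern; ConcurrentValidation and its method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper: one shared Schema, one private Validator per request.
class ConcurrentValidation {
    static Schema compile(String xsd) throws SAXException {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        return factory.newSchema(new StreamSource(new StringReader(xsd)));
    }

    // Validates every document on a small thread pool; results keep input order.
    static List<Boolean> validateAll(Schema schema, List<String> docs) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Boolean>> futures = new ArrayList<>();
            for (String doc : docs) {
                futures.add(pool.submit(() -> {
                    // Never share a Validator between tasks: create it inside the task.
                    Validator validator = schema.newValidator();
                    try {
                        validator.validate(new StreamSource(new StringReader(doc)));
                        return true;
                    } catch (SAXException e) {
                        return false;
                    }
                }));
            }
            List<Boolean> results = new ArrayList<>();
            for (Future<Boolean> future : futures) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

The conservative equivalent for the XNI pipeline would be one configuration, validator, and serializer set per in-flight request, with only the locked grammar pool shared, if it turns out to be safe to share.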
______________________________________
Joe Kesselman / IBM Research
