Hi,

Let me inject some additional input from a Java perspective:
At the moment, the Java validation architecture and the C++ validation architecture are a little bit out of sync, but not in a huge way. The Java validator also talks directly to the scanner, and application-level APIs are presented with their data by the validator. This allows us to use one validator to provide validation for both SAX and DOM. We are discussing modifications to the Java validator that would make it a "regular" event handler, which would correspond roughly to the notion of document handler in the old Java implementation and the current C++ implementation.

My opinion is that there is a bunch of stuff which is up in the air, and which we should be considering or reconsidering. I also think that this mailing list is the right place for this discussion.

There are a couple of issues being tossed around in this thread.

1. Is it possible to make the validator a standalone thing outside of the parser? This includes the question of revalidation.

Paul and Scott are interested in this for different reasons.

Paul wants to plug a validator in at various points in a SAX-connected process. I'd like to understand why he wants to do this. If we proceed with making the validator a regular document handler, we are part of the way to what I think Paul wants. Today the Java validator still relies on pools that live outside the validator itself. One consequence of this is that it is hard to make a DTD/Schema cache. By the time we build that cache, we should have an "object" which will have all the info it needs (predigested) to process a Schema or DTD. Getting what I think Paul wants then involves some API on the validator that translates between pool IDs and strings.

Scott is interested in being able to validate a DOM tree without writing it out and re-reading it. In the 2.0.x version of XML4J we had a RevalidatingDOMParser which would allow you to change a DOM tree and call the validator to make sure that the changed tree was still valid.
We haven't ported this functionality forward to Xerces because we want to do it in a way that also supports Schema. This is a goal for us.

2. What is the plan for accessing information in a type-safe way (i.e., how do I get the value of a DOM node as a float)?

At the moment, the schema efforts are focusing on validity checking: does a piece of text follow the structural rules set out by Schema? We know that we are going to have to present an application with typed data. This will involve the creation of another API which talks to the internals of the validator. The type-aware versions of SAX and DOM will then talk to this API to get typed information.

At the moment, our thinking is to integrate the type-safe API with the validation code, since the code that converts data from strings to floats will have to do similar work to the code that checks that a string conforms to float syntax. I do see the value of having the access API not depend on the validator, but it seems to me that the information which the access API needs to do its job is a decent fraction of the information that a validator has to construct anyway.

Ted
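
P.S. To make the point under issue 2 a bit more concrete, here is a rough sketch of why the float check and the float conversion want to live together. This is not code from Xerces or XML4J. The class name FloatDatatype, its methods, and the simplified float syntax (plain decimal with an optional exponent, no INF or NaN) are all made up for illustration.

```java
import java.util.Optional;
import java.util.regex.Pattern;

/** Hypothetical datatype object: one parsing routine backs both validation and typed access. */
final class FloatDatatype {
    // Assumed, simplified lexical form: optional sign, decimal digits, optional exponent.
    private static final Pattern LEXICAL =
        Pattern.compile("[+-]?(\\d+(\\.\\d*)?|\\.\\d+)([eE][+-]?\\d+)?");

    /** Shared work: check the syntax, then convert. Empty means the text is not a float. */
    static Optional<Float> parse(String text) {
        String s = text.trim();
        if (!LEXICAL.matcher(s).matches()) {
            return Optional.empty();
        }
        return Optional.of(Float.parseFloat(s));
    }

    /** What a validator would ask: does this text conform to float syntax? */
    static boolean isValid(String text) {
        return parse(text).isPresent();
    }

    /** What a type-aware SAX or DOM layer would ask: give me the value as a float. */
    static float floatValue(String text) {
        return parse(text).orElseThrow(
            () -> new IllegalArgumentException("not a float: " + text));
    }
}
```

Both isValid and floatValue go through the same parse method, which is the overlap I mean. An access API that did not depend on the validator would need its own copy of that logic, or a shared datatype library underneath both.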
