The latest Xerces code supports revalidation of a DOM tree against XML
Schema grammars.
The code can be downloaded from:
http://gump.covalent.net/jars/latest/xml-xerces2/

The support is available through the DOM L3 implementation of
normalizeDocument() [1].

Comments/feedback/bug-reports are VERY welcome. :)


IMPORTANT!! 
-----------
The code is experimental: methods and classes may be removed,
modified, or renamed.

In particular, in the latest code the DOM L3 functionality is accessible
via org.apache.xerces.dom.DocumentImpl.
However, we want to reorganize the Xerces DOM implementation code to
separate the DOM L2 and DOM L3 implementations, so in the future the
DOM L3 functionality may move to another class.

Parsing documents
-----------------
Use the DOMBuilder parser instead of DOMParser if you plan to revalidate a
document.
You can create a DOMBuilder in either of two ways:
(a) new org.apache.xerces.parsers.DOMBuilderImpl();
(b) call org.apache.xerces.dom.DOMImplementationImpl.createDOMBuilder()
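
For readers on a later toolchain: in the final DOM Level 3 Load and Save
Recommendation the DOMBuilder interface was renamed LSParser. A minimal
sketch of obtaining a parser through that standardized API, as shipped in
the JDK, might look like the following. It does not use the experimental
Xerces classes named above, and the ParseSketch class and parse() helper
are made up for illustration.

```java
import java.io.StringReader;

import org.w3c.dom.Document;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;

// Hypothetical helper, not part of Xerces.
class ParseSketch {
    static Document parse(String xml) throws Exception {
        // Look up a DOM implementation that supports Load and Save.
        DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
        DOMImplementationLS ls =
                (DOMImplementationLS) registry.getDOMImplementation("LS");

        // LSParser is the standardized successor of DOMBuilder.
        LSParser parser = ls.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);

        // Feed the parser from an in-memory string.
        LSInput input = ls.createLSInput();
        input.setCharacterStream(new StringReader(xml));
        return parser.parse(input);
    }
}
```

Calling ParseSketch.parse("<a><b/></a>") should hand back a Document whose
documentElement is <a>.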


How to tell the DOM implementation to revalidate the tree
---------------------------------------------------------
DOM L3 provides setNormalizationFeature() [2] on the Document interface,
which lets users specify which operations normalizeDocument() should
perform. To revalidate:
(1) Cast the Document to DocumentImpl.
(2) Call document.setNormalizationFeature("validate", true).
(3) Call document.normalizeDocument() to start (re)validation.
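
As a rough illustration of the same steps on a later toolchain: the final
DOM Level 3 Core Recommendation replaced setNormalizationFeature() with
the DOMConfiguration object returned by Document.getDomConfig(), so no
cast to DocumentImpl is needed there. The sketch below assumes that
standardized API (as shipped in the JDK), not the experimental methods
described in this note. RevalidateSketch and revalidate() are hypothetical
names.

```java
import org.w3c.dom.DOMConfiguration;
import org.w3c.dom.Document;

// Hypothetical helper, not part of Xerces.
class RevalidateSketch {
    static final String XSD_NS = "http://www.w3.org/2001/XMLSchema";

    static void revalidate(Document doc) {
        DOMConfiguration config = doc.getDomConfig();

        // Ask for validation against XML Schema grammars.
        config.setParameter("schema-type", XSD_NS);

        // Standardized counterpart of setNormalizationFeature("validate", true).
        config.setParameter("validate", Boolean.TRUE);

        // Walk the tree and (re)validate it.
        doc.normalizeDocument();
    }
}
```

Validation problems are reported to the error handler registered on the
same configuration, if there is one.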

How to specify grammar for a document
-------------------------------------
(1) The documentElement must have an xsi:schemaLocation or
xsi:noNamespaceSchemaLocation attribute that specifies the schema
location(s).
(2) The documentURI [3] must be set. The locations of the schema
documents are resolved relative to documentURI.
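
For a tree built or modified in memory, both requirements can be met with
ordinary DOM calls. The sketch below assumes the standardized DOM Level 3
setDocumentURI() method and a schema with no target namespace. The
SchemaHintSketch class, the pointAtSchema() helper, and its parameters are
hypothetical.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper, not part of Xerces.
class SchemaHintSketch {
    static final String XSI_NS = "http://www.w3.org/2001/XMLSchema-instance";
    static final String XMLNS_NS = "http://www.w3.org/2000/xmlns/";

    // schemaFile may be relative; it is resolved against baseUri.
    static void pointAtSchema(Document doc, String baseUri, String schemaFile) {
        Element root = doc.getDocumentElement();

        // Declare the xsi prefix, then add the schema location hint.
        root.setAttributeNS(XMLNS_NS, "xmlns:xsi", XSI_NS);
        root.setAttributeNS(XSI_NS, "xsi:noNamespaceSchemaLocation", schemaFile);

        // Base against which the schema location is resolved.
        doc.setDocumentURI(baseUri);
    }
}
```

For example, pointAtSchema(doc, "file:///tmp/docs/order.xml", "order.xsd")
asks the validator to look for file:///tmp/docs/order.xsd.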

How to register an error handler
--------------------------------
Use the DOM L3 setErrorHandler() [4] method to attach an error handler to
the Document.
You need to cast to DocumentImpl to call this method.
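
A sketch of what such a handler might look like. It assumes the
standardized DOM Level 3 API, where the handler is registered through the
"error-handler" configuration parameter rather than the experimental
setErrorHandler() method described above. CollectingHandler and attach()
are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

import org.w3c.dom.DOMError;
import org.w3c.dom.DOMErrorHandler;
import org.w3c.dom.Document;

// Hypothetical handler that remembers every reported problem.
class CollectingHandler implements DOMErrorHandler {
    final List<String> messages = new ArrayList<>();

    @Override
    public boolean handleError(DOMError error) {
        messages.add(error.getSeverity() + ": " + error.getMessage());
        // Returning true asks the implementation to keep going if it can.
        return true;
    }

    // Registers a new handler on the document's configuration.
    static CollectingHandler attach(Document doc) {
        CollectingHandler handler = new CollectingHandler();
        doc.getDomConfig().setParameter("error-handler", handler);
        return handler;
    }
}
```

After normalizeDocument() returns, the messages list can be inspected to
see what the validator reported.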

Limitations
-----------
- Revalidation of the DOM tree against a DTD grammar is not supported.
- EntityReference and CDATASection content will not be validated.
- Schema-normalized values won't be exposed via the tree after DOM
  revalidation. Element default values (those added by the XML Schema
  validator) won't be exposed via the DOM tree either.
- Attribute value normalization: the code does not normalize attribute
  values per XML 1.0 (type CDATA). That means the XML Schema validator
  may not normalize attribute values correctly.

Thanks,

[1]
http://www.w3.org/TR/2002/WD-DOM-Level-3-Core-20020409/core.html#Document3-normalizeDocument
[2] 
http://www.w3.org/TR/2002/WD-DOM-Level-3-Core-20020409/core.html#Document3-setNormalizationFeature
[3]
http://www.w3.org/TR/2002/WD-DOM-Level-3-Core-20020409/core.html#Document3-documentURI

-- 
Elena Litani / IBM Toronto
