Re: Real time Validation

neilg 8 Feb 2001 18:14:15 -0000


Hi Catharina,


To answer your second question first: yes, the original revalidation code
was only designed to work with DTDs.

Turning to your first question:  the situation is complicated by the fact
that Xerces tries to be as efficient as possible when validating a document
against any given grammar.  In simple terms, everything about the
grammar--element/attribute names, URIs, scope etc.--is stored in integer
arrays.  The "StringPool" structure maps all of these string values to
integers, and can be used to get the strings back out again.  The
structure of the grammar is built out of content models.  For efficiency,
there are different kinds of content model for different situations--mixed
content, element-only content etc.  For the most complex cases (as would be
the case in your example) Xerces actually constructs a deterministic finite
automaton representing the possible configurations of a valid document.
The glue that binds these models to the integer arrays holding the
pointers, if you will, to the relevant strings is the element declaration
index.  I should also note that attribute lists are handled a bit
differently--they're constructed as linked lists, though of course again
using integer arrays.
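
To make the string-to-integer mapping and the automaton idea a little more
concrete, here is a toy sketch in Java.  It is not the Xerces code:
ToyStringPool, ToyDfaContentModel and their methods are names made up for
illustration, and the hand-written transition table simply assumes the
sequence (shipTo, items) from the purchaseOrder example.  It only shows the
general shape of interning names as integers and running a DFA over them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class GrammarSketch {

    /** Hypothetical string pool: interns each distinct string as an int handle. */
    static final class ToyStringPool {
        private final Map<String, Integer> handleOf = new HashMap<>();
        private final List<String> strings = new ArrayList<>();

        /** Returns the existing handle for s, or assigns the next free one. */
        int add(String s) {
            Integer existing = handleOf.get(s);
            if (existing != null) {
                return existing;
            }
            int handle = strings.size();
            strings.add(s);
            handleOf.put(s, handle);
            return handle;
        }

        /** Maps a handle back to the original string. */
        String lookup(int handle) {
            return strings.get(handle);
        }
    }

    /** Hypothetical DFA content model working purely on int handles. */
    static final class ToyDfaContentModel {
        private final int[] symbols;        // column -> element-name handle
        private final int[][] transitions;  // [state][column] -> next state, -1 = no move
        private final boolean[] accepting;  // accepting[state] = content may end here

        ToyDfaContentModel(int[] symbols, int[][] transitions, boolean[] accepting) {
            this.symbols = symbols;
            this.transitions = transitions;
            this.accepting = accepting;
        }

        private int columnOf(int handle) {
            for (int c = 0; c < symbols.length; c++) {
                if (symbols[c] == handle) {
                    return c;
                }
            }
            return -1;
        }

        /**
         * Returns -1 if the children are valid, the index of the first
         * offending child otherwise, or children.length if content ended early.
         */
        int validate(int[] children) {
            int state = 0;
            for (int i = 0; i < children.length; i++) {
                int column = columnOf(children[i]);
                int next = column < 0 ? -1 : transitions[state][column];
                if (next < 0) {
                    return i;
                }
                state = next;
            }
            return accepting[state] ? -1 : children.length;
        }
    }

    /** Hand-built automaton for the sequence (shipTo, items). */
    static ToyDfaContentModel purchaseOrderModel(ToyStringPool pool) {
        int shipTo = pool.add("shipTo");
        int items = pool.add("items");
        return new ToyDfaContentModel(
                new int[] {shipTo, items},
                new int[][] {{1, -1}, {-1, 2}, {-1, -1}},
                new boolean[] {false, false, true});
    }

    public static void main(String[] args) {
        ToyStringPool pool = new ToyStringPool();
        ToyDfaContentModel model = purchaseOrderModel(pool);
        int[] good = {pool.add("shipTo"), pool.add("items")};
        int[] bad = {pool.add("items"), pool.add("shipTo")};
        System.out.println(model.validate(good)); // prints -1 (valid)
        System.out.println(model.validate(bad));  // prints 0 (first child is wrong)
    }
}
```

Note that nothing in the automaton mentions "sequence" any more: in this
sketch the compositor has been compiled away into the transition table, and
only the child element handles remain.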

The above description isn't meant to be complete; I'm just hoping to give
you a very high-level overview of how Xerces 1 represents grammars (Xerces
2 has a much more humane way of doing these things).  I hope it isn't too
intimidating...  Things really aren't that bad once you start to dig down
into the code a bit.

Hope that helps,
Neil

Neil Graham
XML Parser Development
IBM Toronto Lab
Phone:  416-448-3519, T/L 778-3519
E-mail:  [EMAIL PROTECTED]



Catharina Ibrahim <[EMAIL PROTECTED]> on 02/08/2001 08:16:51 AM

Please respond to [EMAIL PROTECTED]

To:   [EMAIL PROTECTED]
cc:
Subject:  Re: Real time Validation


hi,

> It is really the Grammar class which you want to look at, because a
> Grammar represents an internal (Xerces-j internal) way of representing
> an XML Structure. There are 2 types of Grammars: SchemaGrammar and
> DTDGrammar.
>

I would like to know how Grammar represents an XML Schema.
Is it in the form of a tree? For example, if I have
this kind of schema:

<xsd:element name="purchaseOrder" type="PurchaseOrderType"/>

<xsd:complexType name="PurchaseOrderType">
   <xsd:sequence>
      <xsd:element name="shipTo" type="USAddress"/>
      <xsd:element name="items"  type="Items"/>
   </xsd:sequence>
   <xsd:attribute name="orderDate" type="xsd:date"/>
</xsd:complexType>

will the Grammar (if it is a tree data structure) have
sequence as the parent and element as the child? Or does it
directly give purchaseOrder as the parent and shipTo as the
child?

> (XML4J 2.1.15) there used to be a Revalidating DOM parser that you
> could use to do "write validation".
>

It's only for DTDs, if I'm not mistaken. I need to work
with XML Schema...

> Some other people in the past have approached this problem and written
> their solutions, you may be able to find some of those in the mail
> archives.
>

I tried, but didn't find any. Maybe I missed them. Or
maybe you mean other archives besides xerces-j-user?

> Hope this helps,
>

it does. thx a lot :)

cath


