Re: Schema annotations

Dr. Michael Treichel Tue, 28 Jan 2003 01:16:15 -0800

Hi!

I am working on it, but progress is slow. There are many other things in my
pipeline. I hope to have something by the end of the week.


Best ... Michael

-----Original Message-----
From: Evert Hoff [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, 28 January 2003 10:09
To: [EMAIL PROTECTED]
Subject: Re: Schema annotations


Hi,

Is anyone busy working on implementing the annotations at the moment? If
so, I would appreciate a rough guess as to when it might be available in
CVS.

Thanks in advance,

Evert

On Thu, 2003-01-16 at 23:53, [EMAIL PROTECTED] wrote:
> Many thanks to Elena for the detailed message. Here I want to provide my
> view on one of the details.
>
> As Elena mentioned, currently the schema document is first parsed into a
> DOM representation (a DTM in our case) using a DOM parser, then the DTM
> tree is processed by various traversers (for various schema components):
>
> schema document --> (DOM parser) --> DTM --> (traversers) --> schema
> components
>
> If we follow the same pattern, annotations would be processed as
> follows:
>
> <annotation> elements --> (DOM parser) --> DTM element nodes -->
> (annotation traverser) --> XSAnnotation implementation
>
> Also as Elena mentioned, it might be a good idea to store annotations as
> Strings. (Storing the DOM/DTM would require more space, and might require
> storing the whole tree; while storing the string saves memory, but takes
> longer to retrieve information from it. There is always a trade-off.) So
> it'd be like:
>
> <annotation> elements --> DTM nodes --> Strings
>
> Wouldn't it be more efficient if, in the DOM parser, we directly serialize
> the parser events to a String, to save the time building (and GC'ing) the
> DTM nodes?
>
> <annotation> elements --> Strings
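A minimal sketch of the direct route described above, written against the standard JAXP SAX API rather than the Xerces-internal schema parser. The class name AnnotationCapture and its methods are hypothetical. It copies the events inside each <annotation> straight into a string, so no intermediate nodes are built. Comments and PIs are ignored here, and namespace declarations are not reported as attributes by a namespace-aware SAX parser in its default configuration, so the captured string does not carry them.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (not Xerces code): turns each <annotation> subtree
// into a String directly from parser events, with no intermediate nodes.
class AnnotationCapture extends DefaultHandler {
    private static final String XSD_NS = "http://www.w3.org/2001/XMLSchema";

    private final StringBuilder out = new StringBuilder();
    private final List<String> annotations = new ArrayList<>();
    private int depth = 0; // greater than 0 while inside an <annotation>

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        boolean opensAnnotation = XSD_NS.equals(uri) && "annotation".equals(local);
        if (depth == 0 && !opensAnnotation) {
            return; // outside any annotation: nothing to record
        }
        depth++;
        out.append('<').append(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            out.append(' ').append(atts.getQName(i)).append("=\"")
               .append(escape(atts.getValue(i))).append('"');
        }
        out.append('>');
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (depth > 0) {
            out.append(escape(new String(ch, start, length)));
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        if (depth == 0) {
            return;
        }
        out.append("</").append(qName).append('>');
        if (--depth == 0) { // the annotation is complete: keep it and reset
            annotations.add(out.toString());
            out.setLength(0);
        }
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    // Parses a schema document and returns one string per <annotation>.
    static List<String> capture(String schema) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            AnnotationCapture handler = new AnnotationCapture();
            factory.newSAXParser().parse(new InputSource(new StringReader(schema)), handler);
            return handler.annotations;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse schema", e);
        }
    }
}
```

Calling AnnotationCapture.capture(schemaText) returns the annotation strings in document order.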
>
> Of course, to go this route, more things need to be taken care of. In the
> current XSDAbstractTraverser#traverseAnnotationDecl() method, some checking
> needs to be performed:
> - The attributes on <annotation> need to be valid;
> - The names of child-elements need to be one of "appinfo" and
> "documentation";
> - The attributes on the above 2 elements need to be valid.
> But they can (easily) be checked in the DOM parser, instead of in a
> traverser.
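The three checks listed above could look roughly like the following sketch. It is a simplified version written against the standard DOM API, not the actual traverseAnnotationDecl code, and the class AnnotationChecks is hypothetical. It assumes the rules of the XML Schema specification: <annotation> takes an unqualified "id" attribute, <appinfo> and <documentation> take an unqualified "source" attribute, and attributes from a non-schema namespace (xml:lang, namespace declarations, extensions) are allowed on all three.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical, simplified validity checks for an <annotation> element.
class AnnotationChecks {
    static final String XSD_NS = "http://www.w3.org/2001/XMLSchema";

    // Returns a list of problems; an empty list means the checks passed.
    static List<String> check(Element annotation) {
        List<String> errors = new ArrayList<>();
        checkAttributes(annotation, "id", errors);
        for (Node n = annotation.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.ELEMENT_NODE) {
                continue;
            }
            Element child = (Element) n;
            String local = child.getLocalName();
            boolean known = XSD_NS.equals(child.getNamespaceURI())
                    && ("appinfo".equals(local) || "documentation".equals(local));
            if (known) {
                checkAttributes(child, "source", errors);
            } else {
                errors.add("unexpected child element: " + child.getNodeName());
            }
        }
        return errors;
    }

    private static void checkAttributes(Element e, String allowed, List<String> errors) {
        NamedNodeMap atts = e.getAttributes();
        for (int i = 0; i < atts.getLength(); i++) {
            Attr a = (Attr) atts.item(i);
            String ns = a.getNamespaceURI();
            if (ns != null && !XSD_NS.equals(ns)) {
                continue; // non-schema namespace: xml:lang, xmlns:*, extensions
            }
            if (ns != null || !allowed.equals(a.getName())) {
                errors.add("invalid attribute '" + a.getName() + "' on " + e.getNodeName());
            }
        }
    }

    // Convenience for trying the checks on a string.
    static Element parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception ex) {
            throw new IllegalStateException("could not parse annotation", ex);
        }
    }
}
```

The same tests could run inside the parser's startElement callback instead of over a finished tree, which is where the code duplication mentioned in the message would come from.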
>
> Advantages:
> - Performs better.
> - Solves problems regarding storing texts, comments and PI's.
>
> Disadvantages:
> - Might be (a little bit) harder to implement.
> - Needs some code duplication (to check the validity of
> sub-elements/attributes mentioned above).
>
> So IMO, as a start, we can still use DTM (ignoring texts / comments /
> PI's). But in the long run, we might need to consider skipping it, and
> generate the strings directly.
>
> Thanks,
> Sandy Gao
> Software Developer, IBM Canada
> (1-905) 413-3255
> [EMAIL PROTECTED]
>
>
>
>
> Elena Litani/Toronto/IBM@IBMCA
> 01/16/2003 03:49 PM
> Please respond to xerces-j-dev
>
> To: [EMAIL PROTECTED]
> cc:
> Subject: Re: Schema annotations
>
>
>
>
>
> Hi Michael,
>
> "Dr. Michael Treichel" wrote:
> > We would like help implementing access to schema annotations. How do we
> > get into business?
>
> In general, we thought that the annotation should be stored as an XML
> string in each XML Schema component (e.g.
> org.apache.xerces.impl.xs.XSElementDecl), following the XML syntax
> defined in the XML Schema specification [1].
> The writeAnnotation() method will invoke a SAX or DOM parser that parses
> this annotation string, either issuing SAX events or producing a DOM tree.
>
> Here is some additional information:
> 1. Traversal of annotations from the schema is done in
> traverseAnnotationDecl of XSDAbstractTraverser
> (org.apache.xerces.impl.xs.traversers). The schemas are parsed into a
> DOM-like structure (see the implementation in
> org.apache.xerces.impl.xs.opti), but traversal is implementation
> independent (see the DOMUtil class). You will need to walk the DTM tree
> (using DOMUtil) and create the annotation string.
> At this time you will also need to save all the namespace declarations
> in scope (since you need those declarations to parse the string, etc...)
> -- XSDocumentInfo has a pointer to the namespaces in scope. You need to
> get all prefix declarations and add those as namespace declarations
> (xmlns:prefix="http://...") to the xs:annotation.
> Btw, for serialization of the annotations you might look at
> XMLSerializer (in org.apache.xml.serialize) -- there is a serializeNode
> method; however, I am not sure if it will work since it assumes a DOM
> implementation.
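As an illustration of the namespace step above, here is a small sketch that adds in-scope prefix declarations to the start tag of an already serialized annotation string. The class AnnotationNamespaces and the plain Map standing in for the namespace information held by XSDocumentInfo are assumptions made for this example. It also assumes the start tag does not already declare the same prefixes.

```java
import java.util.Map;

// Hypothetical helper: adds in-scope namespace declarations to the start
// tag of a serialized annotation so the string can be parsed on its own.
class AnnotationNamespaces {

    // 'inScope' maps prefix -> namespace URI; the empty prefix stands for
    // the default namespace. Assumes the start tag does not declare these
    // prefixes already.
    static String declare(String annotation, Map<String, String> inScope) {
        // Find the end of the element name in "<xs:annotation ...>".
        int nameEnd = 1; // skip the leading '<'
        while (nameEnd < annotation.length()
                && "> /\t\r\n".indexOf(annotation.charAt(nameEnd)) < 0) {
            nameEnd++;
        }
        StringBuilder sb = new StringBuilder(annotation.substring(0, nameEnd));
        for (Map.Entry<String, String> e : inScope.entrySet()) {
            String name = e.getKey().isEmpty() ? "xmlns" : "xmlns:" + e.getKey();
            sb.append(' ').append(name).append("=\"")
              .append(escape(e.getValue())).append('"');
        }
        return sb.append(annotation.substring(nameEnd)).toString();
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace("\"", "&quot;");
    }
}
```

With a map containing the entry xs -> http://www.w3.org/2001/XMLSchema, the string "<xs:annotation>...</xs:annotation>" comes back with xmlns:xs declared on its root element.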
>
> 2. The DTM (org.apache.xerces.impl.xs.opti) implementation currently
> does not create any text nodes, meaning that any text in <documentation>
> or <appinfo> won't be available... To fix it you need to modify the
> SchemaDOMParser (org.apache.xerces.impl.xs.opti) parser to create a
> string when in the context of an annotation element. This could be done by
> having a global StringBuffer, appending the data to it in the characters()
> method and retrieving / resetting the data at start/endElement calls
> (assuming there are no comments, PIs, etc. in the annotation context).
> The string could later be stored either as a Text node or just as a string
> field on the DefaultElement node (org.apache.xerces.impl.xs.opti).
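The buffering idea in this item might be sketched as follows, using a plain SAX DefaultHandler in place of SchemaDOMParser. The ElementNode class is a hypothetical stand-in for DefaultElement with an extra string field, and a StringBuilder is used where the message says StringBuffer. Text is collected only inside an annotation, and comments and PIs are assumed absent.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: one shared buffer, filled in characters() and
// emptied at every startElement/endElement while inside an annotation.
class TextBufferingHandler extends DefaultHandler {
    // Stand-in for an element node with an extra string field for its text.
    static final class ElementNode {
        final String name;
        final List<ElementNode> children = new ArrayList<>();
        String text = "";
        ElementNode(String name) { this.name = name; }
    }

    private static final String XSD_NS = "http://www.w3.org/2001/XMLSchema";
    private final StringBuilder buffer = new StringBuilder();
    private final Deque<ElementNode> open = new ArrayDeque<>();
    private int annotationDepth = 0; // greater than 0 inside an <annotation>
    ElementNode root;

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        flush(); // text seen so far belongs to the element that is still open
        ElementNode node = new ElementNode(qName);
        if (open.isEmpty()) {
            root = node;
        } else {
            open.peek().children.add(node);
        }
        open.push(node);
        if (annotationDepth > 0 || (XSD_NS.equals(uri) && "annotation".equals(local))) {
            annotationDepth++;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (annotationDepth > 0) {
            buffer.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        flush();
        open.pop();
        if (annotationDepth > 0) {
            annotationDepth--;
        }
    }

    // Moves buffered text onto the innermost open element and resets the buffer.
    private void flush() {
        if (buffer.length() > 0) {
            open.peek().text += buffer;
            buffer.setLength(0);
        }
    }

    static ElementNode parse(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            TextBufferingHandler handler = new TextBufferingHandler();
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
            return handler.root;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }
}
```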
>
> 3. In writeAnnotation you will need to get a parser. For performance
> reasons you might need to implement some kind of synchronized parser
> pool to avoid creating a new parser for each writeAnnotation method
> invocation. As far as I know, the Xerces grammar implementation is
> thread-safe (grammars could be added to different grammar pools used by
> parsers in different threads). A parser instance, however, should not
> be used by several threads at once, so you must synchronize access to
> your parser.
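One possible shape for such a pool, as a sketch only: the class ParserPool and its borrow/release methods are hypothetical, and it hands out JAXP SAXParser instances rather than Xerces-internal parsers. Only the pool bookkeeping is synchronized. Each borrowed parser is used by a single thread until it is released.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

// Hypothetical synchronized pool that reuses parsers across calls.
class ParserPool {
    private final Deque<SAXParser> idle = new ArrayDeque<>();
    private final SAXParserFactory factory = SAXParserFactory.newInstance();

    ParserPool() {
        factory.setNamespaceAware(true);
    }

    // Hands out an idle parser, or creates a new one if none is available.
    synchronized SAXParser borrow() {
        SAXParser parser = idle.poll();
        if (parser != null) {
            return parser;
        }
        try {
            return factory.newSAXParser();
        } catch (Exception e) {
            throw new IllegalStateException("could not create parser", e);
        }
    }

    // Returns a parser to the pool once the caller is done with it.
    synchronized void release(SAXParser parser) {
        idle.push(parser);
    }
}
```

A caller would borrow a parser, parse the annotation string inside a try block, and release the parser in the matching finally block.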
>
> 4. writeAnnotation allows you to serialize to DOM or SAX. The SAX
> implementation is trivial -- you only need to set the correct
> DocumentHandler on the SAXParser. For the DOM, you will need to make
> some changes to be able to pass a Document object to the DOMParser. As it
> works now, AbstractDOMParser (org.apache.xerces.parsers) queries the
> document-class property in the startDocument(..) method and creates an
> appropriate document instance. You will need to modify this method to
> allow using a pre-set Document class -- the class could be set either via
> a new public method or via a new INTERNAL Xerces property.
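To make the two targets concrete, here is a rough sketch using only standard JAXP classes. The class AnnotationWriter and its method names are hypothetical. The SAX variant passes a SAX2 DefaultHandler to the parser. The DOM variant does not modify AbstractDOMParser as proposed above; it swaps in a different technique, parsing into a temporary document and copying the result into the caller's document with importNode.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch of replaying a stored annotation string to SAX or DOM.
class AnnotationWriter {

    // SAX target: parse the string and let the caller's handler see the events.
    static void writeToSax(String annotation, DefaultHandler handler) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.newSAXParser().parse(new InputSource(new StringReader(annotation)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not replay annotation", e);
        }
    }

    // DOM target: parse into a temporary document, then copy the tree into
    // the caller's document under 'parent' (a Document or any node in one).
    static Node writeToDom(String annotation, Node parent) {
        try {
            Document parsed = namespaceAwareFactory().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(annotation)));
            Document owner = parent instanceof Document
                    ? (Document) parent : parent.getOwnerDocument();
            Node copy = owner.importNode(parsed.getDocumentElement(), true);
            return parent.appendChild(copy);
        } catch (Exception e) {
            throw new IllegalStateException("could not build annotation tree", e);
        }
    }

    // Convenience: an empty document to write into.
    static Document newDocument() {
        try {
            return namespaceAwareFactory().newDocumentBuilder().newDocument();
        } catch (Exception e) {
            throw new IllegalStateException("could not create document", e);
        }
    }

    private static DocumentBuilderFactory namespaceAwareFactory() {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory;
    }
}
```

The extra copy made by importNode costs time and memory, which is the overhead the pre-set Document approach in the message would avoid.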
>
> Let us know if you have more questions.
>
> [1] http://www.w3.org/TR/xmlschema-1/#declare-annotation
>
>
> Hope it helps,
> --
> Elena Litani / IBM Toronto
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]