Re: [Xerces2] Design Decisions (LONG)

Ted Leung Fri, 06 Jul 2001 20:53:12 -0700

----- Original Message -----
From: "Andy Clark" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Wednesday, July 04, 2001 8:23 PM
Subject: Re: [Xerces2] Design Decisions (LONG)


> Elena Litani wrote:
> > Hmmm.. We do plan to implement DOM L3 specs and although we did not
> > discuss implementing Abstract Schema [AS] module (that allows editing
> > and querying grammars), the APIs are there. Do we want to provide
> > another way to access grammar??
>
> Are you confident that the DOM L3 ASLS is able to adequately
> model a grammar? I'm not. I'm sure it's sufficient (for some
> definition of "sufficient") for DTDs and XML Schema grammars
> but I'm not convinced it will be able to model other types
> of schema grammar languages.
>
> > I am not sure how DTDContentModelHandler is useful for editor writers.
>
> It's not supposed to be. And creating one that is useful
> causes the exponential interface explosion that we've seen
> from my various proposals. I'm not against providing
> something useful for editor writers but that is not the
> primary purpose of the interface nor should it be.
>
> > In the world of XML Schemas, you better also provide support for editing
> > XML Schemas as well as DTDs, and DTDContentModelHandler provides way too
> > little information.
>
> That's why "DTD" is in the name of interface: because it's
> for DTDs, *not* XML Schemas.
>
> > My point is, I believe if we ever want to provide in Xerces APIs for
> > editing/querying AS (DTD, XML Schemas, other grammars), we better do it
> > by implementing standard DOM APIs.
>
> We "could" do it that way. Whether we "better" do it that
> way is open to debate.

It doesn't follow that because DOM (or SAX) has a particular API for
something, we have to implement that API directly in the parser. As soon
as we get into editing grammars, we go back to the problem of how they are
represented inside the parser. That representation has to serve two
purposes if we start to talk about editing:

1) provide a high-performance way to validate a document against a grammar
2) provide a representation rich enough to be presented to an editor and
   then pushed back into the validator internals.

I think that these are probably 2 different representations.
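
A rough sketch of what I mean by two representations, in Java. None of
these class names exist in Xerces; EditableElementDecl and
CompiledElementDecl are hypothetical, and a real content model is far
richer than a flat list of child names. The point is only that the
editor-facing form is mutable and ordered, while the validator-facing form
is an immutable snapshot built from it on demand.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical editor-facing model: mutable, keeps declaration order.
final class EditableElementDecl {
    final String name;
    final List<String> allowedChildren = new ArrayList<>();

    EditableElementDecl(String name) {
        this.name = name;
    }

    // "Push back into the validator internals": build the fast form on demand.
    CompiledElementDecl compile() {
        return new CompiledElementDecl(name, allowedChildren);
    }
}

// Hypothetical validator-facing form: immutable, tuned for lookups only.
final class CompiledElementDecl {
    final String name;
    private final Set<String> allowed;

    CompiledElementDecl(String name, List<String> allowedChildren) {
        this.name = name;
        // Copy so that later edits to the editable model do not leak in.
        this.allowed = Collections.unmodifiableSet(new HashSet<>(allowedChildren));
    }

    boolean allowsChild(String childName) {
        return allowed.contains(childName);
    }
}

final class TwoRepresentationsDemo {
    public static void main(String[] args) {
        EditableElementDecl book = new EditableElementDecl("book");
        book.allowedChildren.add("title");
        CompiledElementDecl fast = book.compile();

        book.allowedChildren.add("author"); // edit made after compiling

        // The old snapshot does not see the edit; a recompiled one does.
        System.out.println(fast.allowsChild("author"));
        System.out.println(book.compile().allowsChild("author"));
    }
}
```

The cost of this split is an explicit compile step after every edit, which
is the price of keeping the validation path free of editing concerns.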

> > > [2] Pass Base URI to startEntity Callback
> > > [...]
> >
> > Can baseSystemId be null?
>
> Yes, that would signify that there is no known base system
> identifier. However, I'm still trying to think of a situation
> where the systemId is *relative* but no base systemId is
> provided. Can anybody think of a use case?
>
> The point of passing this information is so that we don't
> munge data. The systemId parameter would be the exact
> identifier text that was specified in the declaration.
> In this way, the callee has access to all of the
> information. Without it we have two options: pass in only
> the systemId or pass in the expanded systemId. In the
> first case the callee doesn't have enough information to
> do something meaningful and in the second case the callee
> is not actually seeing what was defined in the decl.
>
> Yes, that information would be passed via the DTD handler
> (add that to my list of things that need to pass the base
> systemId) but then you're creating a dependency that you
> need to use both interfaces and keep state. The parser's
> already keeping the state, why not pass it along if we
> have already done the work, IMHO.
>
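
The trade-off described above can be sketched in Java as follows. This is
an illustration under assumptions, not the actual callback signature: the
EntityStartListener interface and the SystemIds helper are hypothetical
names, and expansion is done with java.net.URI, which rejects identifiers
containing illegal characters such as spaces. The callee receives the
literal systemId from the declaration plus the base, and expands it only if
it wants to.

```java
import java.net.URI;

// Hypothetical callback shape; not the actual Xerces2 interface.
interface EntityStartListener {
    // systemId is the literal text from the declaration;
    // baseSystemId may be null when no base is known.
    void startEntity(String name, String systemId, String baseSystemId);
}

final class SystemIds {
    // Expands systemId against baseSystemId.
    // Returns the literal systemId unchanged when no base is known.
    static String expand(String systemId, String baseSystemId) {
        if (baseSystemId == null) {
            return systemId;
        }
        return URI.create(baseSystemId).resolve(systemId).toString();
    }
}

final class BaseSystemIdDemo {
    public static void main(String[] args) {
        // The callee sees both the declared and the expanded identifier.
        EntityStartListener listener = (name, systemId, baseSystemId) ->
            System.out.println(name + ": declared=" + systemId
                + ", expanded=" + SystemIds.expand(systemId, baseSystemId));

        listener.startEntity("ch1", "chapters/ch1.xml",
                             "http://example.org/docs/book.xml");
        listener.startEntity("ch2", "ch2.xml", null);
    }
}
```

Passing only the expanded form would make the first argument above
unrecoverable; passing only the literal form would leave the callee unable
to locate the entity.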
> > > [4] Remove Dependence on SAX
> >
> > If we are removing dependence on SAX, should we consider using DOM L3
> > APIs for error handling, entity resolver, input source?
>
> You're proposing that we remove one dependency to add
> another similar dependency. I'm trying to remove the
> dependency altogether, not just switch it to one that
> looks "better", even if that were the case.
>
> > Andy, you've mentioned that SAX raises a lot of problems, could you
> > expand on that? I am not sure why we need to define again the same
> > interfaces that exist in both DOM and SAX.
>
> SAX and DOM are lossy. We're trying to avoid loss of
> document information by passing as much as possible
> (within reason). The SAX approach is better suited to
> building a parser because DOM cannot be used once the
> document size increases beyond size N, where N is
> some arbitrary large number. But even SAX can't be
> used directly because 1) it loses needed information
> in its handler interfaces; 2) the pipeline it creates
> is read-only (e.g. Attributes); etc.
>
> My proposal actually is more radical than what it
> first appears. I'm also suggesting that we take this
> opportunity to break backwards compatibility in order
> to do the right thing. Take, for example, parsing: the
> base XML parser class relies on SAX for setting
> features/properties, entity resolution, error handling,
> and parsing. Why would a DOM parser need such a
> dependency? No real reason.
>
> So what is the "right" thing? Depends on who you
> ask. If you ask the SAX people, they might say that you
> should use the parser factory helper classes to create
> the parser and then use the parser and handler
> interfaces SAX defines. The DOM people might say that the
> right way is through the load/save component of DOM L3.
> Or they both might say to use JAXP to create the parser
> and parse documents. In none of these cases would I
> consider instantiating the parser class directly "the
> right thing to do".
>
> By defining our own fully-independent, internal API we
> remove all dependencies (because SAX and DOM are not the
> *only* way to output the document data from parsing an
> XML document) and improve the ability to layer parser
> components and configurations without carrying along
> unneeded dependencies. People using the SAX parser
> would do things the "SAX way"; people using the DOM
> parser would do things the "DOM way"; and none of them
> would have to do some things one way and do other
> things other ways just because we have built in a
> cross-dependency.
>
> Only people going *beyond* what is possible with the
> "standard" interfaces and taking advantage of the
> native APIs would ever know that there is a set of
> interfaces that look awfully similar to ones already
> present in SAX and DOM. Once the wrapper is built on
> top of the XNI stuff to expose it in the SAX way or
> the DOM way, people won't even know it's there.
>
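
The layering described above might look roughly like this in Java. It is a
minimal sketch under assumptions: NativeDocumentHandler and SaxAdapter are
hypothetical names, not the XNI interfaces, and namespaces and attribute
types are ignored (every attribute is reported as CDATA with an empty
namespace URI). The native interface mentions no SAX types; only the
adapter depends on org.xml.sax.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical native interface: no SAX or DOM types in its signatures,
// and the attribute map is mutable so pipeline stages can modify it.
interface NativeDocumentHandler {
    void startElement(String name, Map<String, String> attributes);
    void endElement(String name);
}

// Thin wrapper that exposes native events the "SAX way".
final class SaxAdapter implements NativeDocumentHandler {
    private final ContentHandler sax;

    SaxAdapter(ContentHandler sax) {
        this.sax = sax;
    }

    @Override
    public void startElement(String name, Map<String, String> attributes) {
        AttributesImpl atts = new AttributesImpl();
        for (Map.Entry<String, String> e : attributes.entrySet()) {
            atts.addAttribute("", e.getKey(), e.getKey(), "CDATA", e.getValue());
        }
        try {
            sax.startElement("", name, name, atts);
        } catch (SAXException ex) {
            throw new RuntimeException(ex);
        }
    }

    @Override
    public void endElement(String name) {
        try {
            sax.endElement("", name, name);
        } catch (SAXException ex) {
            throw new RuntimeException(ex);
        }
    }
}

final class SaxAdapterDemo {
    public static void main(String[] args) {
        // An ordinary SAX handler; it never sees the native interface.
        ContentHandler printer = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName,
                                     Attributes atts) {
                System.out.println("start " + qName + " id=" + atts.getValue("id"));
            }
        };
        NativeDocumentHandler pipelineEnd = new SaxAdapter(printer);
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("id", "b1");
        pipelineEnd.startElement("book", attrs);
        pipelineEnd.endElement("book");
    }
}
```

A DOM-building adapter would implement the same native interface, so a
parser configuration that only builds DOM trees would not need the SAX
classes at all.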
> Can you tell I'm starting to feel pretty strongly about
> this point? ;)
>
> --
> Andy Clark * IBM, TRL - Japan * [EMAIL PROTECTED]
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>

