Re: grammar caching requirements

Andy Clark Wed, 06 Feb 2002 19:07:22 -0800

[EMAIL PROTECTED] wrote:
> Now that we've got a framework for grammar caching, it seems like a
> pretty good time to start a discussion of what our default
> implementation should look like.  We'll also need to make sure that all


Since we're starting the discussion about implementing the
generic grammar caching mechanism, I went back to the grammar
caching framework that we argued so much over. And I would 
like to quickly revisit this subject before continuing.

The fundamental question is how grammars are to be identified
by validation components and the grammar pool. Currently, we
have a grammar type, such as "DTD" or "XSD", and we have a
description defined by the XMLGrammarDescription interface.
Do we need both?

An application may preload the cache with DTD and/or XML
Schema grammars. The parser, when it parses the document,
then requests a grammar by specifying a grammar description.
For a DTD, this would be the rootElementName, publicId, and
systemId specified in the DOCTYPE declaration, for example.

So I think we *should* keep this information linked: grammars
and grammar descriptions. Perhaps something like the following:

  public interface Grammar {
    public XMLGrammarDescription getGrammarDescription();
  }

  public interface XMLGrammarDescription
    extends XMLResourceIdentifier {
    public String getGrammarType();
  }

The XMLGrammarPool interface would stay the same.

I suggest this change for two reasons: 1) having methods to 
identify the grammar type on *both* the grammar and grammar 
description interfaces seems superfluous; and 2) I think
that it would be simpler in the end to keep the grammar
description information with the grammar.

If a grammar doesn't keep a copy of its associated grammar
description, then each grammar pool instance needs to have
all of the logic to determine whether a requested grammar
description properly identifies a registered grammar.
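
To make the point concrete, here is a rough sketch of a pool
that stays generic because each grammar carries its own
description. Everything in it is an assumption for illustration:
the Sketch* and Dtd* names are stand-ins, not the actual
interfaces, and the matches() method and its publicId-then-
systemId rule are just one way the comparison could be done.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the interfaces proposed above (not the real API).
interface SketchDescription {
    String getGrammarType();
    // Assumed helper: does this description identify the requested one?
    boolean matches(SketchDescription requested);
}

interface SketchGrammar {
    SketchDescription getGrammarDescription();
}

// Description built from the values in a DOCTYPE declaration.
final class DtdDescription implements SketchDescription {
    private final String rootElementName;
    private final String publicId;
    private final String systemId;

    DtdDescription(String rootElementName, String publicId, String systemId) {
        this.rootElementName = rootElementName;
        this.publicId = publicId;
        this.systemId = systemId;
    }

    String getRootElementName() { return rootElementName; }

    public String getGrammarType() { return "DTD"; }

    public boolean matches(SketchDescription requested) {
        if (!(requested instanceof DtdDescription)) {
            return false;
        }
        DtdDescription other = (DtdDescription) requested;
        // Assumed rule: compare public ids when both are present,
        // otherwise fall back to the system id.
        if (publicId != null && other.publicId != null) {
            return publicId.equals(other.publicId);
        }
        return systemId != null && systemId.equals(other.systemId);
    }
}

final class DtdGrammar implements SketchGrammar {
    private final SketchDescription description;

    DtdGrammar(SketchDescription description) { this.description = description; }

    public SketchDescription getGrammarDescription() { return description; }
}

// The pool holds no DTD- or schema-specific logic; it only asks
// each grammar's own description whether it matches the request.
final class SketchPool {
    private final List<SketchGrammar> grammars = new ArrayList<>();

    void cache(SketchGrammar grammar) { grammars.add(grammar); }

    SketchGrammar retrieve(SketchDescription requested) {
        for (SketchGrammar grammar : grammars) {
            if (grammar.getGrammarDescription().matches(requested)) {
                return grammar;
            }
        }
        return null; // nothing registered under a matching description
    }
}
```

With this shape, adding a new kind of grammar means writing a
new description class; the pool itself does not change.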

Does this make sense?

> 3.  It should encompass both XML Schemas and DTDs;

And more...

> 4.  It should permit grammars to be preparsed or cached as they are
>     encountered while validating instance documents;
> 5.  It should permit the application to "lock" the cache, that is,
>     prevent any more grammars from being added.

And we need to allow a DTD grammar to a) be used when the
document contains no DOCTYPE declaration and b) override the
grammar specified in the DOCTYPE declaration.
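
One possible reading of those two cases, sketched as a selection
rule. This is only an illustration under my own assumptions: the
DtdSelector class, its method names, keying the cache by system
id, and using plain strings in place of grammar objects are all
made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical selector; grammars are plain strings to keep the sketch short.
final class DtdSelector {
    private final Map<String, String> cachedBySystemId = new HashMap<>();
    private String fallbackGrammar; // case a): document has no DOCTYPE
    private String overrideGrammar; // case b): wins over the DOCTYPE

    void cache(String systemId, String grammar) {
        cachedBySystemId.put(systemId, grammar);
    }

    void setFallback(String grammar) { fallbackGrammar = grammar; }

    void setOverride(String grammar) { overrideGrammar = grammar; }

    // doctypeSystemId is null when the document has no DOCTYPE declaration.
    // Returns null when no grammar applies.
    String select(String doctypeSystemId) {
        if (overrideGrammar != null) {
            return overrideGrammar; // b) ignore what the DOCTYPE names
        }
        if (doctypeSystemId == null) {
            return fallbackGrammar; // a) no DOCTYPE in the document
        }
        return cachedBySystemId.get(doctypeSystemId);
    }
}
```

Whether the override should be global or keyed to particular
documents is left open here.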

> I suspect it's not realistic to permit DTD preparsing yet.  But I do

It's not the DTD preparsing that I think is difficult. I'm
more concerned about how to handle internal subsets with a
cached grammar.

> I looked over the CachingParserPool that we currently have (has anyone
> used it?) but I can't see any way of adapting this sort of approach to

The caching parser pool was never used (and could not be
used) because we didn't have a grammar caching facility.
It was written as a placeholder, with an eye toward a
future in which we could support grammar caching.

-- 
Andy Clark * [EMAIL PROTECTED]
