I find that the pre-caching of the grammars takes a very long time (but then
I do have something like 200 grammars totalling about 300k of text).  The
first real parse also takes a long time, but is about 5 times faster than
the grammar caching.  Subsequent parses are about 5 times faster again.  If
I clear the cache and discard the XMLReader and then start over then the
pre-caching is faster than the first time, and there's not a lot of
difference between the first and subsequent parses.  I put this down to the
HotSpot compiler personally, and it doesn't particularly worry me since,
once everything is warm, it runs like shit off a shovel.

To summarise (timings in milliseconds):
First schema cache:  1760
First parse:          301
Second parse:          50
Third parse:           40
Fourth parse:          60

Discard everything and start again (but in the same VM):
First schema cache:   371
First parse:           70
Second parse:          40
Third parse:           40
Fourth parse:          50

When running it in a profiler, the profiler records no difference between
first and subsequent parses.
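
For anyone wanting to reproduce this kind of measurement, here is a minimal
sketch. It is not the code behind the figures above. It assumes the JDK's
built-in JAXP validation API (javax.xml.validation) in place of the Xerces
grammar pool, and a hypothetical one-element schema standing in for the real
grammars. WarmUpTiming, XSD, DOC and timeRun are names invented for the
example. It times one schema compile followed by repeated validating parses,
then does the whole thing again in the same VM.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class WarmUpTiming {
    // Hypothetical stand-in schema and document (assumptions, not the real grammars).
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='CSLSystem'/>"
      + "</xs:schema>";
    static final String DOC = "<CSLSystem/>";

    static long millisSince(long startNanos) {
        return (System.nanoTime() - startNanos) / 1_000_000;
    }

    /**
     * Compiles the schema once, then parses DOC the given number of times
     * with a single reused XMLReader. Returns elapsed milliseconds:
     * index 0 is the schema compile, indexes 1..parses are the parses.
     */
    static long[] timeRun(int parses) throws Exception {
        long[] ms = new long[parses + 1];

        long t0 = System.nanoTime();
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = sf.newSchema(new StreamSource(new StringReader(XSD)));
        ms[0] = millisSince(t0);

        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        spf.setSchema(schema); // validate against the precompiled schema
        XMLReader reader = spf.newSAXParser().getXMLReader();
        // Rethrow validation errors so an invalid document fails loudly.
        reader.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) throws SAXException {
                throw e;
            }
        });

        for (int i = 1; i <= parses; i++) {
            long t = System.nanoTime();
            reader.parse(new InputSource(new StringReader(DOC)));
            ms[i] = millisSince(t);
        }
        return ms;
    }

    public static void main(String[] args) throws Exception {
        // Round 2 discards everything and starts again, but in the same VM.
        for (int round = 1; round <= 2; round++) {
            long[] ms = timeRun(4);
            System.out.println("Round " + round + ": schema compile " + ms[0] + " ms");
            for (int i = 1; i < ms.length; i++) {
                System.out.println("  parse " + i + ": " + ms[i] + " ms");
            }
        }
    }
}
```

With a schema this small the absolute numbers will be tiny, so only the
relative difference between the two rounds is of interest.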

Cheers,
Huw


-----Original Message-----
From: Justin Robinson [mailto:[EMAIL PROTECTED]
Sent: 11 January 2005 22:22
To: [EMAIL PROTECTED]
Subject: Re: Problem caching grammars


But Huw,

Do you find that, if you parse the same document several times, the first
parse is always significantly slower than the rest, or is it just me?!

Justin

----- Original Message -----
From: "Huw Roberts" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Tuesday, January 11, 2005 11:05 AM
Subject: RE: Problem caching grammars


> Out of interest, I've also been trying to cache grammars recently (posted on
> 2004/12/23).
> In the end I used the following code:
>
> System.setProperty("org.apache.xerces.xni.parser.XMLParserConfiguration",
>         "org.apache.xerces.parsers.XMLGrammarCachingConfiguration");
> XMLGrammarCachingConfiguration xmlComponentManager =
>         new XMLGrammarCachingConfiguration();
> // this will clear the grammar pool of the (about to be created) parser.
> xmlComponentManager.clearGrammarPool();
> try {
>     mXmlReader =
>             XMLReaderFactory.createXMLReader(XmlElementUtil.SAX_PARSER_CLASS);
>     mXmlReader.setFeature(XmlElementUtil.SCHEMA_FEATURE, true);
>     mXmlReader.setFeature("http://xml.org/sax/features/validation", true);
>     mXmlReader.setProperty(XmlElementUtil.SCHEMA_LOCATION,
>             uriSchemaLocation);
>     EntityResolver resolver =
>             new XmlElementUtil.JavaResourceEntityResolver(mClassLoader);
>     mXmlReader.setEntityResolver(resolver);
>     InputSource inputSource =
>             new InputSource(new StringReader("<CSLSystem/>"));
>     mXmlReader.setErrorHandler(new DefaultHandler());
>     mXmlReader.parse(inputSource);
> }
> catch (Exception ex) {
>     sCategory.error("Unexpected exception thrown during schema load", ex);
> }
> After this code has been executed the cache is populated.
>
> Note that I only have a single no-namespace-schema so it doesn't matter that
> my initial parse has minimal content.  Also I could probably discard the
> XMLReader (mXmlReader) rather than re-using it; I don't think it will make
> any difference to the cache.  Finally note that if you throw an exception
> during the first parse, then the cache (GrammarPool) may not be populated.
>
>
> -----Original Message-----
> From: Justin Robinson [mailto:[EMAIL PROTECTED]
> Sent: 08 January 2005 18:50
> To: [EMAIL PROTECTED]
> Subject: Re: Problem caching grammars
>
>
> A quick breakpoint shows the validator attempts to retrieve only the grammar
> that I've put in. So the caching doesn't seem to be the problem. I'll look
> again and see if I can find a hold-up in configurePipeline().
>
>
> ----- Original Message -----
> From: "Bob Foster" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Saturday, January 08, 2005 5:54 PM
> Subject: Re: Problem caching grammars
>
>
> > Justin Robinson wrote:
> >
> > > Thanks Chris,
> > >
> > > I'll take a look there. Do you know if Xerces actually tries to access
> > > namespaces/schemas from off the net?
> >
> > Not a namespace; a namespace is just a string and can't be 'accessed'.
> > But it will get a schema off the net if it has a location on the net.
> > Wouldn't work otherwise.
> >
> > > I'm wondering if, initially, my pool is
> > > missing a grammar and on the first parse it's actually caching the
> > > grammar I've missed.
> >
> > Sounds likely. You can set a breakpoint in retrieveGrammar() to see if
> > you get called for a schema you aren't expecting.
> >
> > Bob Foster
> >
> > >
> > > Justin
> > >
> > > ----- Original Message -----
> > > From: "Christopher Ebert" <[EMAIL PROTECTED]>
> > > To: <[EMAIL PROTECTED]>
> > > Sent: Thursday, January 06, 2005 8:41 PM
> > > Subject: RE: Problem caching grammars
> > >
> > >
> > >
> > > Just a guess here, but since it's the first parse, I'd suspect
> > > configurePipeline() or some other initialization step. You might look at
> > > the configuration mechanism and see if there's a way to streamline it;
> > > this would probably mean looking at the way the configuration is
> > > determined and setting parameters for the first pathway that's checked
> > > so it gets the configuration you want.
> > >
> > > HTH
> > >
> > > Chris
> > >
> > > -----Original Message-----
> > > From: Justin Robinson [mailto:[EMAIL PROTECTED]
> > > Sent: Thursday, January 06, 2005 12:34
> > > To: [EMAIL PROTECTED]
> > > Subject: Re: Problem caching grammars
> > >
> > > Any thoughts on this?
> > >
> > > ----- Original Message -----
> > > From: "Justin Robinson" <[EMAIL PROTECTED]>
> > > To: <[EMAIL PROTECTED]>
> > > Sent: Sunday, January 02, 2005 2:34 PM
> > > Subject: Problem caching grammars
> > >
> > >
> > >
> > >>Hi there....
> > >>
> > > >>I have managed to preparse my XML Schema and have put it in a grammar
> > > >>pool, according to the active caching approach described at
> > > >>http://xml.apache.org/xerces2-j/faq-grammars.html#faq-1
> > > >>
> > > >>I'm expecting the time taken to set up my SAX parser to increase, which
> > > >>it does, so that's fine.
> > > >>I'm also expecting the time taken on the first parse call to decrease.
> > > >>
> > > >>This is where my problem is. The first parse still takes an average of
> > > >>about 7 times longer than subsequent parses.
> > > >>
> > > >>What else must I do to bring down the time taken for the first parse??
> > > >>
> > > >>I tried to look at the source code, but I'm having trouble locating
> > > >>where the time might be taken up (still learning how to debug). The path
> > > >>of execution goes through these classes:
> > >>
> > >>1. AbstractSAXParser
> > >>2. XMLParser
> > >>3. XML11Configuration (methods parse() and configurePipeline())
> > >>
> > >>Any ideas?
> > >>
> > >>Here's how I set up the grammar pool:
> > >>
> > > >>   private XMLGrammarPool getGrammarPool() throws IOException {
> > > >>      // create grammar preparser
> > > >>      XMLGrammarPreparser preparser = new XMLGrammarPreparser();
> > > >>
> > > >>      // register a specialized default pre-parser
> > > >>      preparser.registerPreparser(XMLGrammarDescription.XML_SCHEMA, null);
> > > >>
> > > >>      // create grammar pool
> > > >>      XMLGrammarPool grammarPool = new XMLGrammarPoolImpl();
> > > >>
> > > >>      // set the grammar pool on the grammar preparser
> > > >>      // so that all the compiled grammars are automatically
> > > >>      // placed to the grammar pool
> > > >>      preparser.setProperty(
> > > >>          "http://apache.org/xml/properties/internal/grammar-pool",
> > > >>          grammarPool);
> > > >>
> > > >>      // parse grammar(s). They are automatically added to the pool,
> > > >>      // because of the above property that has been set.
> > > >>      preparser.setFeature("http://xml.org/sax/features/namespaces", true);
> > > >>      preparser.setFeature("http://xml.org/sax/features/validation", true);
> > > >>      preparser.setFeature(
> > > >>          "http://apache.org/xml/features/validation/schema", true);
> > > >>      preparser.setFeature(
> > > >>          "http://apache.org/xml/features/validation/schema-full-checking",
> > > >>          true);
> > > >>
> > > >>      Grammar g = preparser.preparseGrammar(
> > > >>          XMLGrammarDescription.XML_SCHEMA,
> > > >>          new XMLInputSource(null,
> > > >>              "c:\\jdev\\workspace\\UncleJustWiki\\xmlschemas\\DraftRevisionSchema.xsd",
> > > >>              null));
> > > >>
> > > >>      // lock grammar pool. Don't add any more grammars
> > > >>      grammarPool.lockPool();
> > > >>      return grammarPool;
> > > >>   }
> > >>
> > >>Regards,
> > >>Justin
> > >>
> > >>
> > >>
> > >>---------------------------------------------------------------------
> > >>To unsubscribe, e-mail: [EMAIL PROTECTED]
> > >>For additional commands, e-mail: [EMAIL PROTECTED]
> > >>
> > >>
> > >
> > >
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > > For additional commands, e-mail: [EMAIL PROTECTED]
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > > For additional commands, e-mail: [EMAIL PROTECTED]
> > >
> > >
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > > For additional commands, e-mail: [EMAIL PROTECTED]
> > >
> > >
> > >
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
> >
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to