http://nagoya.apache.org/bugzilla/show_bug.cgi?id=13897

Reuse parser and cache XML schema in XalanC

           Summary: Reuse parser and cache XML schema in XalanC
           Product: XalanC
           Version: 1.4.x
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: Enhancement
          Priority: Other
         Component: XalanC
        AssignedTo: [EMAIL PROTECTED]
        ReportedBy: [EMAIL PROTECTED]

It would be nice to expose in XalanC the latest Xerces features that cache an analyzed schema so it can be reused across multiple parsing/validation passes. This would also mean reusing the same parser instance for multiple XSLT transformations in XalanC, and even within specific XSLT functions such as document().

Here is a short version of an email exchange from the mailing list that describes the issue in more detail and provides a "workaround".

-----Original Message-----
From: David N Bertoni/Cambridge/IBM [mailto:david_n_bertoni@;us.ibm.com]
Sent: Tuesday, October 22, 2002 4:35 PM
To: [EMAIL PROTECTED]
Subject: RE: Schema validation performance

Hi Thomas,

You can use Xerces to parse a document without switching to the internal interfaces.
Here's some pseudo-code, which I haven't tested, but which should give you an idea of what you need to do:

    void
    parse(
            const InputSource&          theInputSource,
            XalanCompiledStylesheet*    theStylesheet,
            const XSLTResultTarget&     theResultTarget)
    {
        SAX2XMLReader* const    theReader = XMLReaderFactory::createXMLReader();

        XalanTransformer    theTransformer;

        // The builder receives the parser's SAX events and builds the source tree.
        XalanDocumentBuilder* const     theBuilder = theTransformer.createDocumentBuilder();

        theReader->setContentHandler(theBuilder->getContentHandler());
        theReader->setLexicalHandler(theBuilder->getLexicalHandler());
        theReader->setDTDHandler(theBuilder->getDTDHandler());

        const XalanDOMString    reuseGrammar("http://apache.org/xml/features/validation/reuse-grammar");
        const XalanDOMString    namespacePrefixes("http://xml.org/sax/features/namespace-prefixes");

        theReader->setFeature(reuseGrammar.c_str(), true);
        theReader->setFeature(namespacePrefixes.c_str(), true);

        theReader->parse(theInputSource);

        delete theReader;

        // Transform the tree that was built during the parse.
        theTransformer.transform(*theBuilder, theStylesheet, theResultTarget);
    }

Of course, since I'm not really re-using the parser, it doesn't use the cached grammar, but it gives you an idea of how you can do this. The only drawback is that documents brought into the transformation through the document() function will not use this parser instance, and so will not use the cached grammar.

Dave

-----Original Message-----
From: Thomas Cherel

Until it gets added to Xalan, is there any way I can use the Xerces interface directly? For example, today I can provide Xalan with an already parsed document (a DOM tree). Can I use the new Xerces API to generate such a DOM tree (reusing the schema/grammar for the validation done at that time), and then pass it to Xalan (which will take care of the XSLT processing only)?
Thomas

-----Original Message-----
From: David N Bertoni/Cambridge/IBM [mailto:david_n_bertoni@;us.ibm.com]
Sent: Tuesday, October 22, 2002 1:19 PM
To: [EMAIL PROTECTED]
Subject: Re: Schema validation performance

Hi Thomas,

With the latest Xerces, you can prime a parser instance with a particular schema, then have it re-use that schema over and over again. You can also have it re-use a grammar for every document it parses. However, these interfaces are new and still experimental, so I don't have much experience using them.

We don't expose lots of the Xerces parser interfaces because it gets very burdensome to do so. However, this one is probably worth doing, so you might want to enter a Bugzilla request for an enhancement.

Dave

-----Original Message-----
From: Thomas Cherel

When processing an XML document (applying a style sheet), I can turn on validation of the XML document against its schema. Is there any way (or maybe this is already done under the covers) to cache the XML schema for validation of other XML documents? What I mean is that if I process a bunch of XML documents in sequence, and all of them use the same XML schema, it would be nice if the schema were downloaded and analyzed only once instead of once for each document.
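
The behaviour requested above (analyze a schema once, then reuse it for every document that references it) can be pictured with a small standard-library sketch. It is only an illustration of the caching idea, not Xerces or Xalan code: AnalyzedSchema, SchemaCache and validateBatch are hypothetical names, and the loader is a placeholder for the real download-and-analyze step.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an analyzed schema grammar (not a Xerces type).
struct AnalyzedSchema {
    std::string location;
};

// Hypothetical cache that analyzes each schema location at most once.
class SchemaCache {
public:
    using Loader = std::function<AnalyzedSchema(const std::string&)>;

    explicit SchemaCache(Loader loader) : loader_(std::move(loader)) {}

    // Returns the cached schema; the loader runs only on the first request
    // for a given location.
    std::shared_ptr<const AnalyzedSchema> get(const std::string& location) {
        auto it = cache_.find(location);
        if (it == cache_.end()) {
            ++loads_;
            it = cache_.emplace(
                location,
                std::make_shared<const AnalyzedSchema>(loader_(location))).first;
        }
        return it->second;
    }

    // Number of times the loader has actually been invoked.
    std::size_t loadCount() const { return loads_; }

private:
    Loader loader_;
    std::map<std::string, std::shared_ptr<const AnalyzedSchema>> cache_;
    std::size_t loads_ = 0;
};

// Simulates processing one document per entry, where each entry names the
// schema that document uses. Returns how many schema loads were needed.
std::size_t validateBatch(const std::vector<std::string>& schemaLocations) {
    // Placeholder loader: a real one would fetch and analyze the schema.
    SchemaCache cache([](const std::string& location) {
        return AnalyzedSchema{location};
    });
    for (const std::string& location : schemaLocations) {
        std::shared_ptr<const AnalyzedSchema> schema = cache.get(location);
        (void)schema;  // A real pipeline would validate the document here.
    }
    return cache.loadCount();
}
```

Calling validateBatch with three documents that all name the same schema location runs the loader once. The sketch says nothing about how Xerces stores its grammars internally; it only shows the "load once, reuse many times" shape of the request.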
