For point #1 below, I dug around a bit more, and the CachedXpathApi code is relatively easy to understand. I can mimic the behaviour of CachedXpathApi myself, which lets me precompile my XPath expressions. I can also create an XPathContext(false) and call reset() on it, just like the code does. This works around my problem in #1.
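To illustrate the compile-once, evaluate-many idea, here is a minimal sketch. It deliberately uses the standard JAXP javax.xml.xpath API rather than Xalan's low-level org.apache.xpath.XPath and XPathContext classes, so it is an analogue of the approach, not the Xalan code itself. The class name, the sample expression, and the sample XML are all hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: one instance per thread, since neither
// DocumentBuilder nor XPathExpression is thread-safe.
final class PrecompiledXPathSketch {
    private final DocumentBuilder builder;
    private final XPathExpression itemNames;

    PrecompiledXPathSketch() throws Exception {
        // The factory lookups happen once here, not once per document.
        builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Hypothetical expression, compiled a single time.
        itemNames = xpath.compile("/order/item/@name");
    }

    // Parses a new document and runs the precompiled expression against it.
    int countItemNames(String xml) throws Exception {
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        NodeList hits = (NodeList) itemNames.evaluate(doc, XPathConstants.NODESET);
        return hits.getLength();
    }

    public static void main(String[] args) throws Exception {
        PrecompiledXPathSketch sketch = new PrecompiledXPathSketch();
        // Two different source documents, same compiled expression.
        System.out.println(sketch.countItemNames(
            "<order><item name='a'/><item name='b'/></order>"));
        System.out.println(sketch.countItemNames(
            "<order><item name='c'/></order>"));
    }
}
```

The point of the design is that the expensive setup lives in the constructor and is paid once per thread, while each new document only costs a parse and an evaluate.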
The relevant code from CachedXpathApi:

    CachedXpathApi()
    {
        xpathSupport = new XPathContext(false);
    }

    public XObject eval(Node contextNode, String str, Node namespaceNode)
        throws TransformerException
    {
        // Resolve prefixes against the document element when a Document is passed in.
        PrefixResolverDefault prefixResolver = new PrefixResolverDefault(
            namespaceNode.getNodeType() == Node.DOCUMENT_NODE
                ? ((Document) namespaceNode).getDocumentElement()
                : namespaceNode);
        // The expression string is compiled on every call.
        XPath xpath = new XPath(str, null, prefixResolver, XPath.SELECT, null);
        int ctxtNode = xpathSupport.getDTMHandleFromNode(contextNode);
        return xpath.execute(xpathSupport, ctxtNode, prefixResolver);
    }

--- On Wed, 9/30/09, Sandeep Takhar <sandeep_tak...@yahoo.com> wrote:

> From: Sandeep Takhar <sandeep_tak...@yahoo.com>
> Subject: Getting optimal performance for CachedXpathApi searches and DOM parsing
> To: xalan-j-users@xml.apache.org
> Received: Wednesday, September 30, 2009, 9:24 PM
>
> Hi.
>
> I am new to the list and sorry if I get something wrong. I searched
> the existing archives, but could not find the answer I am looking
> for. Thanks for any help you have.
>
> We are seeing some small performance issues in our production
> environment. I am busy seeing if I can reproduce them. My questions
> are regarding these issues and possibly working around them.
>
> 1. CachedXpathApi is working well for us. I do see that the
> constructor is expensive. I read in some IBM documentation that if
> you don't reuse parsers, the constructor is expensive because it
> tries to find a factory via jar file location (looping through the
> classpath). In fact I see this happening when I look at the thread
> dump. I see all the documentation and comments that indicate that if
> you change a source document, then you need a new CachedXpathApi. I
> see two things that I am not sure may work.
>
> a) cachedXpathApi.getXpathContext().reset(). Can I call this instead
> of creating a new CachedXpathApi and then using the cachedXpathApi
> object to search a new document?
> I will still make the cachedXpathApi only execute in a single
> thread, but I don't want to call the constructor.
> b) There is also a constructor which takes another
> CachedXpathApi(CachedXPathApi cachedXpathApi). Is this of some use?
> c) Can I use compiled xpath expressions and achieve a similar effect
> as the CachedXpathApi somehow, where I don't have to construct the
> CachedXpathApi()? Does someone have a quick sample?
>
> 2. We are using the apache DOM parser. Basically the DOMParser is
> the default apache one (version 2.7.0). Currently we create the
> Builder from the factory and then call parse.
> DocumentBuilderFactory.newDocumentBuilder()....I cannot remember
> the exact syntax. We can certainly avoid creating the
> DocumentBuilder each time, and I am suggesting this as a fix. What I
> see is some minor performance issues in production in the parse
> method. I don't have the execution thread handy, but it is always
> spending time in the DocumentScanner$DTDDispatcher.dispatch method
> (may not be exact syntax). Looks like all methods for the
> declaration handler. I have to double check that, but the methods
> are named like the SAX declaration handler, and we are not using
> SAX. We have an entityId as the second line in the xml file that
> points to a DTD that only has a single line in the DTD file. I've
> tried to understand the code, but haven't been able to figure it
> out.
>
> a) Will I stop seeing DTDDispatcher time taken in the threads if I
> remove the entity line in the source xml (no reference to a DTD)?
> b) Is what I'm seeing completely normal?
> c) Can I set a property that will stop the DTDDispatcher.dispatch
> method from being called?
>
> Here are the methods I see that are spending time and things like
> string.intern().
> Maybe it is completely normal, but it doesn't explain why it is
> slower in production...except that production load may be causing
> it...and that would be fine.
>
>     public void elementDecl(String name, String model)
>         throws SAXException;
>     public void attributeDecl(String elementName, String attributeName,
>         String type, String mode, String defaultValue)
>         throws SAXException;
>     public void internalEntityDecl(String name, String value)
>         throws SAXException;
>     public void externalEntityDecl(String name, String publicID,
>         String systemID) throws SAXException;
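For the parsing questions in #2, here is a minimal sketch of two things that could be tried together: keeping one DocumentBuilder per thread instead of creating one per parse, and asking a non-validating Xerces parser not to fetch the external DTD via the load-external-dtd feature. The class name and the sample document are hypothetical. The DOCTYPE declaration itself is still scanned, so this may reduce the time spent in the DTD code rather than remove it.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Xerces or Xalan.
final class ReusableDomParserSketch {
    // Xerces feature: when false, a non-validating parser skips the external DTD.
    private static final String LOAD_EXTERNAL_DTD =
        "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    // DocumentBuilder is not thread-safe, so each thread keeps its own.
    private static final ThreadLocal<DocumentBuilder> BUILDER =
        ThreadLocal.withInitial(ReusableDomParserSketch::newBuilder);

    private static DocumentBuilder newBuilder() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(false);
            factory.setFeature(LOAD_EXTERNAL_DTD, false);
            return factory.newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("Parser does not support the feature", e);
        }
    }

    // Reuses the calling thread's builder; the factory lookup is not repeated.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = BUILDER.get();
        builder.reset(); // back to the configuration the factory gave it
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // The DTD below does not exist; it is never fetched.
        String xml = "<?xml version=\"1.0\"?>"
            + "<!DOCTYPE root SYSTEM \"file:///no/such/file.dtd\">"
            + "<root/>";
        System.out.println(parse(xml).getDocumentElement().getNodeName());
    }
}
```

Whether this helps in production would need measuring against a thread dump taken under the same load.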