Hello Enric,

Depending on the needs of your application, an alternative approach may be to combine a StAX-compliant (JSR-173) parser like Woodstox (see http://woodstox.codehaus.org/) with XPath:
1. Parse the tree with Woodstox.
2. For small subtrees, build a (J)DOM tree.
3. Use XPath to select nodes from the subtree.

We used this approach to semantically compare two BMEcat messages (see http://www.bmecat.org). It has been tested with the comparison of two 900 MByte files. A rough sketch of the pattern follows at the end of this message.

Regards,
Andreas

> > On Fri, 13 Jan 2006, Karr, David wrote:
> > How many nodes is your XPath expression returning? If you're
> > essentially returning the vast majority of the nodes in the file, then
> > you're probably using the wrong tool for this job. That is, don't use
> > XPath for this.
>
> The curious thing is that my XPath expression doesn't return any node. So I guess
> XPath needs to build a DOM tree to do its job, even if it returns nothing (could
> someone confirm this?)
>
> Thanks to those who provided pointers to other tools. After googling a bit, I found
> a commercial product (http://www.eweek.com/article2/0,1759,1780265,00.asp) which
> they say can process a 1 TB file by doing streaming instead of DOM. I also found
> 'eXist', an open source native XML database (http://exist.sourceforge.net/), which
> they say can work with documents of up to 2^63 nodes.
>
> Regards,
> -Enric
>
> > -----Original Message-----
> > From: Enric Jaen [mailto:[EMAIL PROTECTED]
> >
> > > If you think there are bugs in the impl of XPath, please open a bug
> > > report at https://issues.apache.org/jira/secure/Dashboard.jspa
> > > and attach a valid test case that can demonstrate the problem.
> >
> > I don't think it is a bug. I rather think that XPath builds a
> > DOM tree when it returns a NodeSet (please correct me if I am
> > wrong). When the file is about 6 MB the JVM runs out of memory.
> > Two workarounds I have tried are to increase the heap and to split
> > the XML file. Both push the evaluation limit farther out, but there
> > is still a limit.
> >
> > I think an XPath implementation for SAX would be possible,
> > such as sequential XPath, but I haven't gone deeply into this.
> >
> > -Enric
> >
> > On Fri, 13 Jan 2006, Enric Jaen wrote:
> >
> > > Hello, I get an OutOfMemoryError when I evaluate an XPath expression
> > > on a large XML file.
> > >
> > > I am using this code:
> > >
> > > XPathFactory factory = XPathFactory.newInstance();
> > > XPath xpath = factory.newXPath();
> > > InputSource entities_is = new InputSource("file.xml");
> > > XPathExpression xpathExpr = xpath.compile(expr);
> > > return (NodeList) xpathExpr.evaluate(entities_is,
> > >     XPathConstants.NODESET);
> > >
> > > I am not an expert in XPath development, so I'd appreciate it if
> > > someone could explain why this error is happening.
> > > Is this because XPath uses DOM internally? If so, is there an
> > > XPath implementation for SAX? Is there any other
> > > explanation/solution?
> > >
> > > Thanks in advance for your help.
> > > -Enric
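P.S. Here is a minimal, untested sketch of steps 1-3, in case it helps. It assumes
JAXP 1.4 / Java 6, where javax.xml.transform.stax.StAXSource is available; on an
older JDK you would have to build the subtree DOM by hand from the StAX events.
XMLInputFactory.newInstance() picks up Woodstox automatically when it is on the
classpath, via the standard JSR-173 factory lookup. The element name "record" and
the XPath expression are placeholders; substitute whatever your document actually
contains.

import java.io.FileInputStream;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stax.StAXSource;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.NodeList;

public class StreamingXPathSketch {

    public static void main(String[] args) throws Exception {
        // 1. Stream the big file with StAX (Woodstox is picked up via the
        //    JSR-173 factory lookup if it is on the classpath).
        XMLInputFactory staxFactory = XMLInputFactory.newInstance();
        XMLStreamReader reader =
                staxFactory.createXMLStreamReader(new FileInputStream("file.xml"));

        // Identity transformer, used to copy one element into a small DOM.
        Transformer copier = TransformerFactory.newInstance().newTransformer();

        // Compile the XPath once; it only needs to address nodes inside the
        // subtree we cut out, not the whole document.
        XPath xpath = XPathFactory.newInstance().newXPath();
        XPathExpression expr = xpath.compile(".//item[price > 100]"); // placeholder

        while (reader.hasNext()) {
            if (reader.getEventType() == XMLStreamConstants.START_ELEMENT
                    && "record".equals(reader.getLocalName())) {      // placeholder name
                // 2. Copy just this element and its children into a DOM;
                //    the reader ends up positioned after the element.
                DOMResult subtree = new DOMResult();
                copier.transform(new StAXSource(reader), subtree);

                // 3. Ordinary XPath on the small in-memory fragment.
                NodeList hits = (NodeList) expr.evaluate(subtree.getNode(),
                                                         XPathConstants.NODESET);
                System.out.println("matches in this record: " + hits.getLength());
            } else {
                reader.next();
            }
        }
        reader.close();
    }
}

The memory footprint is then bounded by the largest single subtree rather than by
the size of the whole file.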