On Mon, 16 Jan 2006 [EMAIL PROTECTED] wrote:
> Hello Enric,
>
> depending on the needs of your application, an alternative approach
> may be to combine a StAX-compliant (JSR-173) parser like Woodstox (see
> http://woodstox.codehaus.org/) with XPath:
>
> 1. Parse the tree with Woodstox.
>
> 2. For small subtrees, build a (J)DOM tree.
>
> 3. Use XPath to select nodes from the subtree.

One thing is not clear to me: how can I tell Xalan's XPath to use the
tree generated by Woodstox? I thought that Xalan's XPath creates its own
DOM tree from the InputSource in order to select the nodes.

Thanks for any clarification on this.

-Enric

> We used this approach to semantically compare two BMEcat messages
> (see http://www.bmecat.org). The comparison has been tested with
> two 900 MByte files.
>
> Regards,
>
> Andreas
>
> >>On Fri, 13 Jan 2006, Karr, David wrote:
>
> >> How many nodes is your XPath expression returning? If you're
> >> essentially returning the vast majority of the nodes in the file, then
> >> you're probably using the wrong tool for this job. That is, don't use
> >> XPath for this.
>
> >The curious thing is that my XPath expression doesn't return any node. So
> >I guess XPath needs to build a DOM tree to do its job, even if it returns
> >nothing (could someone confirm this?)
>
> >Thanks to those who provided pointers to other tools. After googling a
> >bit, I found a commercial product
> >(http://www.eweek.com/article2/0,1759,1780265,00.asp) that is said to
> >process a 1 TB file by streaming instead of building a DOM. I also found
> >'exist', an open source native XML database (http://exist.sourceforge.net/),
> >which is said to work with documents of up to 2^63 nodes.
>
> >>Regards,
>
> >>-Enric
>
> >>
> >> > -----Original Message-----
> >> > From: Enric Jaen [mailto:[EMAIL PROTECTED]
> >> >
> >> > >If you think there are bugs in the impl of XPath, please open a bug
> >> > >report at https://issues.apache.org/jira/secure/Dashboard.jspa
> >> > >and attach a valid test case that can demonstrate the problem.
> >> >
> >> > I don't think it is a bug. I rather think that XPath builds a
> >> > DOM tree when it returns a NodeSet (please correct me if I am
> >> > wrong). When the file is about 6 MB, Java runs out of memory.
> >> > Two workarounds I have tried are to increase the heap and to
> >> > split the XML file. Both push the evaluation limit further out,
> >> > but there is still a limit.
> >> >
> >> > I think an XPath implementation on top of SAX would be possible,
> >> > such as Sequential XPath, but I haven't gone deeply into this.
> >> >
> >> > -Enric
> >> >
> >> > On Fri, 13 Jan 2006, Enric Jaen wrote:
> >> >
> >> > > Hello, I got an OutOfMemory error when I evaluate an XPath
> >> > > expression on a large XML file.
> >> > >
> >> > > I am using this code:
> >> > >
> >> > > XPathFactory factory = XPathFactory.newInstance();
> >> > > XPath xpath = factory.newXPath();
> >> > > InputSource entities_is = new InputSource("file.xml");
> >> > > XPathExpression xpathExpr = xpath.compile(expr);
> >> > > return (NodeList) xpathExpr.evaluate(entities_is,
> >> > >         XPathConstants.NODESET);
> >> > >
> >> > > I am not an expert in XPath development, so I'd appreciate it if
> >> > > someone could explain why this error is happening.
> >> > > Is this because XPath uses DOM internally? If so, is there any
> >> > > implementation of XPath for SAX? Is there any other
> >> > > explanation/solution?
> >> > >
> >> > > Thanks in advance for your help.
> >> > > -Enric
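
On the question above about handing a StAX-parsed subtree to XPath:
javax.xml.xpath can evaluate an expression against an existing DOM node,
via XPathExpression.evaluate(Object item, QName returnType), so it does
not have to build its own tree from an InputSource. What follows is only
a minimal sketch of steps 1-3, under these assumptions: it uses nothing
but the JDK's StAX and JAXP classes (Woodstox, if it is on the classpath,
would normally be the parser that XMLInputFactory.newInstance() returns);
it copies each subtree into a DOM with an identity Transformer and a
StAXSource, which is one of several ways to do step 2; and the names
SubtreeXPath, select, subtreeName and expr are made up for illustration.
It is not the code Andreas describes.

```java
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stax.StAXSource;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration; not part of Woodstox, Xalan or the JDK.
class SubtreeXPath {

    /**
     * Streams the input with StAX, copies every element whose local name is
     * subtreeName into its own small DOM document, evaluates expr against that
     * document, and returns the text content of each matching node.
     */
    static List<String> select(Reader xml, String subtreeName, String expr) {
        List<String> hits = new ArrayList<>();
        try {
            // Step 1: a pull parser that never holds the whole file in memory.
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(xml);
            // An identity transformer, used here only to copy StAX events into a DOM.
            Transformer copier = TransformerFactory.newInstance().newTransformer();
            XPathExpression compiled =
                    XPathFactory.newInstance().newXPath().compile(expr);

            while (reader.hasNext()) {
                if (reader.isStartElement()
                        && subtreeName.equals(reader.getLocalName())) {
                    // Step 2: build a DOM for this one subtree only.
                    DOMResult subtree = new DOMResult();
                    copier.transform(new StAXSource(reader), subtree);

                    // Step 3: XPath runs against the small DOM document.
                    NodeList nodes = (NodeList) compiled.evaluate(
                            subtree.getNode(), XPathConstants.NODESET);
                    for (int i = 0; i < nodes.getLength(); i++) {
                        hits.add(nodes.item(i).getTextContent());
                    }
                    // The copy has already consumed the subtree, so the current
                    // event is re-checked instead of advancing again; this avoids
                    // skipping an element that starts right after the copied one.
                } else {
                    reader.next();
                }
            }
            reader.close();
        } catch (Exception e) {
            throw new IllegalStateException("streaming XPath evaluation failed", e);
        }
        return hits;
    }
}
```

For a file of <item> elements, a call such as
SubtreeXPath.select(new FileReader("file.xml"), "item", "/item/name")
would keep only one <item> subtree in memory at a time. The expression
is evaluated relative to each subtree's own document, so it cannot look
at ancestors or siblings outside that subtree.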