DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT <http://nagoya.apache.org/bugzilla/show_bug.cgi?id=897>. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE.
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=897

Memory leak reading large XML-files with SAX parser

------- Additional Comments From [EMAIL PROTECTED] 2003-11-10 04:01 -------

This is still a problem in the latest version of Xerces (2.5). The number of java.io.StringReader instances keeps increasing until the JVM runs out of memory; they are never garbage collected.

Here's some sample RDF/XML:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE rdf:RDF [
  <!ENTITY math "http://kowari.org/math#">
  <!ENTITY owl "http://www.w3.org/2002/07/owl#">
  <!ENTITY rdf "http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <!ENTITY rdfs "http://www.w3.org/2000/01/rdf-schema#">
  <!ENTITY xsd "http://www.w3.org/2001/XMLSchema#">
]>
<rdf:RDF xmlns:math ="&math;"
         xmlns:owl ="&owl;"
         xmlns:rdf ="&rdf;"
         xmlns:rdfs ="&rdfs;">
  <rdf:Description>
    <owl:sameIndividualAs rdf:datatype="&xsd;integer">14</owl:sameIndividualAs>
    <rdfs:label xml:lang="en">fourteen</rdfs:label>
    <math:roman>XIV</math:roman>
    <math:square rdf:datatype="&xsd;integer">196</math:square>
    <math:primeFactorization>
      <rdf:Bag>
        <rdf:li rdf:datatype="&xsd;integer">2</rdf:li>
        <rdf:li rdf:datatype="&xsd;integer">7</rdf:li>
      </rdf:Bag>
    </math:primeFactorization>
  </rdf:Description>
  <rdf:Description>
    <owl:sameIndividualAs rdf:datatype="&xsd;integer">15</owl:sameIndividualAs>
    <rdfs:label xml:lang="en">fifteen</rdfs:label>
    <math:roman>XV</math:roman>
    <math:square rdf:datatype="&xsd;integer">225</math:square>
    <math:primeFactorization>
      <rdf:Bag>
        <rdf:li rdf:datatype="&xsd;integer">3</rdf:li>
        <rdf:li rdf:datatype="&xsd;integer">5</rdf:li>
      </rdf:Bag>
    </math:primeFactorization>
    <rdf:type rdf:resource="&math;TriangularNumber"/>
  </rdf:Description>
</rdf:RDF>

When all of the entity references are inlined, only 4 objects are ever allocated.
For example:

<rdf:Description>
  <owl:sameIndividualAs rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">14</owl:sameIndividualAs>
  <rdfs:label xml:lang="en">fourteen</rdfs:label>
  <math:roman>XIV</math:roman>
  <math:square rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">196</math:square>
  <math:primeFactorization>
    <rdf:Bag>
      <rdf:li rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">2</rdf:li>
      <rdf:li rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">7</rdf:li>
    </rdf:Bag>
  </math:primeFactorization>
</rdf:Description>

Here's a report from Optimize It after parsing a large amount of this XML:

2509 instances of java.io.StringReader allocated.

  100.0%  org.apache.xerces.impl.XMLEntityManager.startEntity()
  100.0%  org.apache.xerces.impl.XMLScanner.scanAttributeValue()
  100.0%  org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanAttribute()
  100.0%  org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanStartElement()
  99.84%  org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch()
  99.84%  org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument()
  99.84%  org.apache.xerces.parsers.DTDConfiguration.parse()
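
For anyone who wants to poke at this, the sketch below is one way the two cases could be compared. It is not the code that produced the report above. The class and method names (EntityAttrRepro, buildDocument, countExpanded) are made up for illustration, the document is a stripped-down stand-in for the RDF/XML sample (one internal entity, one attribute per element), and it goes through the standard JAXP SAXParserFactory, so it exercises whatever parser the JVM is configured with rather than Xerces 2.5 specifically. It only checks that both variants parse to the same attribute values; the StringReader instance counts would still have to be observed with a profiler attached.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical harness: all names here are illustrative, not from the bug report.
public class EntityAttrRepro {
    static final String XSD = "http://www.w3.org/2001/XMLSchema#";

    // Builds a document with n elements whose "datatype" attribute either
    // references the internal entity &xsd; or carries the inlined URI.
    static String buildDocument(int n, boolean useEntity) {
        String datatype = (useEntity ? "&xsd;" : XSD) + "integer";
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\"?>\n");
        sb.append("<!DOCTYPE root [ <!ENTITY xsd \"").append(XSD).append("\"> ]>\n");
        sb.append("<root>\n");
        for (int i = 0; i < n; i++) {
            sb.append("  <li datatype=\"").append(datatype).append("\">")
              .append(i).append("</li>\n");
        }
        sb.append("</root>\n");
        return sb.toString();
    }

    // Parses the document with SAX and counts the attributes whose value
    // arrived fully expanded, i.e. the entity reference was resolved.
    static int countExpanded(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        final int[] count = {0};
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if ((XSD + "integer").equals(atts.getValue("datatype"))) {
                    count[0]++;
                }
            }
        });
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        // Kept small so the run stays under the entity expansion limits
        // that recent JDKs enforce by default.
        int n = 500;
        System.out.println("with entities: " + countExpanded(buildDocument(n, true)));
        System.out.println("inlined:       " + countExpanded(buildDocument(n, false)));
    }
}
```

To compare against the profiler numbers above, the same two documents would need to be fed to the Xerces 2.5 SAXParser with a much larger n.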
