>>>>> "Jason" == Jason E Stewart <[EMAIL PROTECTED]> writes:
Jason> [EMAIL PROTECTED] (Jason E. Stewart) writes:
>> Brian Faull <[EMAIL PROTECTED]> writes:
>>
>> Depends on what you are doing - if it is DOM related, then yes, you
>> must tell the parser to release the memory, otherwise it grows. From
>> the API docs:
>>
>> void AbstractDOMParser::resetDocumentPool()
>>
>> Reset the documents vector pool and release all the associated memory
>> back to the system.
>>
>> When parsing a document using a DOM parser, all memory allocated for
>> a DOM tree is associated to the DOM document.
>>
>> If you do multiple parse using the same DOM parser instance, then
>> multiple DOM documents will be generated and saved in a vector
>> pool. All these documents (and thus all the allocated memory) won't
>> be deleted until the parser instance is destroyed.
>>
>> If you don't need these DOM documents anymore and don't want to
>> destroy the DOM parser instance at this moment, then you can call
>> this method to reset the document vector pool and release all the
>> allocated memory back to the system.
Jason> As a note - if you create a new parser each time, this should
Jason> *not* cause a leak:
Jason>   while (1) {
Jason>     my $parser = XML::Xerces::XercesDOMParser->new();
Jason>     $parser->parse(XML::Xerces::MemBufInputSource->new('<test/>'));
Jason>   }
Jason> if it does, that's a *big* problem, and I'd like to know about
Jason> it.

Hi Jason,

I've been meaning to compose an e-mail about this for a few days now,
but just haven't gotten around to it. You might not like to hear this,
but I think there is a *big* problem.

I have a script which pulls data out of a database and formats it as
XML. There is ~2.4 GB of XML once it is done. The code pulls the data
out in chunks of reasonable size (~15 KB each as XML), formats each chunk
as an individual XML document, optionally validates the document against
a schema, and then prints it out.

When the XML validation is turned on, the script gradually eats memory
until it crashes. If validation is off, the script runs fine. I have
tried everything I can think of to get the memory to be released, but
with no success.

I am *definitely* creating a new parser every time. Here's the sub that
does the validation:

sub validateXML {
    my $xml = shift ;

    # Just to make sure there is only one, $Parser is global but it's
    # not used anywhere else:
    $Parser = XML::Xerces::XMLReaderFactory::createXMLReader() ;
    $Parser->setFeature("http://xml.org/sax/features/namespaces", 1) ;
    $Parser->setFeature("http://apache.org/xml/features/validation/schema", 1) ;
    $Parser->setFeature("http://apache.org/xml/features/validation/schema-full-checking", 1) ;
    $Parser->setFeature("http://apache.org/xml/features/validation-error-as-fatal", 1) ;
    $Parser->setFeature("http://xml.org/sax/features/validation", 1) ;
    $Parser->setFeature("http://apache.org/xml/features/validation/dynamic", 0) ;

    my $errorHandler = XML::Xerces::PerlErrorHandler->new() ;
    $Parser->setErrorHandler($errorHandler) ;
    my $contentHandler = XML::Xerces::PerlContentHandler->new() ;
    $Parser->setContentHandler($contentHandler) ;

    eval {
        $Parser->parse( XML::Xerces::MemBufInputSource->new($xml) ) ;
    } ;
    undef $Parser ; # reclaim resources??

    if ($@) {
        return (0, $@) ;
    } else {
        return (1, '') ;
    }
}

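
A rough way to put numbers on the growth described above is to call the
sub in a loop and sample the process's resident set size every so often.
The following is only a sketch under a few assumptions: it reads
/proc/self/status, so it is Linux-specific; rss_kb() and sample_growth()
are hypothetical helper names, not part of XML::Xerces; and the code ref
passed in is a trivial stand-in that would be swapped for a real call to
validateXML().

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the current resident set size in kB, or undef if it cannot be
# read. Assumption: Linux, where /proc/self/status has a "VmRSS:" line.
sub rss_kb {
    open(my $fh, '<', '/proc/self/status') or return undef;
    while (my $line = <$fh>) {
        return $1 if $line =~ /^VmRSS:\s+(\d+)\s+kB/;
    }
    return undef;
}

# Call $code->() $iterations times and sample RSS after every $every
# calls. Returns a list of [iteration, rss_kb] pairs.
sub sample_growth {
    my ($code, $iterations, $every) = @_;
    my @samples;
    for my $i (1 .. $iterations) {
        $code->();
        push @samples, [$i, rss_kb()] if $i % $every == 0;
    }
    return @samples;
}

# Stand-in workload; replace the body with e.g. validateXML($doc).
my $doc = '<test/>';
my @samples = sample_growth(sub { my $copy = $doc; length $copy }, 1000, 250);

for my $s (@samples) {
    my ($i, $kb) = @$s;
    printf "%6d iterations: %s kB\n", $i, defined $kb ? $kb : 'n/a';
}
```

Running the same loop once with validation on and once with it off should
show whether the resident size keeps climbing only in the validating case,
which would make for a smaller test case than the full 2.4 GB run.
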
Thoughts, ideas...?
Steve
--
( Stephen L. Mathias, Ph.D. ( (
) Office of Biocomputing ) s m a t h i a s )
( University of New Mexico School of Medicine ( @ p o b l a n o (
) MSC08 4560 ) . h e a l t h . )
( 1 University of New Mexico ( u n m . e d u (
) Albuquerque, NM 87131-0001 ) )