>>>>> "Jason" == Jason E Stewart <[EMAIL PROTECTED]> writes:

Jason> [EMAIL PROTECTED] (Jason E. Stewart) writes:
>> Brian Faull <[EMAIL PROTECTED]> writes:
>> 
>> Depends on what you are doing - if it is DOM related, then yes, you
>> must tell the parser to release the memory, otherwise it grows. From
>> the API docs:
>> 
>> void AbstractDOMParser::resetDocumentPool()
>> 
>> Reset the documents vector pool and release all the associated memory
>> back to the system.
>> 
>> When parsing a document using a DOM parser, all memory allocated for
>> a DOM tree is associated to the DOM document.
>> 
>> If you do multiple parse using the same DOM parser instance, then
>> multiple DOM documents will be generated and saved in a vector
>> pool. All these documents (and thus all the allocated memory) won't
>> be deleted until the parser instance is destroyed.
>> 
>> If you don't need these DOM documents anymore and don't want to
>> destroy the DOM parser instance at this moment, then you can call
>> this method to reset the document vector pool and release all the
>> allocated memory back to the system.

Jason> As a note - if you create a new parser each time, this should
Jason> *not* cause a leak:

Jason> while (1) { my $parser = XML::Xerces::XercesDOMParser->new();
Jason> $parser->parse(XML::Xerces::MemBufInputSource->new('<test/>')); }

Jason> if it does, that's a *big* problem, and I'd like to know about
Jason> it.

Hi Jason,

I've been meaning to compose an e-mail about this for a few days now,
but just haven't gotten around to it.  You might not like to hear this,
but I think there is a *big* problem.

I have a script which pulls data out of a database and formats it as
XML.  There is ~2.4 GB of XML once it is done.  The code pulls the data
out in chunks of reasonable size (~15 KB each as XML), formats each chunk
as an individual XML document, optionally validates the document against
a schema, and then prints it out.

When the XML validation is turned on, the script gradually eats memory
until it crashes.  If validation is off, the script runs fine.  I have
tried everything I can think of to get the memory to be released, but
with no success.

I am *definitely* creating a new parser every time.  Here's the sub that
does the validation:

sub validateXML {
  my $xml = shift ;

  # Just to make sure there is only one, $Parser is global but it's not
  # used anywhere else:
  $Parser = XML::Xerces::XMLReaderFactory::createXMLReader() ;
  $Parser->setFeature("http://xml.org/sax/features/namespaces", 1) ;
  $Parser->setFeature("http://apache.org/xml/features/validation/schema", 1) ;
  $Parser->setFeature("http://apache.org/xml/features/validation/schema-full-checking", 1) ;
  $Parser->setFeature("http://apache.org/xml/features/validation-error-as-fatal", 1) ;
  $Parser->setFeature("http://xml.org/sax/features/validation", 1) ;
  $Parser->setFeature("http://apache.org/xml/features/validation/dynamic", 0) ;
  my $errorHandler = new XML::Xerces::PerlErrorHandler() ;
  $Parser->setErrorHandler($errorHandler) ;
  my $contentHandler = new XML::Xerces::PerlContentHandler() ;
  $Parser->setContentHandler($contentHandler) ;

  eval {
    $Parser->parse( XML::Xerces::MemBufInputSource->new($xml) ) ;
  } ;
  undef $Parser ; # reclaim resources??
  if ($@) {
    return (0, $@) ;
  } else {
    return (1, '') ;
  }
}
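
To put numbers on the growth, something like the following could be
wrapped around the sub.  This is only a rough sketch, not part of the
script described above: rss_kb() and watch_memory() are hypothetical
helper names, reading /proc/self/status is an assumption that holds only
on Linux, and the dummy closure at the bottom stands in for a real call
such as validateXML($chunk).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Resident set size of this process in kB, read from /proc (Linux only).
# Returns undef where /proc/self/status is unavailable.
sub rss_kb {
    open(my $fh, '<', '/proc/self/status') or return;
    while (my $line = <$fh>) {
        return $1 if $line =~ /^VmRSS:\s+(\d+)\s+kB/;
    }
    return;
}

# Call $code->() $iterations times and report the RSS every $every calls.
# Returns a list of [iteration, rss_kb] pairs for later inspection.
sub watch_memory {
    my ($code, $iterations, $every) = @_;
    my @samples;
    for my $i (1 .. $iterations) {
        $code->();
        next if $i % $every;          # only sample on multiples of $every
        my $rss = rss_kb();
        push @samples, [$i, $rss];
        printf "%8d iterations: %s kB\n", $i, defined $rss ? $rss : 'n/a';
    }
    return @samples;
}

# Placeholder workload; swap in sub { validateXML($chunk) } to test the parser.
my @samples = watch_memory(sub { my @scratch = (1 .. 100) }, 1000, 250);
```

If the reported figure keeps climbing with validation on and stays flat
with it off, that would narrow the growth down to the parse call rather
than the database or formatting code.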

Thoughts, ideas...?

Steve
-- 
(    Stephen L. Mathias, Ph.D.                     (                    (
 )   Office of Biocomputing                         )  s m a t h i a s   )
(    University of New Mexico School of Medicine   (   @ p o b l a n o  (
 )   MSC08 4560                                     )  . h e a l t h .   )
(    1 University of New Mexico                    (   u n m . e d u    (
 )   Albuquerque, NM 87131-0001                     )                    )
