Hi,

Further testing is yielding strange results... I've patched in the
code specified in Bug #1329 (SAX2XMLReader leaking XMLBuffers).  This
has had no noticeable effect on speed.  By the way, why have the
changes specified in this bug not made their way into the release
code?

I've also changed my program to re-create the SAX parser every 100
files or so and it's still slowing down.
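
For reference, a minimal sketch of the kind of loop described above:
time each parse and rebuild the parser every N files.  ParserHandle,
makeParser and timeParses are hypothetical names used only for
illustration; none of them are Xerces API, and the real parser calls
would go inside the parse callback.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for whatever object owns the real SAX2 parser.
// `parse` is expected to parse one file; nothing here is Xerces API.
struct ParserHandle {
    std::function<void(const std::string&)> parse;
};

// Parses every file, re-creating the parser every `recreateEvery` files,
// and returns the wall-clock time of each parse in milliseconds.
inline std::vector<double> timeParses(
    const std::vector<std::string>& files,
    const std::function<ParserHandle()>& makeParser,
    std::size_t recreateEvery)
{
    assert(recreateEvery > 0);
    std::vector<double> millis;
    millis.reserve(files.size());

    ParserHandle parser = makeParser();
    for (std::size_t i = 0; i < files.size(); ++i) {
        if (i != 0 && i % recreateEvery == 0)
            parser = makeParser();  // drop the old parser, build a fresh one

        const auto start = std::chrono::steady_clock::now();
        parser.parse(files[i]);
        const auto stop = std::chrono::steady_clock::now();

        millis.push_back(
            std::chrono::duration<double, std::milli>(stop - start).count());
    }
    return millis;
}
```

If the per-file times keep climbing even right after a fresh parser is
built, that would suggest the growing state lives outside the parser
object itself.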

At a bit of a loss,
Jamu.

On Mon, Jun 18, 2001 at 11:27:49AM -0700, [EMAIL PROTECTED] wrote:
> Hi,
> 
> I'm using the SAX2XMLReader in an application that reuses the parser
> as it reads in several thousand XHTML files.  I'm experiencing very
> fast parsing of the first few files but as it goes the time to parse
> gets longer and longer.  The files are all roughly the same size.
> Initially files are parsed in 60-200ms each.  By the time about 800 files
> have been processed parse time is as slow as 30000ms!  I'm using a CVS
> checkout from 29 May 2001.  Some questions:
> 
> 1. I'm using an EntityResolver to use my xhtml1-transitional.dtd off
> disk.  In my SaxParser constructor I create a new LocalFileInputSource:
> 
>   const XMLCh file[] = 
>   {
>     // This says: xhtml1-transitional.dtd\0
>     chLatin_x, chLatin_h, chLatin_t, chLatin_m, chLatin_l, chDigit_1, chDash,
>     chLatin_t, chLatin_r, chLatin_a, chLatin_n, chLatin_s, chLatin_i, 
>     chLatin_t, chLatin_i, chLatin_o, chLatin_n, chLatin_a, chLatin_l,
>     chPeriod,  chLatin_d, chLatin_t, chLatin_d, chNull
>   };
> 
>   dtdBuffer = new LocalFileInputSource (file);
> 
> My entity resolver does the following:
> 
> InputSource*
> Sax2Tokenizer::resolveEntity (const XMLCh* const, // publicId
>                               const XMLCh* const) // systemId
> {
>   return dtdBuffer;
> }
> 
> After the first parse I call parser->parse (memBuffer, true) to reuse
> the data from the entity resolver.  Is this the right way to do this?
> I've been assuming that I own the dtdBuffer and am responsible for
> deleting it.  However, when I call 'delete dtdBuffer;' in my
> destructor my application segfaults.  Another observation is that my
> code leaks very slowly as the parse-cycle runs.  I've not used a
> profiler yet but feel fairly confident that I've eliminated any of my
> own leaks.  Also, files with many entities take several orders of
> magnitude longer to parse.
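
A possible explanation for the segfault, offered as an assumption to
check against the EntityResolver documentation for your checkout: if
the parser takes ownership of the InputSource returned from
resolveEntity and deletes it, then returning the same dtdBuffer
repeatedly, and deleting it again in the destructor, frees one object
more than once.  The sketch below models that contract with stand-in
types (FakeSource, FreshResolver and adoptingParse are hypothetical,
not Xerces classes): the resolver returns a new object on every call
and keeps no pointer to it.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for an input source; counts live instances so
// leaks and double deletes are easy to spot.
struct FakeSource {
    static int live;
    std::string path;
    explicit FakeSource(std::string p) : path(std::move(p)) { ++live; }
    ~FakeSource() { --live; }
};
int FakeSource::live = 0;

// Resolver that hands out a NEW source on every call and keeps no
// pointer to it, so it never has anything to delete itself.
struct FreshResolver {
    std::string dtdPath;
    FakeSource* resolveEntity() const { return new FakeSource(dtdPath); }
};

// Models a parser that adopts (and deletes) whatever the resolver returns.
inline void adoptingParse(const FreshResolver& resolver) {
    std::unique_ptr<FakeSource> adopted(resolver.resolveEntity());
    assert(adopted != nullptr);
    // ... a real parser would read the DTD from *adopted here ...
}   // the adopted source is destroyed here, by the "parser"
```

Under this model every source is freed exactly once, however many
times the resolver is called.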
> 
> 2. I noticed when using the DOMParser that there was a reset method to
> be called between every parse.  Am I missing something like this for
> the SAX2 parser?
> 
> 3. I get strange compiler messages when compiling my code that I don't
> get when compiling the sample code (on Debian Linux 2.4.3-k7, gcc
> 2.95.4):
> 
> /usr/include/util/NameIdPool.c: In method `NameIdPoolEnumerator<DTDElementDecl>::NameIdPoolEnumerator(const NameIdPoolEnumerator<DTDElementDecl> &)':
> /usr/include/validators/DTD/DTDGrammar.hpp:227:   instantiated from here
> /usr/include/util/NameIdPool.c:358: warning: base class `class XMLEnumerator<DTDElementDecl>' should be explicitly initialized in the copy constructor
> /usr/include/util/NameIdPool.c: In method `NameIdPoolEnumerator<XMLNotationDecl>::NameIdPoolEnumerator(const NameIdPoolEnumerator<XMLNotationDecl> &)':
> /usr/include/validators/DTD/DTDGrammar.hpp:233:   instantiated from here
> /usr/include/util/NameIdPool.c:358: warning: base class `class XMLEnumerator<XMLNotationDecl>' should be explicitly initialized in the copy constructor
> /usr/include/util/NameIdPool.c: In method `NameIdPoolEnumerator<DTDEntityDecl>::NameIdPoolEnumerator(const NameIdPoolEnumerator<DTDEntityDecl> &)':
> /usr/include/internal/XMLScanner.hpp:947:   instantiated from here
> /usr/include/util/NameIdPool.c:358: warning: base class `class XMLEnumerator<DTDEntityDecl>' should be explicitly initialized in the copy constructor
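
The warning above is about a user-written copy constructor in a
derived class that does not name its base class in the initializer
list.  I believe gcc only reports it when extra warnings (-W) are
enabled, which may be why the samples build quietly.  A miniature of
the pattern, using hypothetical class names (EnumeratorBase and
PoolEnumerator are not the Xerces classes), with the base initialized
explicitly as the warning asks:

```cpp
#include <cassert>

// Hypothetical miniature of the pattern in the warning: a class template
// deriving from an enumerator base, with a user-written copy constructor.
template <class T>
class EnumeratorBase {
public:
    EnumeratorBase() {}
    EnumeratorBase(const EnumeratorBase&) {}
    virtual ~EnumeratorBase() {}
    virtual bool hasMoreElements() const = 0;
};

template <class T>
class PoolEnumerator : public EnumeratorBase<T> {
public:
    PoolEnumerator(const T* items, int count)
        : fItems(items), fCount(count), fIndex(0) {}

    // Naming the base class in the initializer list is what the gcc
    // warning asks for; omitting it default-constructs the base instead.
    PoolEnumerator(const PoolEnumerator& other)
        : EnumeratorBase<T>(other),
          fItems(other.fItems), fCount(other.fCount), fIndex(other.fIndex) {}

    bool hasMoreElements() const { return fIndex < fCount; }

    const T& nextElement() {
        assert(hasMoreElements());
        return fItems[fIndex++];
    }

private:
    const T* fItems;  // not owned
    int fCount;
    int fIndex;
};
```

When the base class holds no state, as in this sketch, the two forms
behave the same, so the warning is stylistic rather than a sign of a
bug in the calling code.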
> 
> Any help or suggestions would be much appreciated.
> 
> Thanks,
> Jamu.
> 
> -- 
> Jamu Kakar (Developer)                        Expressus Design Studio, Inc.
> [EMAIL PROTECTED]                  708-1641 Lonsdale Avenue
> V: (604) 903-6994                     North Vancouver, BC, V7M 2J5
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 

-- 
Jamu Kakar (Developer)                  Expressus Design Studio, Inc.
[EMAIL PROTECTED]                    708-1641 Lonsdale Avenue
V: (604) 903-6994                       North Vancouver, BC, V7M 2J5

