OK, so I had forgotten something. The element stack *is* also used to store
information that is needed only when validating. The stack itself just
follows the nesting of elements. However, at each nesting level we keep a
list of all of the children encountered so far, and at the end of the
current element we pass that list to the validator. As of the most recent
release, more is being stored there: full QNames rather than just the pool
ids that were stored before, which has probably increased the memory used
a lot.

I believe that this information could be left out if not validating, but
that would have to be carefully checked.
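
To make the idea concrete, here is a rough sketch of a stack that records
per-level child lists only when validating. It is not the actual Xerces
code: SketchElemStack, pushElement, popElement and childCount are names I
made up for illustration, and plain strings stand in for QNames or pool ids.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an element stack; NOT the Xerces-C ElemStack API.
class SketchElemStack {
public:
    explicit SketchElemStack(bool validating) : fValidating(validating) {}

    // Start tag: record the element as a child of the current level
    // (only when validating), then open a new nesting level.
    void pushElement(const std::string& name) {
        if (fValidating && !fLevels.empty())
            fLevels.back().children.push_back(name);
        fLevels.push_back(Level{name, {}});
    }

    // End tag: close the level and hand back its child list, which a
    // validator would check against the element's content model.
    std::vector<std::string> popElement() {
        if (fLevels.empty())
            return {};
        std::vector<std::string> children = std::move(fLevels.back().children);
        fLevels.pop_back();
        return children;
    }

    // Number of currently open elements (the nesting depth).
    std::size_t depth() const { return fLevels.size(); }

    // Number of children recorded so far for the innermost open element.
    std::size_t childCount() const {
        return fLevels.empty() ? 0 : fLevels.back().children.size();
    }

private:
    struct Level {
        std::string name;                  // element at this nesting level
        std::vector<std::string> children; // filled only when validating
    };

    bool fValidating;
    std::vector<Level> fLevels;
};
```

With validation off, the depth never exceeds the deepest nesting and no
child list grows, however many direct children the root has. With
validation on, the root's list grows by one entry per direct child until
the root is closed.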

--------------------------
Dean Roddey
The Charmed Quark Controller
Charmed Quark Software
[EMAIL PROTECTED]
http://www.charmedquark.com

"If it don't have a control port, don't buy it!"


----- Original Message -----
From: "Mark Northcott" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, August 23, 2001 12:49 PM
Subject: RE: Wellformedness checking


> I have a question regarding the array of QName objects that are
> allocated for each row within the ElemStack class:  is it necessary to
> maintain a list of all these children for checking whether a document is
> well-formed or is this array of children only necessary for checking if
> a document is valid (or neither)?
>
> I have implemented a simple check within the ElemStack::addChild()
> method which checks to see that there exists at least one child in the
> array and if validation is turned off, then it deletes the last child in
> the array.  This check takes place just before the next child is added
> to the array.  As a result, only one child will be in the array at any
> given time (if validation is turned off).  Is this reasonable, or is
> there a greater need for that array of children?
>
> Mark N.
>
> -----Original Message-----
> From: Mark Northcott
> Sent: Thursday, August 23, 2001 8:28 AM
> To: [EMAIL PROTECTED]
> Subject: RE: Wellformedness checking
>
>
> I agree, after looking at the source code, I too thought that this was
> how the ElemStack was supposed to work.  I also tried turning off
> validation, but experienced the same results.  Then, using BoundsChecker
> I monitored the memory usage, and memory was continually being allocated
> within the ElemStack::addChild() method, and this wouldn't be released
> until the parsing was complete.  As an example, I parsed a 40M XML file,
> and the memory usage rose as high as 13M while parsing.  This may not
> seem too bad, but our product needs to be able to handle XML data
> sources which are much larger than 40M.
>
> All memory problems were fixed once I forced Xerces not to push elements
> onto the ElemStack; therefore, I can only assume this must be the source
> of the problem.  Whether it is a bug within Xerces that the stack is not
> popping elements off as it should, I don't know.  I would be willing to
> look further into this if I could be assured that it is indeed a bug.
>
>
> -----Original Message-----
> From: Dean Roddey [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, August 22, 2001 4:59 PM
> To: [EMAIL PROTECTED]
> Subject: Re: Wellformedness checking
>
>
> > I am also looking into this and trying to remove the use of QName in
> > ElemStack to improve performance ....  No concrete code yet, but working
> > on it.  Stay tuned.
> >
>
> Ok, I happened to notice this one and I'll come out of retirement for one
> message... The ElemStack should not have any effect in the scenario that he
> put forward. He said he had 'lots of direct children of the root', but the
> element stack is only used to check for nesting errors and such, so it never
> gets more elements than the deepest nesting of elements in the document. So
> it would never have more than a handful of elements in it in that case, even
> if there are a billion direct children of the root.
>
> So if his document is really structured like he says, then he cannot
> possibly be getting this result from the element stack, unless that code is
> broken and not popping off as it's supposed to. Now, where he could get such
> a problem is if validation is on, because some content model is going to
> have to track all of the element ids of the children of the root, in order
> to validate it at the end.
>
> But otherwise, something is not right here and some wires are crossed.
>
> --------------------------
> Dean Roddey
> The Charmed Quark Controller
> Charmed Quark Software
> [EMAIL PROTECTED]
> http://www.charmedquark.com
>
> "If it don't have a control port, don't buy it!"
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]

