I have a question regarding the array of QName objects that is
allocated for each row within the ElemStack class: is it necessary to
maintain a list of all these children in order to check that a document
is well-formed, or is this array of children needed only to check that
a document is valid (or for neither)?
I have implemented a simple check within the ElemStack::addChild()
method: if validation is turned off and the array already holds at
least one child, the last child in the array is deleted. This check
takes place just before the next child is added to the array. As a
result, at most one child will be in the array at any given time (if
validation is turned off). Is this reasonable, or is there a greater
need for that array of children?
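
To make the change concrete, here is a minimal sketch of the check
described above. It uses a simplified stand-in for the element stack,
not the actual Xerces source: SimpleElemStack, StackRow, and
topChildCount() are hypothetical names, and plain strings stand in for
QName objects.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for one row of an element stack.
// These names are illustrative only; they are not the Xerces declarations.
struct StackRow {
    std::string name;                   // element this row describes
    std::vector<std::string> children;  // names of the children seen so far
};

class SimpleElemStack {
public:
    explicit SimpleElemStack(bool validating) : validating_(validating) {}

    // Called at a start tag: open a new row for the element.
    void push(const std::string& name) {
        rows_.emplace_back();
        rows_.back().name = name;
    }

    // Called at an end tag: discard the row and its child list.
    void pop() {
        assert(!rows_.empty() && "pop needs an open element");
        rows_.pop_back();
    }

    // Records a child of the element currently on top of the stack.
    void addChild(const std::string& childName) {
        assert(!rows_.empty() && "addChild needs an open element");
        StackRow& top = rows_.back();
        // The check in question: with validation off, drop the previous
        // child before adding the next one, so the list never grows.
        if (!validating_ && !top.children.empty()) {
            top.children.pop_back();
        }
        top.children.push_back(childName);
    }

    // Number of children currently retained for the top element.
    std::size_t topChildCount() const {
        assert(!rows_.empty() && "no open element");
        return rows_.back().children.size();
    }

private:
    bool validating_;
    std::vector<StackRow> rows_;
};
```

In this sketch, a root element with many direct children retains one
child entry when validation is off, and one entry per child when
validation is on.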
Mark N.
-----Original Message-----
From: Mark Northcott
Sent: Thursday, August 23, 2001 8:28 AM
To: [EMAIL PROTECTED]
Subject: RE: Wellformedness checking
I agree. After looking at the source code, I too thought that this was
how the ElemStack was supposed to work. I also tried turning off
validation, but experienced the same results. Then, using BoundsChecker
I monitored the memory usage, and memory was continually being allocated
within the ElemStack::addChild() method, and this wouldn't be released
until the parsing was complete. As an example, I parsed a 40M XML file,
and the memory usage rose as high as 13M while parsing. This may not
seem too bad, but our product needs to be able to handle XML data
sources which are much larger than 40M.
All memory problems were fixed once I forced Xerces not to push elements
onto the ElemStack, so I can only assume this must be the source of the
problem. Whether it is a bug within Xerces that the stack is not
popping elements off as it should, I don't know. I would be willing to
look further into this if I could be assured that it is indeed a bug.
-----Original Message-----
From: Dean Roddey [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, August 22, 2001 4:59 PM
To: [EMAIL PROTECTED]
Subject: Re: Wellformedness checking
> I am also looking into this and trying to remove the use of QName in
> ElemStack to improve performance .... No concrete code yet, but
> working on it. Stay tuned.
>
Ok, I happened to notice this one and I'll come out of retirement for
one message... The ElemStack should not have any effect in the scenario
that he put forward. He said he had 'lots of direct children of the
root', but the element stack is only used to check for nesting errors
and such, so it never gets more elements than the deepest nesting of
elements in the document. So it would never have more than a handful of
elements in it in that case, even if there are a billion direct children
of the root.
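
The depth argument above can be illustrated with a small sketch. This is
not the Xerces ElemStack; NestingStack and peakDepthForFlatDocument() are
hypothetical names, and the sketch assumes one entry is pushed at each
start tag and popped at the matching end tag.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stack of currently open elements, used only to detect
// nesting errors. It also records the largest depth it ever reached.
class NestingStack {
public:
    // Start tag: the element becomes the innermost open element.
    void startElement(const std::string& name) {
        open_.push_back(name);
        maxDepth_ = std::max(maxDepth_, open_.size());
    }

    // End tag: returns false on a nesting error (no open element, or the
    // tag does not match the innermost open element).
    bool endElement(const std::string& name) {
        if (open_.empty() || open_.back() != name) {
            return false;
        }
        open_.pop_back();
        return true;
    }

    std::size_t depth() const { return open_.size(); }
    std::size_t maxDepth() const { return maxDepth_; }

private:
    std::vector<std::string> open_;
    std::size_t maxDepth_ = 0;
};

// Simulates a document whose root has `childCount` empty direct children
// and returns the peak number of entries held by the stack.
std::size_t peakDepthForFlatDocument(std::size_t childCount) {
    NestingStack stack;
    stack.startElement("root");
    for (std::size_t i = 0; i < childCount; ++i) {
        stack.startElement("item");
        bool ok = stack.endElement("item");
        assert(ok);
        (void)ok;  // keeps release builds free of unused-variable warnings
    }
    bool closed = stack.endElement("root");
    assert(closed);
    (void)closed;
    return stack.maxDepth();
}
```

Under these assumptions the peak depth for a flat document is two
entries (the root plus the one child currently open), however many
children the root has.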
So if his document is really structured like he says, then he cannot
possibly be getting this result from the element stack, unless that code
is broken and not popping off as it's supposed to. Now, where he could
get such a problem is if validation is on, because some content model is
going to have to track all of the element ids of the children of the
root, in order to validate it at the end.
But otherwise, something is not right here and some wires are crossed.
--------------------------
Dean Roddey
The Charmed Quark Controller
Charmed Quark Software
[EMAIL PROTECTED]
http://www.charmedquark.com
"If it don't have a control port, don't buy it!"
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]