John,

I think the problem here is that the node value iterator needs to be
able to restart its source iterator at any time. This is similar to
the way a step iterator needs to force its '_iterator' iterator to
be restartable. In your case SI3 will keep restarting NVI for each
node it gets from SI2. NVI then needs to do a complete restart of
SI4 for each time this happens. I have implemented a fix for this and
I will integrate it into the source tree shortly.

Morten

John Howard wrote:
> 
> Hi again all,
> 
> In this installment of the neverending NodeIterator-question series, I'll
> discuss the tantilizing details of bug
> http://nagoya.apache.org/bugzilla/show_bug.cgi?id=5152.
> 
> The path "/a/b[@name = 'foo']/c/text()" compiles to (mostly) form (the / at
> the front adds a root AbsoluteIterator - I'll ignore that at my peril):
> 
>              SI1
>             /   \
>          SI2     TCI4
>         /   \
>      SI3    TCI3
>     /   \
> TCI1   NVI   legend:
>        /     SI - StepIterator
>      SI4     TCI - TypedChildIterator
>     /   \    NVI - NodeValueIterator
> TCI2    TAI  TAI - TypedAttributeIterator
> 
> (for StepIterators / is the _source and \ is the _iterator)
> 
> Consider the following code from bug#5152:
> 
> par<xsl:value-of select="$par"/>par
> var<xsl:value-of select="$var"/>var
> 
> As reported in the bug, $par will generate output whereas $var will not.
> 
> In the $par case, BasisLibrary.stringF(Object, DOM) is passed an uncloned
> iterator tree, resets it, and uses next() to grab the first value (see code
> below):
> 
> return dom.getNodeValue(((NodeIterator)obj).reset().next());
> 
> In the $var case, a (different) uncloned iterator tree is reset, cloned and
> next() is used to grab the first value (see code below):
> 
> int i_5_ = nodeiterator_0_.reset().cloneIterator().next();
> String string = dom_6_.getNodeValue(i_5_);
> 
> It would appear that cloneIterator is causing some problems. First, let me
> relate how understand what the thing does: cloneIterator serves to make a
> throw-away copy of an iterator. All it's objects and all their states are
> copied so that the iterator can then be used (and in the course of being
> used, modified).
> 
> cloneIterator works by cloning 'this', then calling 'clone.reset()': reset()
> is called after each subtree is cloned. I think I can summarise the
> cloning/resetting of the iterator tree in bug #5152 as objects are cloned
> (@) and then each clone is reset (#). This stuff happens in the following
> order: SI1@ SI2@ SI3@ TCI1@ TCI1# NVI@ SI4@ TCI2@ TCI2# TAI@ TAI# SI4# NVI#
> SI3# TCI3@ TCI3# SI2# TCI4@ TCI4@ SI1#.
> 
> Notice that after TAI# we are unwinding from the deepest part of the
> callstack. Notice also that as we unwind from SI4# we then execute NVI#.
> >From looking at the code, we see that NodeValueIterator.cloneIterator() is
> (mostly) implemented as (_source is object SI4):
> 
> NodeValueIterator clone = (NodeValueIterator)super.clone();
> clone._source = _source.cloneIterator();
> return clone.reset();
> 
> and that even-though _source.cloneIterator performs a reset, since
> NodeValueIterator.reset is implemented as:
> 
> _source.reset();
> return resetPosition();
> 
> clone.reset() will call SI4's reset function again. Thus it appears that
> every cloned object reset()s itself and then if it's not the root of the
> iterator tree, when it's parent is cloned it's reset() called again
> (setStartNode(int) would be called if the object is an _iterator of a
> StepIterator).
> 
> Looking at the example above, it appears that by the time the root is cloned
> and told to reset(), TCI2 has been told to reset/set-its-starting-node as
> many times as the tree is deep minus one (5 times).
> 
> Besides being a performance issue, I'll wager that this mutltipe-resetting
> behaviour is the source of problem for bug #5152. I'll claim that the
> iterator tree above appears to break during the _second_ initialization of
> TCI3 (ie: during SI1#). Starting from SI4#, I'll outline the codepath:
> 
> 1. During SI4#, TAI._attribute is correctly set but TCI2.next moves to 0
> 2. SI2# calls SI2.reset (by way of clone.reset)
> 3. SI2.reset calls SI3.next to initialize TCI3 (and succeeds), and during
> the call TCI1.next moves to 0 and TAI._attribute is consumed (it is now 0 as
> well).
> 4. SI1# calls SI2.reset (by way of SI1.reset)
> 5. SI2.reset calls SI3.next to initialize TCI3
> 6. SI3.next fails to get a node from either _iterator (by way of SI4.next
> failing - itself failing from TAI and TCI2 being 0) or _source (TCI1 has no
> more nodes)
> 
> Since SI3.next returned 0, TCI3 is set to 0, SI2.next returns 0, TCI4 is set
> to 0 and SI1.next returns 0 thereafter.
> 
> Where I see things falling apart is in SI3.next (StepIterator.javaV1.9L140)
> during SI2#. Could a fix be to check (during cloning only) if _source came
> to END then _source could be reset and _iterator was reinitialized? This
> would be a pretty ugly patch.
> 
> Perhaps another way would be to have a final method in NodeIteratorBase that
> would start the cloning of the iterator tree and then reset it - once only.
> Like the way SyntaxTreeNode.parseContents/Children works - uh, only
> different. If the main entrypoint was still called cloneIterator then only
> the names of the iterator methods that perform the cloning would need to
> change - nothing in the compiler-proper.
> 
> Thoughts? Please straighten me out if things look crooked,
> 
> john

Reply via email to