John, I think the problem here is that the node value iterator needs to be able to restart its source iterator at any time. This is similar to the way a step iterator needs to force its '_iterator' iterator to be restartable. In your case SI3 will keep restarting NVI for each node it gets from SI2. NVI then needs to do a complete restart of SI4 for each time this happens. I have implemented a fix for this and I will integrate it into the source tree shortly.
Morten John Howard wrote: > > Hi again all, > > In this installment of the neverending NodeIterator-question series, I'll > discuss the tantilizing details of bug > http://nagoya.apache.org/bugzilla/show_bug.cgi?id=5152. > > The path "/a/b[@name = 'foo']/c/text()" compiles to (mostly) form (the / at > the front adds a root AbsoluteIterator - I'll ignore that at my peril): > > SI1 > / \ > SI2 TCI4 > / \ > SI3 TCI3 > / \ > TCI1 NVI legend: > / SI - StepIterator > SI4 TCI - TypedChildIterator > / \ NVI - NodeValueIterator > TCI2 TAI TAI - TypedAttributeIterator > > (for StepIterators / is the _source and \ is the _iterator) > > Consider the following code from bug#5152: > > par<xsl:value-of select="$par"/>par > var<xsl:value-of select="$var"/>var > > As reported in the bug, $par will generate output whereas $var will not. > > In the $par case, BasisLibrary.stringF(Object, DOM) is passed an uncloned > iterator tree, resets it, and uses next() to grab the first value (see code > below): > > return dom.getNodeValue(((NodeIterator)obj).reset().next()); > > In the $var case, a (different) uncloned iterator tree is reset, cloned and > next() is used to grab the first value (see code below): > > int i_5_ = nodeiterator_0_.reset().cloneIterator().next(); > String string = dom_6_.getNodeValue(i_5_); > > It would appear that cloneIterator is causing some problems. First, let me > relate how understand what the thing does: cloneIterator serves to make a > throw-away copy of an iterator. All it's objects and all their states are > copied so that the iterator can then be used (and in the course of being > used, modified). > > cloneIterator works by cloning 'this', then calling 'clone.reset()': reset() > is called after each subtree is cloned. I think I can summarise the > cloning/resetting of the iterator tree in bug #5152 as objects are cloned > (@) and then each clone is reset (#). This stuff happens in the following > order: SI1@ SI2@ SI3@ TCI1@ TCI1# NVI@ SI4@ TCI2@ TCI2# TAI@ TAI# SI4# NVI# > SI3# TCI3@ TCI3# SI2# TCI4@ TCI4@ SI1#. > > Notice that after TAI# we are unwinding from the deepest part of the > callstack. Notice also that as we unwind from SI4# we then execute NVI#. > >From looking at the code, we see that NodeValueIterator.cloneIterator() is > (mostly) implemented as (_source is object SI4): > > NodeValueIterator clone = (NodeValueIterator)super.clone(); > clone._source = _source.cloneIterator(); > return clone.reset(); > > and that even-though _source.cloneIterator performs a reset, since > NodeValueIterator.reset is implemented as: > > _source.reset(); > return resetPosition(); > > clone.reset() will call SI4's reset function again. Thus it appears that > every cloned object reset()s itself and then if it's not the root of the > iterator tree, when it's parent is cloned it's reset() called again > (setStartNode(int) would be called if the object is an _iterator of a > StepIterator). > > Looking at the example above, it appears that by the time the root is cloned > and told to reset(), TCI2 has been told to reset/set-its-starting-node as > many times as the tree is deep minus one (5 times). > > Besides being a performance issue, I'll wager that this mutltipe-resetting > behaviour is the source of problem for bug #5152. I'll claim that the > iterator tree above appears to break during the _second_ initialization of > TCI3 (ie: during SI1#). Starting from SI4#, I'll outline the codepath: > > 1. During SI4#, TAI._attribute is correctly set but TCI2.next moves to 0 > 2. SI2# calls SI2.reset (by way of clone.reset) > 3. SI2.reset calls SI3.next to initialize TCI3 (and succeeds), and during > the call TCI1.next moves to 0 and TAI._attribute is consumed (it is now 0 as > well). > 4. SI1# calls SI2.reset (by way of SI1.reset) > 5. SI2.reset calls SI3.next to initialize TCI3 > 6. SI3.next fails to get a node from either _iterator (by way of SI4.next > failing - itself failing from TAI and TCI2 being 0) or _source (TCI1 has no > more nodes) > > Since SI3.next returned 0, TCI3 is set to 0, SI2.next returns 0, TCI4 is set > to 0 and SI1.next returns 0 thereafter. > > Where I see things falling apart is in SI3.next (StepIterator.javaV1.9L140) > during SI2#. Could a fix be to check (during cloning only) if _source came > to END then _source could be reset and _iterator was reinitialized? This > would be a pretty ugly patch. > > Perhaps another way would be to have a final method in NodeIteratorBase that > would start the cloning of the iterator tree and then reset it - once only. > Like the way SyntaxTreeNode.parseContents/Children works - uh, only > different. If the main entrypoint was still called cloneIterator then only > the names of the iterator methods that perform the cloning would need to > change - nothing in the compiler-proper. > > Thoughts? Please straighten me out if things look crooked, > > john
