DTM currently has a hardcoded maximum number of nodes; documents larger than that will fail to load. The DTM Manager also currently has a hardcoded maximum number of documents it can manage simultaneously. Exceeding either limit will cause a failure.
For the first problem (document size limits), the short-term workaround is to subdivide your documents into smaller files. Note that subdividing them too far risks running into the number-of-documents limit instead. In the longer term: I've been working on some code that would let us discard portions of the document model that are no longer needed and recover some of those node handles. It isn't complete enough to check in even as an experimental branch, and I suspect the current approach isn't the right one. There has also been some discussion of increasing the node handle size from 32 bits to 64 and allocating more bits (probably 32) to node addressing. This would be an extremely expensive change; the alterations to Xalan would be pervasive and not easy to isolate. We _really_ don't want to consider this until after our next general (as opposed to developer's) release; there is just too much risk of destabilizing the whole system.

For the second problem, there is a short-term workaround in the implementation of <xsl:for-each> that may be useful. _IF_ the select pattern is rooted on a document() call, adding the Processing Instruction <?xalan:doc-cache-off?> within the body of the <xsl:for-each> serves as a hint to Xalan that the document so loaded may be discarded after each pass rather than being retained in the DTMManager's cache. This partly addresses the case where -- as suggested above -- you are scanning through a series of documents representing chapters of a larger composite document. A sketch of that usage appears below. Possible long-term solutions include making Xalan's cache smarter about when it can discard documents without such hints and -- if and when we're ready to consider it -- increasing the size of the node handle datatype so more bits are available for document selection.
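For illustration, a minimal stylesheet using this hint might look something like the following. The element names and structure here (book, chapter, @href) are invented for the example and are not part of Xalan's API; only the placement of the Processing Instruction inside an <xsl:for-each> whose select is rooted on document() is the point being shown.

    <xsl:stylesheet version="1.0"
                    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:template match="/">
        <!-- The select pattern is rooted on a document() call, as
             required for the hint to apply. -->
        <xsl:for-each select="document(/book/chapter/@href)">
          <!-- Hint to Xalan that each chapter document may be discarded
               from the DTMManager's cache after this pass. -->
          <?xalan:doc-cache-off?>
          <xsl:apply-templates select="."/>
        </xsl:for-each>
      </xsl:template>
    </xsl:stylesheet>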
