The DTM model can indeed run out of space if a source document is too
large. Currently we allocate 20 bits for node addressing, so we can handle
about a million nodes. Beyond that we'll fail, probably without a clear
diagnostic.
We're considering increasing that, though doing so will impose tighter
limits on how many documents a given DTMManager can keep track of
simultaneously. We could remove the restriction by switching to longs as
our basic node ID type, but that would increase storage and processing
costs.
The size of _text_ nodes shouldn't be a problem... I think.
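For context, here's a rough sketch of the kind of bit-packed handle scheme
described above: low bits address a node within one document, the remaining
bits identify which document the manager is tracking. The class, constants,
and method names are purely illustrative, not the actual DTM implementation.

    // Hypothetical sketch of a bit-packed node handle. NODE_BITS, the
    // layout, and all names here are assumptions for illustration only.
    public final class NodeHandleSketch {
        // 20 bits for the node offset within a document (~1M nodes)
        static final int NODE_BITS = 20;
        static final int NODE_MASK = (1 << NODE_BITS) - 1;       // 0x000FFFFF
        // Remaining bits of a 32-bit int identify the document
        static final int MAX_DOCUMENTS = 1 << (32 - NODE_BITS);  // 4096

        static int makeHandle(int documentId, int nodeOffset) {
            return (documentId << NODE_BITS) | (nodeOffset & NODE_MASK);
        }

        static int documentOf(int handle) { return handle >>> NODE_BITS; }
        static int nodeOf(int handle)     { return handle & NODE_MASK; }

        public static void main(String[] args) {
            int handle = makeHandle(3, 123_456);
            System.out.println(documentOf(handle) + " / " + nodeOf(handle));
            // Giving more bits to the node offset raises the per-document
            // limit but shrinks MAX_DOCUMENTS; switching handles to long
            // removes both limits at the cost of storage and processing.
        }
    }

The trade-off mentioned earlier falls straight out of this layout: every bit
moved to the node offset halves the number of documents a manager can
address at once.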
Could you show us a complete stack trace of the exception, so we can
investigate where the problem is occurring?