You may want to search the archives of this mailing list for use of the word "pruning". We're very aware that there's an opportunity for improvement in storage management. There are some challenges in finding a clean way to implement that storage recovery given the characteristics of our current data model (DTM, not DOM), and larger challenges in recognizing when data really won't ever again be referenced (remember, XSLT allows searching previous/parent axes, so in the general case we have to assume that everything MAY have to be retained). Open area of research.
If you insist on trying to do this in today's Xalan: If you tell Xalan to discard whitespace, it doesn't build those nodes into the data model; that may reduce your storage somewhat. Definitely use SAX input; we're more efficient when processing SAX than when reading from a DOM. If your problem were an extraction rather than an insertion, I'd recommend turning on incremental model construction, which would allow us to stop building the model once we've found what you're looking for (not always a help, depending on where the data is in your document, but it at least improves the odds). If all else fails, most JVMs will let you raise your maximum heap size, though they may not let you increase it enough to handle these docs.

But until we have pruning working, I would recommend coding simple insertions such as the one you describe at the SAX level rather than in XSLT. Different tools are optimized for different tasks, and this isn't one which Xalan is currently set up to handle well... though we agree that we want it to do so in the future.

______________________________________
Joe Kesselman / IBM Research
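For illustration, a SAX-level insertion of the kind recommended above might look roughly like the following. This is only a sketch using the standard JAXP and SAX classes, not anything taken from Xalan itself. The class name SaxInsert, the helper insertAfter, and the element names in the example are hypothetical, and it assumes the insertion is simply "add one new element after every element with a given name". Because the filter forwards events as they arrive, no document model is built; for very large documents you would read from and write to streams or files rather than strings.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical example class; the names here are illustrative only.
class SaxInsert {

    /** Forwards every SAX event unchanged and emits one extra element
     *  immediately after each end tag whose qualified name is anchorName. */
    static class InsertFilter extends XMLFilterImpl {
        private final String anchorName;
        private final String newName;
        private final String newText;

        InsertFilter(XMLReader parent, String anchorName, String newName, String newText) {
            super(parent);
            this.anchorName = anchorName;
            this.newName = newName;
            this.newText = newText;
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            // Pass the original end tag downstream first.
            super.endElement(uri, localName, qName);
            if (anchorName.equals(qName)) {
                // Then synthesize the inserted element (no namespace, no attributes).
                super.startElement("", newName, newName, new AttributesImpl());
                char[] chars = newText.toCharArray();
                super.characters(chars, 0, chars.length);
                super.endElement("", newName, newName);
            }
        }
    }

    /** Streams xml through the filter and serializes the result.
     *  Strings are used only to keep the demo small. */
    static String insertAfter(String xml, String anchorName, String newName, String newText)
            throws Exception {
        SAXParserFactory parserFactory = SAXParserFactory.newInstance();
        parserFactory.setNamespaceAware(true);
        XMLReader reader = parserFactory.newSAXParser().getXMLReader();
        XMLReader filter = new InsertFilter(reader, anchorName, newName, newText);

        // An identity transformer is used here purely as a SAX-to-text serializer.
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        serializer.transform(new SAXSource(filter, new InputSource(new StringReader(xml))),
                             new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(insertAfter("<doc><a>1</a><b>2</b></doc>", "a", "note", "hi"));
    }
}
```

The filter only overrides endElement, so everything else in the document passes through untouched. More elaborate insertions (conditional on attributes or on position, say) would need a little state in the filter, but memory use stays independent of document size.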
