>Xalan/XSLT builds an in-memory representation of the entire document.

Not quite. In incremental mode, Xalan builds a model of only as much of the document as is actually referenced. If your stylesheet only looks at the first half, the second half never gets loaded. This may help a bit... but since XSLT/XPath could at any point ask to look at earlier data again, the basic processing model does assume that everything read so far is in memory at once.
See the archives of this list, and the developers' list, for discussion of "streaming" and "pruning". We do want to extend Xalan to allow it to recognize when information is no longer needed and avoid keeping it in memory, but that is an ongoing research effort. As Christopher said, you may find that for problems of the sort you describe -- very large documents, very simple filtering transformations -- it's worth hand-coding a SAX-based solution.
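
To make the hand-coded SAX suggestion concrete, here is a minimal sketch of a streaming filter, assuming the "very simple filtering transformation" is dropping every subtree rooted at a given element. The class name DropElementFilter, the helper methods run and filter, and the element name "skip" are all made up for illustration; only the standard JAXP/SAX classes are real. It keeps just a depth counter in memory, so the document size should not matter.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Result;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical example: passes SAX events through, dropping one kind of subtree. */
public class DropElementFilter extends XMLFilterImpl {
    private final String dropName; // qualified name of the element to remove (assumption)
    private int depth = 0;         // > 0 while we are inside a dropped subtree

    public DropElementFilter(XMLReader parent, String dropName) {
        super(parent);
        this.dropName = dropName;
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        if (depth > 0 || qName.equals(dropName)) {
            depth++;               // swallow this element and everything below it
            return;
        }
        super.startElement(uri, local, qName, atts);
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        if (depth > 0) {
            depth--;               // still unwinding a dropped subtree
            return;
        }
        super.endElement(uri, local, qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (depth == 0) {
            super.characters(ch, start, length);
        }
    }

    /** Streams 'in' through the filter; an identity Transformer serializes the events. */
    public static void run(InputSource in, Result out, String dropName) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        identity.transform(new SAXSource(new DropElementFilter(reader, dropName), in), out);
    }

    /** Convenience wrapper for small inputs; real use would pass file streams to run(). */
    public static String filter(String xml, String dropName) throws Exception {
        StringWriter out = new StringWriter();
        run(new InputSource(new StringReader(xml)), new StreamResult(out), dropName);
        return out.toString();
    }
}
```

For a large file you would call run() with an InputSource and StreamResult wrapping file streams rather than strings. The identity Transformer is used here only as a serializer, with no stylesheet involved; whether a particular processor buffers anything in that mode is worth checking against your own data.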
