An XSLT processor needs to keep the whole document in memory because the
template for any node can reference nodes before and after it. So there
will always be an upper limit to how large an XML document you can
process with XSLT. If you can limit the references between nodes, you
should be able to process the document as a stream through a SAX
handler. I would use one pass through a SAX handler to extract the
elements of interest, then pass those elements through an XSLT
processor. Apache Cocoon offers a good model for this kind of "pipeline"
processing.
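
As a rough sketch of that approach (the element name "record", the stylesheet name, and the file names below are only placeholders), a SAX filter based on org.xml.sax.helpers.XMLFilterImpl can sit between the parser and the transformer and forward only the elements of interest, so the XSLT stage never has to buffer the rest of the document:

    import java.io.FileOutputStream;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.sax.SAXSource;
    import javax.xml.transform.stream.StreamResult;
    import javax.xml.transform.stream.StreamSource;
    import org.xml.sax.Attributes;
    import org.xml.sax.InputSource;
    import org.xml.sax.SAXException;
    import org.xml.sax.XMLReader;
    import org.xml.sax.helpers.XMLFilterImpl;
    import org.xml.sax.helpers.XMLReaderFactory;

    public class FilterThenTransform {

        /** Forwards only the document root and <record> subtrees downstream. */
        static class RecordFilter extends XMLFilterImpl {
            private int docDepth = 0;   // overall element nesting depth
            private int keepDepth = 0;  // depth inside a kept <record> subtree

            RecordFilter(XMLReader parent) { super(parent); }

            @Override
            public void startElement(String uri, String local, String qName,
                                     Attributes atts) throws SAXException {
                docDepth++;
                if (docDepth == 1) {                       // always keep the root
                    super.startElement(uri, local, qName, atts);
                } else if (keepDepth > 0 || "record".equals(local)) {
                    keepDepth++;
                    super.startElement(uri, local, qName, atts);
                }
            }

            @Override
            public void characters(char[] ch, int start, int len) throws SAXException {
                if (keepDepth > 0) super.characters(ch, start, len);
            }

            @Override
            public void endElement(String uri, String local, String qName)
                    throws SAXException {
                if (docDepth == 1) {                       // closing the root
                    super.endElement(uri, local, qName);
                } else if (keepDepth > 0) {
                    keepDepth--;
                    super.endElement(uri, local, qName);
                }
                docDepth--;
            }
        }

        public static void main(String[] args) throws Exception {
            XMLReader reader = XMLReaderFactory.createXMLReader();
            RecordFilter filter = new RecordFilter(reader);

            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource("transform.xsl"));
            // The transformer only sees (and buffers) what the filter lets through.
            t.transform(new SAXSource(filter, new InputSource("big-input.xml")),
                        new StreamResult(new FileOutputStream("output.xml")));
        }
    }

How well this works depends on how much of the document the filter can discard; if the stylesheet really needs to see most of the input, the transformer will still have to hold most of it in memory.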

Barbara

-----Original Message-----
From: Rajesh Raheja [mailto:[EMAIL PROTECTED] 
Sent: Monday, December 08, 2003 1:14 PM
To: xalan-c-users@xml.apache.org; [EMAIL PROTECTED]
Subject: Memory Consumption for Large XML Transformation (50MB to 1GB)
with SAX


We are trying to transform very large XML documents (30MB to 1GB in
size) and were planning to use an XSLT engine for it. Our tests showed
that passing the document in as a DOM typically crashed with
out-of-memory errors.

However, even when passing in SAX events, the memory consumption was
around FIVE times the document size (e.g. a 50MB input document
consumed 250MB of JVM memory).

We would appreciate any input on ways to reduce the memory
consumption, and more generally: is XSLT the way to go for such large
documents? What are the alternatives? (By the way, we tried asking the
customer to reduce or break up the document - not feasible!)

Thanks
Rajesh

