Thanks Dave, that was helpful.
Are there any other XSLT libraries that parse the XML as a stream and do
not consume as much memory? I have read about Saxon-SA, which claims to
support XML files of up to 20 GB; I will test it shortly. Too bad it's
commercial. There is also another commercial implementation, from Intel,
which is supposed to handle large XML files.

By the way, I tried the Joost STX library yesterday and it works pretty
well.
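
In case it helps anyone searching the archives later, this is roughly how
I am driving Joost through the standard JAXP/TrAX API. It is only a
sketch: the factory class name (net.sf.joost.trax.TransformerFactoryImpl)
is what I found in the Joost documentation, and the embedded STX sheet
with its <note> element is just a placeholder based on my reading of the
STX spec, so adjust both for your own setup.

    import java.io.StringReader;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.stream.StreamResult;
    import javax.xml.transform.stream.StreamSource;

    public class StxDemo {

        // A tiny STX sheet: pass-through='all' copies everything through,
        // and the empty template drops <note> elements (a placeholder name)
        // together with their subtrees as they stream past.
        private static final String STX_SHEET =
            "<stx:transform xmlns:stx='http://stx.sourceforge.net/2002/ns'"
          + " version='1.0' pass-through='all'>"
          + "  <stx:template match='note' />"
          + "</stx:transform>";

        public static void main(String[] args) throws Exception {
            // Ask JAXP for Joost's TrAX factory instead of the default XSLT
            // processor.  The class name is what the Joost docs give; verify
            // it against the version you have installed.
            System.setProperty("javax.xml.transform.TransformerFactory",
                               "net.sf.joost.trax.TransformerFactoryImpl");

            Transformer t = TransformerFactory.newInstance().newTransformer(
                    new StreamSource(new StringReader(STX_SHEET)));

            // args[0] = large input XML, args[1] = output file.  Joost feeds
            // the input through as SAX events, so the whole document is never
            // held in memory at once.
            t.transform(new StreamSource(args[0]), new StreamResult(args[1]));
        }
    }

Since Joost plugs into TrAX, the surrounding javax.xml.transform code is
the same as for an ordinary XSLT transformation; only the factory and the
stylesheet change.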

On Dec 18, 2007 9:09 AM, David Bertoni <[EMAIL PROTECTED]> wrote:

> Anton Khodakivskiy wrote:
> >
> >
> > ---------- Forwarded message ----------
> > From: Anton Khodakivskiy <[EMAIL PROTECTED]>
> > Date: Dec 18, 2007 8:39 AM
> > Subject: XSLT: transforming large XML files, 1 GB+
> > To: [EMAIL PROTECTED]
> >
> >
> > Hello
> >
> > I'm looking for a generic way to transform large XML files - possibly
> > 1 GB and more. As you can imagine, my biggest concerns are memory
> > usage and performance. I have tried the command-line tool Xalan.exe,
> > and it looks like it loads the whole XML document - I'm not sure what
> > for, but I expect that it parses the XML into a DOM. Is it possible to
> > use a SAX-based XML parser for the XSLT transformation in Xalan, or
> > something like that?
> Xalan-C doesn't use the DOM per se, although it does use a tree
> representation of the input XML.  The differences are primarily related
> to reducing memory usage by implementing a read-only tree, which is all
> that's necessary for XSLT processing.
>
> Because the XPath language provides random access to the source tree,
> most XSLT processors use an in-memory representation, rather than
> trying to do streaming processing.  If you can reduce your
> transformation to a streaming subset of XPath, you might try STX:
>
> http://www.xml.com/pub/a/2003/02/26/stx.html
>
> >
> > Also, I have read that "it's not recommended to use XSLT on big XML
> > files", but I haven't found a meaningful explanation for it. What do
> > you think? Are there any alternative approaches to generic XML
> > transformation that satisfy my needs (big XML files)?
>
> I think you'll find that Xalan-C's memory footprint for a 1GB XML
> document will be much less than 1GB of memory, although it can vary
> widely depending on the document.  In addition, for documents that have
> a lot of repeated text nodes, you can enable pooling of text nodes to
> further reduce the memory footprint of the source tree.
>
> Whether something's "recommended" or not depends on your requirements.  A
> blanket statement like that doesn't reflect every possible set of
> requirements in the real world.
>
> Dave
>
