Ah OK... thanks for the reply and sorry for being slow in replying :-)

I was curious because I was always under the impression that stream/SAX-based XSLT could (in theory) handle messages of any size (huge messages). I'd never thought about the full random access aspect. Anyway, I tried it with Xalan, Saxon and the default in JDK 1.5, and none of them managed the 100 MB file (out-of-the-box -- I didn't want to go playing with VM memory settings). I thought I might be doing something wrong.

So my interest in this stems from a Codehaus project that I am the lead on. The main component there is a tool called Smooks (http://milyn.codehaus.org/Smooks), which allows you to filter a range of different message input formats (XML, EDI, Java, etc.) and target "Visitor" logic at the message fragments to produce a range of different message output formats (XML, EDI, Java, etc.). Among other things, it has support for fragment-based templating using XSLT and a few other templating solutions (FreeMarker, etc.). So, it actually does something similar to your suggested "subtree through Xalan" approach wrt XSL (but only via DOM, not SAX). Using the SAX filter, we've also managed to do certain types of processing on huge messages (GBs), i.e. complex splitting, routing and persistence.

Thanks again for your reply :-)

Regards,

Tom.


[EMAIL PROTECTED] wrote:
In general, XSLT -- because it has full random access to the source document -- needs access to the whole input at once, which means it has trouble with huge documents. There have been attempts at "streaming" processors, but they generally handle only limited subsets of the language, and Xalan doesn't have much (if any) capability in that area.

For more detailed background, check the archives of this mailing list; search for discussion of "streaming" and "pruning".

Depending on the exact characteristics of your problem, the simplest solution may be to rewrite it as a custom SAX application, possibly passing selected subtrees through Xalan. Or you might want to check whether your needs fit into the range where one of the existing limited-streaming processors (such as the one in the Datapower product) could handle it.
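
As a rough illustration of the "custom SAX application, passing selected subtrees through the XSLT processor" idea, here is a minimal Java sketch using only the standard JAXP API. The class name SubtreeTransformer, the run helper and the recordName parameter are hypothetical names chosen for this example. It assumes the input is a flat sequence of repeating record elements, ignores namespaces, and holds only one subtree in memory at a time; it is not meant as a definitive implementation.

```java
import java.io.Reader;
import java.io.Writer;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: streams the input with SAX, builds a small DOM for
// each "record" subtree, and runs the stylesheet on that subtree only.
class SubtreeTransformer extends DefaultHandler {
    private final String recordName;      // element name that marks a subtree (assumption)
    private final Transformer transformer;
    private final Writer out;
    private final DocumentBuilder builder;
    private Document doc;                 // DOM for the subtree being collected
    private Node current;                 // null while outside a record

    SubtreeTransformer(String recordName, Transformer transformer, Writer out)
            throws Exception {
        this.recordName = recordName;
        this.transformer = transformer;
        this.out = out;
        this.builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        if (current == null) {
            if (!qName.equals(recordName)) {
                return;                   // skip everything outside records
            }
            doc = builder.newDocument();  // start a fresh, small DOM
            current = doc;
        }
        Element e = doc.createElement(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            e.setAttribute(atts.getQName(i), atts.getValue(i));
        }
        current.appendChild(e);
        current = e;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (current != null) {
            current.appendChild(doc.createTextNode(new String(ch, start, length)));
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        if (current == null) {
            return;
        }
        current = current.getParentNode();
        if (current == doc) {             // the record element just closed
            try {
                transformer.transform(new DOMSource(doc), new StreamResult(out));
            } catch (TransformerException ex) {
                throw new SAXException(ex);
            }
            doc = null;                   // drop the subtree so it can be collected
            current = null;
        }
    }

    // Parses xml as a stream and applies xslt to each recordName subtree.
    static void run(Reader xml, Reader xslt, String recordName, Writer out)
            throws Exception {
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(xslt));
        SubtreeTransformer handler = new SubtreeTransformer(recordName, t, out);
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(xml), handler);
    }
}
```

Because the stylesheet only ever sees one subtree, templates in this sketch cannot look at anything outside the current record (ancestors, siblings, document-wide keys). That is the trade-off that makes the memory use independent of the total input size.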

______________________________________
"... Three things see no end: A loop with exit code done wrong,
A semaphore untested, And the change that comes along. ..."
  -- "Threes" Rev 1.1 - Duane Elms / Leslie Fish (http://www.ovff.org/pegasus/songs/threes-rev-11.html)
