The problem with benchmarking XSLT -- like benchmarking any programming language -- is figuring out what kinds of samples are typical, or at least informative, for specific sets of users... and then making sure those tests are themselves essentially honest. I've seen solutions ranging from taking 25 or so samples of varying degrees of normality (the XSLTMark suite) to simply doing a performance benchmark on our regression tests (which aren't at all typical or evenly distributed, but which do exercise most of the system).
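Whichever sample set you use, the harness itself can be trivial. A minimal sketch in Java against the standard JAXP TransformerFactory API (the file names, warm-up count, and run count here are placeholders, not anything from either suite):

import javax.xml.transform.Source;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import java.io.StringWriter;

public class TimeTransform {
    public static void main(String[] args) throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();

        // Compile the stylesheet once; we want to time the transform,
        // not stylesheet compilation. (Time newTemplates separately if
        // compilation cost is what you're chasing.)
        Templates templates =
            factory.newTemplates(new StreamSource("test.xsl"));

        // A few warm-up runs so JIT compilation and classloading
        // don't get charged to the measurement.
        for (int i = 0; i < 5; i++) {
            templates.newTransformer().transform(
                new StreamSource("test.xml"),
                new StreamResult(new StringWriter()));
        }

        // Timed runs; report the per-transform average.
        final int runs = 20;
        long start = System.currentTimeMillis();
        for (int i = 0; i < runs; i++) {
            templates.newTransformer().transform(
                new StreamSource("test.xml"),
                new StreamResult(new StringWriter()));
        }
        long elapsed = System.currentTimeMillis() - start;
        System.out.println("average ms per transform: "
            + ((double) elapsed / runs));
    }
}

Note that parsing the source document is included in the timing here, since a fresh StreamSource is read on every run; feed the loop a prebuilt DOM instead if you want to isolate pure transformation time.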
(I've used both of those when doing analysis to see whether a change in the code appears to make things better or worse. I'm not sure I'd recommend either as a good predictor of how any particular stylesheet will perform on any particular source document. But at least they provide some indication of whether we're going in a useful direction.)

______________________________________
Joe Kesselman / IBM Research
