Hi Holger,
> first of all I would like to mention I am not a performance freak. I think
> the conformance and stability is more important than some percent of
> performance.
Me too...
> I had a look at the 2003 "Sarvega XSLT Benchmark study". There is one
> testcase where Xalan-C (1.5, but also 1.7) reaches only 13% of the average
> throughput of all other tested processors (xalanj, libxslt, saxon, resin,
> xsltc, xt, msxml, jd). A really bad outlier. The testcase is a
> transformation of one docbook sample document to html using the standard
> docbook stylesheets.
OK, that's really bad. Can you provide more details about what we need to
reproduce this and we'll see what we can do?
> I glimpsed into the code and did some performance measurements on my own.
> Xalan-C will be up to 5 times faster for this docbook-transformation, if I
> disable the strip-whitespace-processing. I have done this by modifying the
> function StylesheetRoot::shouldStripSourceNode to return false all the time
> (a really radical method and definitely results in wrong results ;^). The
> reason for this performance leak is - in my opinion - the handling of the
> element names given in the stylesheets "xsl:preserve-space",
> "xsl:strip-space". They will be evaluated and scored as full XPaths which
> is an expensive operation.
OK, there are a couple of problems from what I can see. We rely on using
match patterns (not full XPath expressions), because that was probably the
easiest way to do it when things were implemented. I've wanted to make
improvements to this code for a while now, but I think we will need to
prioritize it. We also have a bit of an implementation issue, because we
are storing all of these match patterns in the root stylesheet, which is
not correct. I think we can do a staged improvement of this which will
help alleviate some of the performance problems.
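As a rough illustration of what a first stage could look like, here is a
minimal sketch that answers the strip/preserve question with a hash lookup
on the element name instead of evaluating and scoring a match pattern for
every node. It is not Xalan-C code: the class WhitespaceRules and its
members are hypothetical names, and the sketch assumes plain element names
plus the "*" wildcard only, ignoring namespaces, prefix:* tests and import
precedence.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical helper, not part of Xalan-C: records the NameTests listed in
// xsl:strip-space / xsl:preserve-space and answers lookups by element name.
class WhitespaceRules
{
public:
    // Register one NameTest ("para", or "*" for every element).
    // strip is true for xsl:strip-space, false for xsl:preserve-space.
    void add(const std::string& nameTest, bool strip)
    {
        assert(!nameTest.empty());

        if (nameTest == "*")
        {
            m_default = strip;
        }
        else
        {
            m_exact[nameTest] = strip;
        }
    }

    // An exact name beats the "*" wildcard, mirroring the default priorities
    // of match patterns (0 for a QName, -0.5 for "*").
    bool shouldStrip(const std::string& elementName) const
    {
        const auto it = m_exact.find(elementName);

        return it != m_exact.end() ? it->second : m_default;
    }

private:
    std::unordered_map<std::string, bool> m_exact;

    // XSLT preserves whitespace for all elements unless told otherwise.
    bool m_default = false;
};
```

With a stylesheet that says strip-space elements="*" and preserve-space
elements="programlisting", the rules would be registered with add("*", true)
and add("programlisting", false), and each source element then costs one
hash lookup.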
> Maybe I am wrong with my analysis, but if I am right, I think one should
> mention this behaviour within the section "What can I do to speed up
> transformations?" of the Xalan-C-FAQ.
Yes, we can mention it, since it can be a performance issue even with a
good implementation.
Thanks for the post!
Dave