I have been doing some performance analysis this week comparing Xalan 
2.1 vs. 2.2 because we have been unable to use version 2.2 because it is 
so much slower.  I thought I would share what I have found so far in 
case it helps.

First of all, when testing just a single transformation at a time on a 
machine with plenty of RAM to hold everything in memory, the two 
versions perform about the same.   Version 2.1 performed slightly better 
in all the test cases I tried except when the stylesheet was fairly 
trivial and input XML was large, in which case 2.2 performed slightly 
better.  But in either case the numbers were close.

The big difference between the two versions is the amount of memory used 
during a transformation, and this is what is bringing our server to its 
knees because with multiple concurrent transformations, the amount of 
memory consumed is many times what it was with 2.1.  I was able to 
reproduce the behavior we are experiencing using a single stylesheet.

The test I used to analyze this behavior is a single, fairly complex 
stylesheet that is compiled only once and used to transform an XML 
document with a single <page/> element.  I ran the exact same 
transformation 10 times in succession using the cached compiled 
stylesheet, and dumped out the memory consumed by the JVM after each 
transformation.  I found that after a few iterations, the memory used by 
the JVM is double what it was with v2.1.0.  I then ran the same tests 
but ran the garbage collector before each iteration.  This had no effect 
on v2.1.0, but it improved the v2.2 numbers, which would indicate that 
much of the problem is due to a lot more memory being allocated during 
the transformation, but is eventually freed.  In between garbage 
collection runs, the amount of used memory becomes very large.  I used 
JDK1.2.2 for the test.  JDK1.3 core dumps, so I could not test it.  Here 
are the numbers:

    JVM (1.2.2) Memory Consumption (Kbytes)
     v2.1.0   v2.2.D13   v2.1 w/GC  D13 w/GC
----------------------------------------------
1     5680     7000      5680       6028
2     5680     7000      5680       6028
3     5680     9100      5680       7420
4     5680     9100      5680       7420
5     5680    10256      5680       7420
6     5680    10256      5680       7420
7     5680    10256      5680       7420
8     5680    11280      5680       7420  <---------
(numbers remain the same after the 8th iteration)

I am continueing my investigation by looking at the objects that are 
being allocated during the transformation to find out where the 
differences are.  The objects that are consuming most of the memory are 
listed below with the amount of allocation of each listed for each 
version after 6 iterations.  This does not take into account free 
objects, just total allocations.

   Major Object Allocations  (Kbytes)
                   v2.1.0    v2.2.D13
------------------------------------------
char[]             4142      7153
int[]               825      3326
Object[]           1756      2220
AttributeIterator  1125      1878
byte[]              769      1463

So far I have only analyzed the first one, the char[] objects.  In v2.1, 
most of the allocation comes from the StringBuffer class, but in v2.2, 
70% of the char[] allocations come from the OutputStreamWriter.write() 
method, called from SerializerToXML.accum()

I hope to get more information tomorrow.

Chris




Reply via email to