Title: Performance questions and possible bug

I've been doing some performance testing of Xalan-Java and Xalan-C++ for processing files that range from a few hundred Kbytes to a few hundred Mbytes. For the tests, I used Xalan-J 2.5.1 with JDK 1.4.2_01 and Xalan-C++ 1.5 on a Dual Pentium III PC with 1 GByte of memory running Windows 2K Professional.

I'm a bit surprised by the results: Xalan-C++ processing time grows linearly with the XML input size, while Xalan-J processing time appears to grow exponentially.
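As a rough way to tell these growth patterns apart from a handful of timings, one can compute the empirical growth exponent between two runs, k = log(t2/t1) / log(n2/n1). A k near 1 suggests linear time, a k near 2 suggests quadratic time, and a k that keeps climbing as the input grows suggests something worse than polynomial. The sketch below is illustrative only: the class name and the sample numbers are made up and are not taken from the attached spreadsheet.

```java
public class GrowthExponent {

    /**
     * Empirical growth exponent between two measurements:
     * (n1, t1) and (n2, t2) are (input size, elapsed time) pairs.
     */
    static double exponent(double n1, double t1, double n2, double t2) {
        return Math.log(t2 / t1) / Math.log(n2 / n1);
    }

    public static void main(String[] args) {
        // HYPOTHETICAL numbers, for illustration only (size in MB, time in seconds).
        double[] sizes = {1, 10, 100};
        double[] timesA = {2, 20, 200};     // shaped like linear growth
        double[] timesB = {2, 200, 20000};  // shaped like quadratic growth

        for (int i = 1; i < sizes.length; i++) {
            double kA = exponent(sizes[i - 1], timesA[i - 1], sizes[i], timesA[i]);
            double kB = exponent(sizes[i - 1], timesB[i - 1], sizes[i], timesB[i]);
            System.out.println(sizes[i - 1] + " -> " + sizes[i]
                    + " MB: k(A) = " + kA + ", k(B) = " + kB);
        }
    }
}
```

Applying the same calculation to consecutive rows of real measurements would show whether the Java numbers look polynomial (a stable k) or truly exponential (a k that grows with each step).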

To give a bit more context, the transformations we're mostly interested in flatten XML into relational structures. The attached ZIP contains three stylesheets that extract data from the input XML document at different nesting levels, a few sample documents, and an Excel spreadsheet that details the test results.

The structure of the input documents looks like:

<?xml version="1.0"?>
<customers>
  <customer id="0" name="Acme, Inc.">
    <orders>
      <order order_no="0">
        <items>
          <item item_no="12" quantity="260" />
          ...
        </items>
      </order>
      ...
    </orders>
    <addresses>
      <address street="645 Lake Blvd." city="Boston" state="MA" zip="01011" />
      ...
    </addresses>
  </customer>
</customers>

Some statistics:

- All documents contain 50 customer elements
- The count of order elements ranges from 1000 to 441439
- The count of item elements ranges from 2960 to 1323687
- The number of address elements is roughly constant at about 100

and the three transformations extract:

- The addresses of a customer
- The orders of a customer
- The items of an order
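To make the flattening concrete, here is a minimal sketch of what the address extraction might look like when driven through the standard JAXP API. The embedded stylesheet, the class name, and the pipe-delimited row format are assumptions for illustration; the actual stylesheets are the ones in the attached ZIP and may look quite different.

```java
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import java.io.StringReader;
import java.io.StringWriter;

public class FlattenAddresses {

    // HYPOTHETICAL stylesheet: one text row per address, prefixed with the customer id.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "<xsl:for-each select='customers/customer/addresses/address'>"
      + "<xsl:value-of select='../../@id'/><xsl:text>|</xsl:text>"
      + "<xsl:value-of select='@street'/><xsl:text>|</xsl:text>"
      + "<xsl:value-of select='@city'/><xsl:text>|</xsl:text>"
      + "<xsl:value-of select='@state'/><xsl:text>|</xsl:text>"
      + "<xsl:value-of select='@zip'/><xsl:text>&#10;</xsl:text>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Tiny input following the document structure shown above.
    static final String SAMPLE =
        "<customers><customer id='0' name='Acme, Inc.'>"
      + "<orders><order order_no='0'><items>"
      + "<item item_no='12' quantity='260'/>"
      + "</items></order></orders>"
      + "<addresses>"
      + "<address street='645 Lake Blvd.' city='Boston' state='MA' zip='01011'/>"
      + "</addresses>"
      + "</customer></customers>";

    /** Runs the stylesheet over the given XML and returns the flattened rows. */
    static String flatten(String xml) {
        try {
            // Uses whichever XSLT processor JAXP finds (Xalan, XSLTC, or the JDK default).
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.print(flatten(SAMPLE));
    }
}
```

The order and item extractions would follow the same pattern with a longer path in the `select` expression and the parent keys pulled in through `../` steps.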

In all three tests (Xalan-Java, XSLTC and Xalan-C++) I'm sending the output to stdout and redirecting it to a file.

I tested both the interpreted XSLT processor and XSLTC. The results are very similar, although XSLTC performs a little better as the input size increases. For Java, I had to increase the maximum heap size to 1 GByte (-Xmx option). I also experimented with the initial heap size (-Xms option) and got some improvement, but performance degraded dramatically as the input size approached the upper end of the tests (the results are included in the attached spreadsheet). One interesting detail from the -Xprof profiling option of java: the java.util.Vector.ensureCapacityHelper method seems to take most of the execution time (anywhere from 40% to 87%, rising with the size of the file).
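For readers wondering how a Vector's capacity growth could come to dominate a profile, the sketch below counts how many elements get copied while appending to a java.util.Vector under its two growth policies: doubling (capacityIncrement of 0) and a fixed increment. This is only an illustration of the general mechanism. It is not a claim about how Xalan-J uses Vector internally, and the class name, method name, and sizes are made up for the example.

```java
import java.util.Vector;

public class VectorGrowth {

    /**
     * Appends n elements and returns the total number of existing elements
     * that had to be copied each time the backing array was reallocated.
     */
    static long copiesWhileAppending(Vector<Integer> v, int n) {
        long copied = 0;
        int capacity = v.capacity();
        for (int i = 0; i < n; i++) {
            int sizeBefore = v.size();
            v.add(i);
            if (v.capacity() != capacity) {
                // The backing array grew, so all previous elements were copied.
                copied += sizeBefore;
                capacity = v.capacity();
            }
        }
        return copied;
    }

    public static void main(String[] args) {
        int n = 50000; // arbitrary size chosen for the illustration

        // capacityIncrement == 0: capacity doubles, total copying stays proportional to n.
        long doubling = copiesWhileAppending(new Vector<Integer>(10, 0), n);

        // capacityIncrement == 10: capacity grows by 10 each time,
        // so total copying grows roughly with n * n.
        long fixedStep = copiesWhileAppending(new Vector<Integer>(10, 10), n);

        System.out.println("elements copied with doubling:        " + doubling);
        System.out.println("elements copied with fixed increment: " + fixedStep);
    }
}
```

If something like the fixed-increment pattern were in play, the time spent in capacity growth would rise much faster than the input size, which would be consistent with the profile above, though only the profiler output and the source can confirm the real cause.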

I'm interested in getting comments from other people about their experience with performance. Is this behavior typical of the kind of transformation I'm doing?

Additionally, I had a problem using the translet that extracts all item elements. Starting with a document that contains 296380 item elements, the transformation aborts with a "Translet errors:No more DTM IDs are available" error. I looked through the FAQ and the mailing lists and didn't find anything about this, apart from an issue that existed in previous versions of Xalan-J and is no longer present in version 2.5.1.

My environment is as follows:

#---- BEGIN writeEnvironmentReport($Revision: 1.20 $): Useful stuff found: ----
version.DOM.draftlevel=2.0fd
java.class.path=d:/xalan-j_2_5_1/bin/xalan.jar;d:/xalan-j_2_5_1/bin/xml-apis.jar;d:/xalan-j_2_5_1/bin/xercesImpl.jar;.;d:/j2sdk1.4.2_01/lib;d:/j2sdk1.4.2_01/jre/lib
version.JAXP=1.1 or higher
java.ext.dirs=d:\j2sdk1.4.2_01\jre\lib\ext
#---- BEGIN Listing XML-related jars in: foundclasses.sun.boot.class.path ----
xalan.jar-path=d:\j2sdk1.4.2_01\jre\lib\endorsed\xalan.jar
xercesImpl.jar-apparent.version=xercesImpl.jar from xalan-j_2_5_0 from xerces-2_4
xercesImpl.jar-path=d:\j2sdk1.4.2_01\jre\lib\endorsed\xercesImpl.jar
xml-apis.jar-apparent.version=xml-apis.jar present-unknown-version
xml-apis.jar-path=d:\j2sdk1.4.2_01\jre\lib\endorsed\xml-apis.jar
#----- END Listing XML-related jars in: foundclasses.sun.boot.class.path -----
version.xerces2=Xerces-J 2.4.0
version.xerces1=not-present
version.xalan2_2=Xalan Java 2.5.1
version.xalan1=not-present
version.ant=not-present
java.version=1.4.2_01
version.DOM=2.0
version.crimson=present-unknown-version
sun.boot.class.path=d:\j2sdk1.4.2_01\jre\lib\endorsed\xalan.jar;d:\j2sdk1.4.2_01\jre\lib\endorsed\xercesImpl.jar;d:\j2sdk1.4.2_01\jre\lib\endorsed\xml-apis.jar;d:\j2sdk1.4.2_01\jre\lib\rt.jar;d:\j2sdk1.4.2_01\jre\lib\i18n.jar;d:\j2sdk1.4.2_01\jre\lib\sunrsasign.jar;d:\j2sdk1.4.2_01\jre\lib\jsse.jar;d:\j2sdk1.4.2_01\jre\lib\jce.jar;d:\j2sdk1.4.2_01\jre\lib\charsets.jar;d:\j2sdk1.4.2_01\jre\classes
#---- BEGIN Listing XML-related jars in: foundclasses.java.class.path ----
xalan.jar-path=d:\xalan-j_2_5_1\bin\xalan.jar
xml-apis.jar-apparent.version=xml-apis.jar present-unknown-version
xml-apis.jar-path=d:\xalan-j_2_5_1\bin\xml-apis.jar
xercesImpl.jar-apparent.version=xercesImpl.jar from xalan-j_2_5_0 from xerces-2_4
xercesImpl.jar-path=d:\xalan-j_2_5_1\bin\xercesImpl.jar
#----- END Listing XML-related jars in: foundclasses.java.class.path -----
version.SAX=2.0
version.xalan2x=Xalan Java 2.5.1
#----- END writeEnvironmentReport: Useful properties found: -----
# YAHOO! Your environment seems to be OK.

Thanks,

Hernando Borda

Software Developer

Ascential Software Corp.

<<perf.ZIP>>
