Poor performance transforming from SAX to DOM with large text content
---------------------------------------------------------------------

                 Key: XALANJ-2530
                 URL: https://issues.apache.org/jira/browse/XALANJ-2530
             Project: XalanJ2
          Issue Type: Improvement
      Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects.  Anybody can view the issue.)
          Components: JAXP
    Affects Versions: 2.7.1
         Environment: java version "1.6.0_23"
Java(TM) SE Runtime Environment (build 1.6.0_23-b05)
Java HotSpot(TM) 64-Bit Server VM (build 19.0-b09, mixed mode)

Linux 2.6.34.7-66.fc13.x86_64 #1 SMP Wed Dec 15 07:04:30 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux

            Reporter: Steve Jones


Xalan performs poorly when transforming a SAX source to a DOM result when the 
input contains large amounts of contiguous text.

In the following test, Xalan takes 45 seconds to process a test document of 
roughly 5 million characters, whereas the "Sun" JDK transformer takes under 
half a second.

        // Generate XML with large text content
        final int bufferSize = 1024*1024*5;
        final StringBuilder stringBuilder = new StringBuilder(bufferSize);
        stringBuilder.append( "<test-document>" );
        for ( int i=0; i< 1000000; i++ ) {  stringBuilder.append( "text " ); }
        stringBuilder.append( "</test-document>" );
        final String testDocument = stringBuilder.toString();
        System.out.println( "Test document size : " + testDocument.length() +
                "/" + bufferSize );

        // Process it
        // Uncomment to use the "Sun" JDK transformer instead of Xalan:
        //System.setProperty( "javax.xml.transform.TransformerFactory",
        //        "com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl" );
        final javax.xml.transform.Transformer transformer =
                javax.xml.transform.TransformerFactory.newInstance().newTransformer( );
        final javax.xml.transform.sax.SAXSource source =
                new javax.xml.transform.sax.SAXSource(
                        new org.xml.sax.InputSource(
                                new java.io.StringReader( testDocument ) ) );
        final javax.xml.transform.dom.DOMResult result =
                new javax.xml.transform.dom.DOMResult();

        // Time the identity transform from the SAX source to the DOM result
        final long startTime = System.currentTimeMillis();
        transformer.transform( source, result );
        System.out.println( ( System.currentTimeMillis() - startTime ) + "ms" );

It could be argued that this is a DOM implementation issue (due to the poor 
performance of CharacterData.appendData), but it seems easy to fix within Xalan.

The "Sun" JDK solution to this issue can be seen in the class:

  com.sun.org.apache.xalan.internal.xsltc.trax.SAX2DOM 

which uses a StringBuilder to buffer the character data.
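
The buffering idea can be sketched as below. This is only an illustration under 
stated assumptions, not the actual SAX2DOM or Xalan code: the class name 
BufferingDomBuilder and its parse helper are hypothetical, and namespaces, 
comments and processing instructions are ignored. Each characters() callback 
only appends to a StringBuilder; a single text node is created when the next 
element boundary is reached, so the DOM never sees one append per SAX chunk.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical SAX-to-DOM handler that buffers contiguous character data. */
class BufferingDomBuilder extends DefaultHandler {
    private final Document document;
    private final StringBuilder textBuffer = new StringBuilder();
    private Node current; // node that receives the next child

    BufferingDomBuilder(Document document) {
        this.document = document;
        this.current = document;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Buffer only; no DOM call is made per SAX chunk.
        textBuffer.append(ch, start, length);
    }

    /** Turns the buffered characters into one text node, if there are any. */
    private void flushText() {
        if (textBuffer.length() > 0) {
            current.appendChild(document.createTextNode(textBuffer.toString()));
            textBuffer.setLength(0);
        }
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        flushText();
        Element element = document.createElement(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            element.setAttribute(atts.getQName(i), atts.getValue(i));
        }
        current.appendChild(element);
        current = element;
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        flushText();
        current = current.getParentNode();
    }

    /** Hypothetical helper: parses an XML string into a new DOM Document. */
    static Document parse(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), new BufferingDomBuilder(doc));
        return doc;
    }
}
```

With this approach the cost of building a large text node no longer depends on 
how CharacterData.appendData is implemented by the DOM in use.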

