Poor performance transforming from SAX to DOM with large text content
---------------------------------------------------------------------

                 Key: XALANJ-2530
                 URL: https://issues.apache.org/jira/browse/XALANJ-2530
             Project: XalanJ2
          Issue Type: Improvement
      Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.)
          Components: JAXP
    Affects Versions: 2.7.1
         Environment: java version "1.6.0_23"
                      Java(TM) SE Runtime Environment (build 1.6.0_23-b05)
                      Java HotSpot(TM) 64-Bit Server VM (build 19.0-b09, mixed mode)
                      Linux 2.6.34.7-66.fc13.x86_64 #1 SMP Wed Dec 15 07:04:30 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux
            Reporter: Steve Jones

Xalan performs poorly when transforming a SAX source to a DOM result when the input contains large amounts of contiguous text.

The following test shows that Xalan takes 45 seconds to process the test document, but the "Sun" JDK transformer takes under half a second.

    // Generate XML with large text content
    final int bufferSize = 1024*1024*5;
    final StringBuilder stringBuilder = new StringBuilder(bufferSize);
    stringBuilder.append( "<test-document>" );
    for ( int i=0; i< 1000000; i++ ) {
        stringBuilder.append( "text " );
    }
    stringBuilder.append( "</test-document>" );
    final String testDocument = stringBuilder.toString();
    System.out.println( "Test document size : " + testDocument.length() + "/" + bufferSize );

    // Process it
    //System.setProperty( "javax.xml.transform.TransformerFactory", "com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl" );
    final javax.xml.transform.Transformer transformer =
        javax.xml.transform.TransformerFactory.newInstance().newTransformer( );
    final javax.xml.transform.sax.SAXSource source =
        new javax.xml.transform.sax.SAXSource( new org.xml.sax.InputSource( new java.io.StringReader( testDocument ) ) );
    final javax.xml.transform.dom.DOMResult result = new javax.xml.transform.dom.DOMResult();
    final long startTime = System.currentTimeMillis();
    transformer.transform( source, result );
    System.out.println( ( System.currentTimeMillis() - startTime ) + "ms" );

It could be argued that this is a DOM
implementation issue (due to the poor performance of CharacterData.appendData), but it seems easy to fix within Xalan.

The "Sun" JDK solution to this issue can be seen in the class: com.sun.org.apache.xalan.internal.xsltc.trax.SAX2DOM which uses a StringBuilder to buffer the character data.
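
The buffering idea can be illustrated with a minimal sketch. This is not the code of the JDK's SAX2DOM class and not a proposed Xalan patch; the class name BufferingDomBuilder and its parse helper are hypothetical, and the handler assumes a non-namespace-aware SAX parser to stay short. It collects every characters() chunk in a StringBuilder and creates a single DOM Text node only when the next structural event arrives, so no per-chunk CharacterData.appendData call is needed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical SAX-to-DOM builder that buffers text; an illustration only. */
class BufferingDomBuilder extends DefaultHandler {
    private final Document document;
    private Node current;                       // node that receives new children
    private final StringBuilder pendingText = new StringBuilder();

    BufferingDomBuilder(Document document) {
        this.document = document;
        this.current = document;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Only buffer here; the parser may deliver one text run in many chunks.
        pendingText.append(ch, start, length);
    }

    /** Turns the buffered characters into one Text node, if there are any. */
    private void flushText() {
        if (pendingText.length() > 0) {
            current.appendChild(document.createTextNode(pendingText.toString()));
            pendingText.setLength(0);
        }
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        flushText();
        // Assumption: non-namespace-aware parsing, so qName is the full element name.
        Element element = document.createElement(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            element.setAttribute(atts.getQName(i), atts.getValue(i));
        }
        current.appendChild(element);
        current = element;
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        flushText();
        current = current.getParentNode();
    }

    @Override
    public void processingInstruction(String target, String data) {
        flushText();
        current.appendChild(document.createProcessingInstruction(target, data));
    }

    /** Parses the XML string into a DOM Document using the buffering handler. */
    static Document parse(String xml) {
        try {
            Document document =
                DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(new BufferingDomBuilder(document));
            reader.parse(new InputSource(new StringReader(xml)));
            return document;
        } catch (Exception e) {
            throw new IllegalStateException("could not build DOM", e);
        }
    }

    public static void main(String[] args) {
        // Same shape of input as the test document in this report.
        StringBuilder xml = new StringBuilder("<test-document>");
        for (int i = 0; i < 1000000; i++) {
            xml.append("text ");
        }
        xml.append("</test-document>");
        long startTime = System.currentTimeMillis();
        Document document = parse(xml.toString());
        int children = document.getDocumentElement().getChildNodes().getLength();
        System.out.println(children + " child node(s), "
            + (System.currentTimeMillis() - startTime) + "ms");
    }
}
```

The cost of appending to the StringBuilder is amortized over the whole text run, whereas repeatedly appending to an existing Text node depends on how the DOM implementation stores character data. Any handler callback that creates a node needs to flush the buffer first, which is why flushText() is called from each of them.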