There appears to be a severe memory usage issue when using the EXSLT functions str:tokenize or str:split as included in the latest version of Xalan (2.7.0, and also 2.6.1). Even for relatively small XML files, repeated use of these functions leads to OutOfMemory exceptions, even with large (2GB) maximum heap sizes. The code for these extension functions looks quite straightforward, so I suspect the problem lies deeper.
I created a simple testcase to demonstrate the problem, please see the details below. In this particular example, I hit another exception (No more DTM IDs are available) before I reach OutOfMemory, but I imagine that if I tweak my example appropriately, I would hit OutOfMemory instead.

If I use the stylesheet version of the EXSLT function, I don't encounter this memory issue.

I searched the mailing list archives and bug databases, but didn't find any references to this issue.

Thanks,
Mike

test.xml (vary the number of 'row' elements)
========
<?xml version="1.0" encoding="ISO-8859-1"?>
<document>
<row>1.2.3.4.5.6.7.8.9.0</row>
</document>
====================================

test.xsl
========
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:str="http://exslt.org/strings">
<xsl:template match="row">
<xsl:variable name="test" select="str:tokenize(.,'.')"/>
</xsl:template>
</xsl:stylesheet>
====================================

Command-line:
> java -cp "C:\tmp\xalan-j_2_7_0\xalan.jar;C:\tmp\xalan-j_2_7_0\serializer.jar;C:\tmp\xalan-j_2_7_0\xml-apis.jar;C:\tmp\xalan-j_2_7_0\xercesImpl.jar" -Xmx1024m org.apache.xalan.xslt.Process -IN test.xml -XSL test.xsl -OUT test_out.xml

Test 1: test.xml with 1,000 rows; file size: 32KB
Result: Transformation completes successfully; maximum process size: ~163MB

Test 2: test.xml with 10,000 rows; file size: 312KB
Result: Transformation aborts with exception; maximum process size: ~900MB

file:///c:/tmp/test.xsl; Line #12; Column #61; XSLT Error (javax.xml.transform.TransformerException): No more DTM IDs are available
Exception in thread "main" java.lang.RuntimeException: No more DTM IDs are available
        at org.apache.xalan.xslt.Process.doExit(Process.java:1153)
        at org.apache.xalan.xslt.Process.main(Process.java:1126)

> java -version
java version "1.5.0_04"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_04-b05)
Java HotSpot(TM) Client VM (build 1.5.0_04-b05, mixed mode, sharing)
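
In case it helps anyone reproduce this with different row counts, here is a minimal sketch of a generator for test.xml. The class name GenerateTestXml, its build method, and the command-line arguments are hypothetical and were not part of the original test case; it only assumes that every 'row' element carries the same content as in the sample above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper (not part of the original report): writes a test.xml
// containing the requested number of identical 'row' elements.
public class GenerateTestXml {

    // Builds the document text in memory; 'rows' is the number of <row> elements.
    static String build(int rows) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n");
        sb.append("<document>\n");
        for (int i = 0; i < rows; i++) {
            sb.append("<row>1.2.3.4.5.6.7.8.9.0</row>\n");
        }
        sb.append("</document>\n");
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Assumed defaults: 1000 rows, written to test.xml in the current directory.
        int rows = args.length > 0 ? Integer.parseInt(args[0]) : 1000;
        String out = args.length > 1 ? args[1] : "test.xml";
        Files.write(Paths.get(out), build(rows).getBytes(StandardCharsets.ISO_8859_1));
    }
}
```

For example, "java GenerateTestXml 10000 test.xml" should produce an input comparable to the one used in Test 2, although the exact file size depends on the line endings.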
