[ https://issues.apache.org/jira/browse/XALANJ-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12529636 ]
Michael Glavassevich commented on XALANJ-2398:
----------------------------------------------

Brian, the commit I referenced is the one in which SAX2DOM started using
CharacterData.appendData() to accumulate text. The current version in Apache
calls that method and, as I pointed out to Shereef, using
CharacterData.appendData() to accumulate text can be very inefficient. A
StringBuffer will do the job much better (and in fact that's what we use in
the DOM parser in Xerces instead of this method).

> parsing big XMLs take very long time. A JAXRPC webservice request of soap
> size 2MB takes above 5 mins to complete parsing and start processing web
> service.
> -------------------------------------------------------------------------
>
>                 Key: XALANJ-2398
>                 URL: https://issues.apache.org/jira/browse/XALANJ-2398
>             Project: XalanJ2
>          Issue Type: Bug
>          Components: transformation
>         Environment: Operating System: All
>                      Platform: All
>            Reporter: SHEREEF ABDULLA
>         Attachments: JaxRPCProcessRequest.java, WrkingBigSOAP.txt
>
>
> Parsing big XMLs takes a very long time. A JAXRPC web service request with a
> SOAP size of 2MB takes more than 5 minutes to complete parsing and start
> processing the web service.
> I observed that the SOAP messages are read line by line and that
> CharacterDataImpl.append() is called for each line. Each call appends the
> line to the string data that holds the previously read part of the XML. This
> results in many concatenations onto a large string (say 20000+ for a 1MB
> SOAP message), and the thread spends most of its time in
> StringBuilder.expandcapacity because of these long string additions.
> A JAX RPC web service with SOAP messages bigger than 1MB takes 5 minutes or
> more just for the web service implementation to start working, because
> jaxrpc makes a SOAPMessage.getenvelop call so that
> HandlerChainImpl.checkMustUnderstand() can check the request message header.
> The same problem happens for the response
> also.
> For the time being, we commented out the checkMustUnderstand method so that
> the parsing doesn't happen at all.
> The string addition for each line could have been avoided, either by
> creating the whole data in one pass or by using a StringBuffer instead of a
> String.
> I tried to modify the data field to use a StringBuffer instead of a String,
> but the underlying CoreDocumentImpl.modifiedCharacterData() and all the
> calls beneath it take String parameters, so I couldn't go ahead with it.
> Bug XERCESJ-102 looks like the same issue.
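
For illustration, the two ways of accumulating text discussed in this thread
can be sketched roughly as follows. This is not the SAX2DOM or Xerces code;
the class and method names (TextAccumulation, accumulateWithAppendData,
accumulateWithBuffer) are made up for the example, and it uses StringBuilder,
the unsynchronized counterpart of StringBuffer. It assumes a DOM
implementation that keeps a text node's data as a String, in which case every
appendData() call may copy everything accumulated so far.

```java
import java.util.Arrays;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Text;

// Hypothetical demo class; not part of Xalan or Xerces.
class TextAccumulation {

    // Creates an empty DOM document with the JDK's default DOM implementation.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
    }

    // Pattern described in the issue: grow the text node one chunk at a time.
    // If the node stores its data as a String, each call may recopy all the
    // text collected so far, so total work can grow quadratically.
    static String accumulateWithAppendData(String[] chunks) {
        Text text = newDocument().createTextNode("");
        for (String chunk : chunks) {
            text.appendData(chunk);
        }
        return text.getData();
    }

    // Alternative suggested in the comment: collect chunks in a growable
    // buffer and hand the finished string to the DOM once.
    static String accumulateWithBuffer(String[] chunks) {
        StringBuilder buffer = new StringBuilder();
        for (String chunk : chunks) {
            buffer.append(chunk);
        }
        Text text = newDocument().createTextNode(buffer.toString());
        return text.getData();
    }

    public static void main(String[] args) {
        // Assumed workload: 5000 short lines, loosely modelled on a SOAP body
        // that is delivered line by line.
        String[] chunks = new String[5000];
        Arrays.fill(chunks, "<item>one line of a large SOAP payload</item>\n");

        long t0 = System.nanoTime();
        String a = accumulateWithAppendData(chunks);
        long t1 = System.nanoTime();
        String b = accumulateWithBuffer(chunks);
        long t2 = System.nanoTime();

        System.out.println("same result:  " + a.equals(b));
        System.out.println("appendData:   " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("buffer:       " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

Both methods produce the same text; only the amount of copying differs. The
timings printed by main() depend on the JVM and DOM implementation in use, so
they should be read as a rough indication only.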