[ https://issues.apache.org/jira/browse/XALANJ-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12527646 ]
SHEREEF ABDULLA commented on XALANJ-2398:
-----------------------------------------

These classes are now part of the JRE (rt.jar), and I have seen that the Sun implementation is just a copy of the Apache classes.

I observed that the reading is done by XMLEntityScanner.load(), which is reached through the following call path:

XMLDocumentFragmentScannerImpl.scanCDATASection -> XMLEntityScanner.scanData() -> XMLEntityScanner.load()

XMLEntityScanner.scanData() loads the XML in small chunks and calls AbstractSAXParser.characters(), which in turn calls TextImpl.appendData.

> Parsing big XMLs takes a very long time. A JAXRPC webservice request of SOAP size 2MB takes above 5 mins to complete parsing and start processing web service.
> ----------------------------------------------------------------------
>
>                 Key: XALANJ-2398
>                 URL: https://issues.apache.org/jira/browse/XALANJ-2398
>             Project: XalanJ2
>          Issue Type: Bug
>          Components: XSLTC
>         Environment: Operating System: All
> Platform: All
>            Reporter: SHEREEF ABDULLA
>         Attachments: JaxRPCProcessRequest.java, WrkingBigSOAP.txt
>
>
> Parsing big XMLs takes a very long time. A JAXRPC web service request with
> a 2MB SOAP message takes more than 5 minutes to complete parsing and start
> processing the web service.
> I observed that the SOAP messages are read line by line and
> CharacterDataImpl.append() is called for each line. This appends each line
> to the string data that stores the previously read part of the XML. The
> result is many string additions (say 20000+ for a 1MB SOAP message) onto a
> big string, and the thread is blocked most of the time in
> StringBuilder.expandCapacity because of the long string additions.
> A JAX-RPC web service with SOAP messages bigger than 1MB takes 5 minutes
> or more just for the web service implementation to start working, because
> JAX-RPC makes a SOAPMessage.getenvelop call to do the
> HandlerChainImpl.checkMustUnderstand() check on the request message header.
> The same problem happens for the response. For the time being we have
> commented out the checkMustUnderstand method so that the parsing doesn't
> happen at all.
> The string additions for each line could be avoided, either by creating
> the whole data a single time or by using a StringBuffer instead of a String.
> I tried to modify the data field to use StringBuffer instead of String, but
> the underlying CoreDocumentImpl.modifiedCharacterData() and all underlying
> calls take String parameters, so I couldn't go ahead with it.
> Bug XERCESJ-102 looks like the same issue.
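
To illustrate the cost described in the report, here is a minimal sketch. It is not the Xerces or JAX-RPC code: the class ChunkAppendSketch and its methods are hypothetical names made up for this example, and the chunk count and size are assumptions chosen to resemble the "20000+ additions for 1MB" figure quoted above. It only contrasts re-creating a String on every chunk with accumulating into a single StringBuilder.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the Xerces or JAX-RPC implementation.
class ChunkAppendSketch {

    // Mimics the reported pattern: every chunk is concatenated onto the
    // text accumulated so far, so each step copies the whole string again.
    static String appendWithString(List<String> chunks) {
        String data = "";
        for (String chunk : chunks) {
            data = data + chunk; // copy grows with the current length
        }
        return data;
    }

    // Alternative: accumulate in one growable buffer and create the final
    // String once, so the total copying stays roughly linear.
    static String appendWithBuilder(List<String> chunks) {
        StringBuilder data = new StringBuilder();
        for (String chunk : chunks) {
            data.append(chunk);
        }
        return data.toString();
    }

    // Builds `count` identical chunks to stand in for parser callbacks.
    static List<String> makeChunks(int count, int chunkLength) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < chunkLength; i++) {
            line.append('x');
        }
        List<String> chunks = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            chunks.add(line.toString());
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Assumed workload: 20000 chunks of 50 characters (1,000,000 chars).
        List<String> chunks = makeChunks(20000, 50);
        long t0 = System.nanoTime();
        String a = appendWithString(chunks);
        long t1 = System.nanoTime();
        String b = appendWithBuilder(chunks);
        long t2 = System.nanoTime();
        System.out.println("same result: " + a.equals(b));
        System.out.println("String concat ms: " + (t1 - t0) / 1_000_000);
        System.out.println("StringBuilder ms: " + (t2 - t1) / 1_000_000);
    }
}
```

Both methods return the same text; only the amount of copying differs. Actual timings depend on the JVM and heap settings, so the numbers printed by main are only indicative.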