DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT <http://nagoya.apache.org/bugzilla/show_bug.cgi?id=7368>. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE.
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=7368

Text Nodes are split randomly

           Summary: Text Nodes are split randomly
           Product: XalanJ2
           Version: 2.3Dx
          Platform: PC
        OS/Version: Windows NT/2K
            Status: NEW
          Severity: Normal
          Priority: Other
         Component: Xalan
        AssignedTo: [EMAIL PROTECTED]
        ReportedBy: [EMAIL PROTECTED]

We use the Xerces 2.0.1 parser and Xalan 2.3.1 transformer. We create a DOM result like:

    TransformerFactory tFactory = TransformerFactory.newInstance();
    Transformer transformer = tFactory.newTransformer(new StreamSource(definitionSource));
    DOMResult domResult = new DOMResult();
    transformer.transform(new StreamSource(extractSource), domResult);
    Document document = (Document) domResult.getNode();
    extractRootNode = document.getDocumentElement();

Then we operate on the extractRootNode.

The problem is as follows. Sometimes, and this seems to occur quite randomly, an element only having text content will have TWO, not one, Text Node children. The DOM 2 Core specification says:

    If there is no markup inside an element's content, the text is
    contained in a single object implementing the Text interface that
    is the only child of the element.

So the behaviour of either Xerces or Xalan is not according to the spec.

The problem is totally reproducible, but only occurs with certain XML files. In addition, on different machines the splitting occurs in different parts of the same XML file!

There is a workaround -- merge the Text Node contents when operating on the tree -- but this is not convenient; others are virtually bound to experience this bug also.

The problem is new to 2.x, apparently; we didn't have it in 1.x.

For more information, please contact me at my e-mail address.
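
For anyone who needs the workaround in the meantime, here is a minimal sketch of the merge step, not a fix for the underlying bug. It uses only the standard DOM calls Node.normalize() and Node.getNodeValue(). The class name TextNodeMerge, the helpers directText and buildSplitElement, and the "title" element are made up for illustration and are not part of Xalan or Xerces. Because the split cannot be triggered on demand, the example simulates it by appending two adjacent Text nodes by hand.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

final class TextNodeMerge {

    /** Concatenates the data of all direct Text/CDATA children; the tree is not modified. */
    static String directText(Element element) {
        StringBuilder sb = new StringBuilder();
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            short type = child.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                sb.append(child.getNodeValue());
            }
        }
        return sb.toString();
    }

    /** Hypothetical test fixture: an element with two adjacent Text children, simulating the split. */
    static Element buildSplitElement() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element title = doc.createElement("title");
            doc.appendChild(title);
            title.appendChild(doc.createTextNode("Hello, "));
            title.appendChild(doc.createTextNode("world"));
            return title;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element title = buildSplitElement();
        System.out.println(title.getChildNodes().getLength()); // two Text children before merging
        System.out.println(directText(title));                 // read-only merge of the contents

        // Standard DOM Level 2 call: merges adjacent Text nodes in the whole subtree.
        title.normalize();
        System.out.println(title.getChildNodes().getLength()); // one Text child after merging
    }
}
```

In the code from the report, the equivalent would presumably be a single extractRootNode.normalize() call right after the transform, assuming the tree may be modified in place; directText is the alternative when it should be left untouched.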
