Re: XPath does not return text nodes after CDATA section

Henry Zongaro Tue, 27 Nov 2007 07:05:34 -0800

Hi, Alexander.

Alexander Stepochkin <[EMAIL PROTECTED]> wrote on 2007-11-27 
03:53:26 AM:
> The following code:
> 
> 
>         Document context = DomHelper.parseString(
>             "<?xml version=\"1.0\"?>"
>             + "\n<root>"
>             + "\n11111111"
>             + "\n<![CDATA[cdata text]]>"
>             + "\n22222222"
>             + "\n</root>"
>         );
>         NodeIterator ni = XPathAPI.selectNodeIterator(
>             context, "/*/node()", context);
>         Node n;
>         while ((n = ni.nextNode()) != null) {
>             System.out.println(n.getNodeValue());
>         }
> 
> 
> shows only:
> 11111111


I haven't looked at this in a debugger, but I believe what is happening is 
a symptom of a mismatch between the XPath data model and the DOM.  The DOM 
Element node named "root" in this case has three adjacent DOM Text nodes 
(the CDATA section is itself a kind of Text node).  However, the XPath 
data model does not permit two text nodes in a tree to be adjacent, so it 
views those three DOM Text nodes as a single XPath text node.
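
To make the DOM side of that mismatch concrete, here is a minimal sketch 
that parses the same document with the standard JAXP parser and lists the 
node types of the children of "root".  The class name DomChildrenDemo and 
its helper methods are made up for illustration, and it assumes a 
non-coalescing parser (the JAXP default) rather than your DomHelper.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of Xalan or the original post.
class DomChildrenDemo {
    // The same document as in the quoted code.
    static final String XML =
        "<?xml version=\"1.0\"?>"
        + "\n<root>"
        + "\n11111111"
        + "\n<![CDATA[cdata text]]>"
        + "\n22222222"
        + "\n</root>";

    // Parse a string into a DOM Document, keeping CDATA sections as
    // separate nodes (coalescing is off by default; set explicitly here).
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setCoalescing(false);
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Collect the DOM node type of each child of the document element.
    static List<Short> childTypes(String xml) {
        List<Short> types = new ArrayList<>();
        Node child = parse(xml).getDocumentElement().getFirstChild();
        while (child != null) {
            types.add(child.getNodeType());
            child = child.getNextSibling();
        }
        return types;
    }

    public static void main(String[] args) {
        // Compare against Node.TEXT_NODE and Node.CDATA_SECTION_NODE.
        System.out.println(childTypes(XML));
    }
}
```

With a non-coalescing parser the DOM keeps the CDATA section as its own 
node between two Text nodes, whereas XPath sees one text node.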

The XPathAPI class returns nodes from the original DOM tree that 
correspond to the nodes selected from the XPath data model instance.  If 
the node to be returned is an XPath text node that represents multiple 
adjacent DOM Text nodes, only the first such DOM Text node will be 
returned.

This same issue comes up in the DOM L3 XPath API, as described in [1]. The 
specification for that API suggests using the DOM L3 Core 
Text.getWholeText() method to retrieve the text associated with several 
adjacent DOM Text nodes.  Of course, you need to be using a DOM 
implementation that supports the DOM L3 Core APIs.
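
As a rough sketch of that suggestion, the following takes the first DOM 
Text node under "root" (standing in for the single node that XPathAPI 
hands back) and calls Text.getWholeText() on it.  The class name 
WholeTextDemo and its helper method are made up for illustration, and the 
sketch assumes the JAXP default parser, which supports DOM L3 Core.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of Xalan or the original post.
class WholeTextDemo {
    // The same document as in the quoted code.
    static final String XML =
        "<?xml version=\"1.0\"?>"
        + "\n<root>"
        + "\n11111111"
        + "\n<![CDATA[cdata text]]>"
        + "\n22222222"
        + "\n</root>";

    // Return the combined text of the first child of the document element
    // and all Text/CDATASection nodes logically adjacent to it, or null
    // if the first child is not a text-type node.
    static String wholeTextOfFirstChild(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            Node first = doc.getDocumentElement().getFirstChild();
            // CDATASection extends Text, so this check covers both kinds.
            if (first instanceof Text) {
                return ((Text) first).getWholeText();
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // getNodeValue() on the first node would give only "11111111"
        // plus surrounding newlines; getWholeText() spans all three nodes.
        System.out.println(wholeTextOfFirstChild(XML));
    }
}
```

In your loop, that would amount to calling getWholeText() instead of 
getNodeValue() whenever the returned node is a Text node.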

Thanks,

Henry
[1] http://www.w3.org/TR/2004/NOTE-DOM-Level-3-XPath-20040226/xpath.html#TextNodes
------------------------------------------------------------------
Henry Zongaro      XSLT Processors Development
IBM SWS Toronto Lab   T/L 313-6044;  Phone +1 905 413-6044
mailto:[EMAIL PROTECTED]
