The character data of an element is often broken up into multiple text
nodes that are children of that element.  By getting only the first child,
you get only the first text node, which contains 'President'.  I'm willing
to bet that the second node contains '&' and the third contains 'CEO'.

When getting the text of an element, you should traverse all of its
children and concatenate the values of any nodes whose names are #text.
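
Here is a rough sketch of that traversal, not a definitive fix. It assumes
the standard JAXP DocumentBuilder rather than the Xerces DOMParser class so
that it runs on a stock JDK; the getText() loop itself uses only
org.w3c.dom calls and should work the same on a Document obtained from
DOMParser. The names TextCollector, getText and firstElementText are made
up for illustration. Checking the node type for TEXT_NODE is equivalent to
checking for the name #text; including CDATA sections is an extra
assumption on my part.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class TextCollector {

    // Concatenate the values of all direct text children of a node,
    // instead of reading only the first child.
    static String getText(Node parent) {
        StringBuilder sb = new StringBuilder();
        for (Node child = parent.getFirstChild();
             child != null;
             child = child.getNextSibling()) {
            short type = child.getNodeType();
            // TEXT_NODE children are the ones named "#text";
            // CDATA sections are included here as an assumption.
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                sb.append(child.getNodeValue());
            }
        }
        return sb.toString();
    }

    // Hypothetical helper: parse an XML string and return the text of the
    // first element with the given tag name.
    static String firstElementText(String xml, String tagName) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document document =
                builder.parse(new InputSource(new StringReader(xml)));
            return getText(document.getElementsByTagName(tagName).item(0));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><attr>President &amp; CEO</attr></root>";
        // Prints "President & CEO" however the parser splits the text.
        System.out.println(firstElementText(xml, "attr"));
    }
}
```

In the original snippet, that would mean replacing the getFirstChild() /
getNodeValue() pair with something like getText(nl.item(0)).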

-Joe Polastre  ([EMAIL PROTECTED])
IBM Cupertino, XML Technology Group


----- Original Message -----
From: "Nathan Wang" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Tuesday, August 01, 2000 7:46 PM
Subject: A problem with Xerces DOMParser


> Hi,
>
> I just want to report a problem and would like to know
> if there's a fix.
>
> I did something like the following:
>     DOMParser domParser = new DOMParser();
>     domParser.parse(strUrl);
>     Document document = domParser.getDocument();
>
>     NodeList nl = document.getElementsByTagName("attr");
>     Node node;
>     node = nl.item(0).getFirstChild();
>     strTitle = node.getNodeValue();
>
> The original/real value for <attr> was
>     President & CEO
> but, I got only the "President" part back in strTitle.
> The '&' was correctly encoded as "&amp;".
>
> I could view the XML correctly with IE.
>
> I really appreciate if you could give me some input.
>
> Thanks,
> Nathan
> ----------------------------------------------
> Nathan Q. Wang           ONI Systems, Inc.
> ----------------------------------------------
>
