According to http://www.w3.org/TR/1998/REC-xml-19980210#charencoding ,
the default encodings for XML entities are UTF-8 and UTF-16, and every
XML processor must be able to read entities written in either of them.
A parser may also support other encodings, but an XML file that uses
one of them must start with an encoding declaration, which is defined
as:
     EncodingDecl ::=  S 'encoding' Eq ('"' EncName '"' |  "'" EncName "'" )

     EncName ::=  [A-Za-z] ([A-Za-z0-9._] | '-')*

     S ::=  (#x20 | #x9 | #xD | #xA)+

     Eq ::=  S? '=' S?
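
As a rough illustration of these productions, here is a minimal Java
sketch that pulls the declared encoding name out of an XML declaration
and asks the JVM whether it knows that charset. The class and method
names (EncodingDeclSketch, declaredEncoding, isValidEncName) are made up
for this example, and Charset.isSupported only reports what the JVM
supports, which is not necessarily what a particular parser or browser
accepts.

```java
import java.nio.charset.Charset;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any parser API.
public class EncodingDeclSketch {
    // S ::= (#x20 | #x9 | #xD | #xA)+
    private static final String S = "[ \\t\\r\\n]+";
    // Eq ::= S? '=' S?
    private static final String EQ = "[ \\t\\r\\n]*=[ \\t\\r\\n]*";
    // EncName ::= [A-Za-z] ([A-Za-z0-9._] | '-')*
    private static final String ENC_NAME = "[A-Za-z](?:[A-Za-z0-9._]|-)*";

    // EncodingDecl ::= S 'encoding' Eq ('"' EncName '"' | "'" EncName "'")
    private static final Pattern ENCODING_DECL = Pattern.compile(
        S + "encoding" + EQ
        + "(?:\"(" + ENC_NAME + ")\"|'(" + ENC_NAME + ")')");

    private static final Pattern ENC_NAME_ONLY = Pattern.compile(ENC_NAME);

    // Returns the declared encoding name, or null if none is found.
    static String declaredEncoding(String xmlDecl) {
        Matcher m = ENCODING_DECL.matcher(xmlDecl);
        if (!m.find()) {
            return null;
        }
        // Group 1 is the double-quoted form, group 2 the single-quoted one.
        return m.group(1) != null ? m.group(1) : m.group(2);
    }

    // True if the whole string matches the EncName production.
    static boolean isValidEncName(String name) {
        return ENC_NAME_ONLY.matcher(name).matches();
    }

    public static void main(String[] args) {
        String decl = "<?xml version=\"1.0\" encoding=\"US-ASCII\"?>";
        String enc = declaredEncoding(decl);
        System.out.println("declared encoding: " + enc);
        // JVM-level support only; a given parser may differ.
        System.out.println("known to this JVM: " + Charset.isSupported(enc));
    }
}
```

A declaration without the encoding pseudo-attribute makes
declaredEncoding return null, in which case the UTF-8/UTF-16 default
described above applies.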


Shaoping Zhou wrote:

> I noticed that IE5 cannot correctly process personal.xml because it
> cannot recognize the encoding scheme "US-ASCII". After I changed
> "US-ASCII" to "UTF-8", it worked under IE5. My sample program had the
> same experience, basically it was able to parse the XML data file
> personal.xml after I changed the encoding to UTF-8. The listing of the
> code is as follows:
>
>   public static void main(String[] args) {
>     ParserSample1 parserSample1 = new ParserSample1();
>     parserSample1.invokedStandalone = true;
>     String xmlFile = "F:\\xercesJ\\xerces-1_0_0\\data\\personal.xml";
>     DOMParser parser = new DOMParser();
>     try {
>       parser.parse(xmlFile);
>     } catch (SAXException se) {
>       se.printStackTrace();
>     } catch (IOException ioe) {
>       ioe.printStackTrace();
>     }
>     // The next line is only for DOM Parsers
>     Document doc = parser.getDocument();
>     Node myNode = null;
>     // work with element
>     Element myElement = doc.getDocumentElement();
>     NodeList myNodeList = myElement.getChildNodes();
>     System.out.println("NodeList length = " + myNodeList.getLength());
>
>     for (int i = 0; i < myNodeList.getLength(); i++)
>     {
>       myNode = myNodeList.item(i);
>       System.out.println(myNode.getNodeName());
>       System.out.println(myNode.getNodeValue());
>     }
>   }
>
> I am fairly new to the XML stuff, could someone point out
> what is going on?
>
> regards,
> -Shaoping Zhou
