According to http://www.w3.org/TR/1998/REC-xml-19980210#charencoding ,
the default encodings for XML entities are UTF-8 and UTF-16: every XML
processor must be able to read entities written in either of them. A
parser may also support other encodings, but an entity stored in any
encoding other than UTF-8 or UTF-16 must begin with an encoding
declaration, which is defined as:
EncodingDecl ::= S 'encoding' Eq ('"' EncName '"' | "'" EncName "'")
EncName ::= [A-Za-z] ([A-Za-z0-9._] | '-')*
S ::= (#x20 | #x9 | #xD | #xA)+
Eq ::= S? '=' S?
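
To make the productions above concrete, here is a rough sketch that translates them into a Java regular expression and pulls the encoding name out of a declaration. The class and method names (EncodingDeclCheck, declaredEncoding) are made up for illustration, and a real parser does much more than this (byte-order marks, autodetection), so treat it only as a reading aid for the grammar.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EncodingDeclCheck {
    // EncName ::= [A-Za-z] ([A-Za-z0-9._] | '-')*
    private static final String ENC_NAME = "[A-Za-z][A-Za-z0-9._-]*";

    // EncodingDecl ::= S 'encoding' Eq ('"' EncName '"' | "'" EncName "'")
    // with S ::= (#x20 | #x9 | #xD | #xA)+ and Eq ::= S? '=' S?
    private static final Pattern ENCODING_DECL = Pattern.compile(
        "[ \\t\\r\\n]+encoding[ \\t\\r\\n]*=[ \\t\\r\\n]*"
        + "(?:\"(" + ENC_NAME + ")\"|'(" + ENC_NAME + ")')");

    /** Returns the declared encoding name, or null if no EncodingDecl is found. */
    public static String declaredEncoding(String xmlDecl) {
        Matcher m = ENCODING_DECL.matcher(xmlDecl);
        if (!m.find()) {
            return null;
        }
        // Group 1 is the double-quoted form, group 2 the single-quoted form.
        return m.group(1) != null ? m.group(1) : m.group(2);
    }

    public static void main(String[] args) {
        // prints US-ASCII
        System.out.println(declaredEncoding("<?xml version=\"1.0\" encoding=\"US-ASCII\"?>"));
        // prints UTF-8
        System.out.println(declaredEncoding("<?xml version='1.0' encoding = 'UTF-8'?>"));
        // prints null (no encoding declaration present)
        System.out.println(declaredEncoding("<?xml version=\"1.0\"?>"));
    }
}
```

Note that "US-ASCII" is a syntactically valid EncName under this grammar; whether a given parser actually supports that encoding is a separate question.
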
Shaoping Zhou wrote:
> I noticed that IE5 cannot correctly process personal.xml because it
> cannot recognize the encoding scheme "US-ASCII". After I changed
> "US-ASCII" to "UTF-8", it worked under IE5. My sample program had the
> same experience: it was able to parse the XML data file personal.xml
> after I changed the encoding to UTF-8. The listing of the code is as
> follows:
>
>     public static void main(String[] args) {
>         ParserSample1 parserSample1 = new ParserSample1();
>         parserSample1.invokedStandalone = true;
>         String xmlFile = "F:\\xercesJ\\xerces-1_0_0\\data\\personal.xml";
>         DOMParser parser = new DOMParser();
>         try {
>             parser.parse(xmlFile);
>         } catch (SAXException se) {
>             se.printStackTrace();
>         } catch (IOException ioe) {
>             ioe.printStackTrace();
>         }
>         // The next line is only for DOM Parsers
>         Document doc = parser.getDocument();
>         Node myNode = null;
>         // work with element
>         Element myElement = doc.getDocumentElement();
>         NodeList myNodeList = myElement.getChildNodes();
>         System.out.println("NodeList length = " + myNodeList.getLength());
>
>         for (int i = 0; i < myNodeList.getLength(); i++) {
>             myNode = myNodeList.item(i);
>             System.out.println(myNode.getNodeName());
>             System.out.println(myNode.getNodeValue());
>         }
>     }
>
> I am fairly new to the XML stuff, could someone point out what is
> going on?
>
> regards,
> -Shaoping Zhou
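
For comparison with the quoted listing, here is a small self-contained sketch that parses a document declared as "US-ASCII" through the standard JAXP API (javax.xml.parsers) instead of the Xerces 1.0.0 DOMParser class. The class name AsciiEncodingSample and the in-memory document are invented stand-ins for personal.xml, and whether a particular parser build accepts "US-ASCII" still depends on that parser, so this only shows one way to test it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class AsciiEncodingSample {
    // Hypothetical stand-in for personal.xml, declared as US-ASCII.
    public static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"US-ASCII\"?>"
        + "<personnel><person/><person/></personnel>";

    /** Parses the document and returns the node names of the root's children. */
    public static String[] childNames(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Feed raw bytes so the parser has to honor the encoding declaration.
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.US_ASCII)));
            NodeList children = doc.getDocumentElement().getChildNodes();
            String[] names = new String[children.getLength()];
            for (int i = 0; i < children.getLength(); i++) {
                names[i] = children.item(i).getNodeName();
            }
            return names;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // A parser that rejects the declared encoding would end up here.
            throw new IllegalStateException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        for (String name : childNames(SAMPLE)) {
            System.out.println(name);
        }
    }
}
```

If the parser in use does not support the declared encoding, the parse call fails with an exception rather than returning a document. Separately, getNodeValue() returns null for element nodes, so the second println in the quoted loop prints "null" for each child element even when parsing succeeds.
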