Here is my actual code (below). Notice that the text node is set to "hello " followed by ^L (form feed, 0x0C).
The DOM creates the following document...
<?xml version="1.0" encoding="UTF-8"?>
<mytag>hello &#12;</mytag>

Then, when it goes to parse that document back in, I get the SAXParseException shown below.

It seems unintuitive to me that the DOM would go to the effort of
transforming my character if it can't read it back. Shouldn't any document
the parser creates be parsable?


org.xml.sax.SAXParseException: Illegal XML character &#xc;
        at org.apache.crimson.parser.Parser2.fatal(Parser2.java:3339)
        at org.apache.crimson.parser.Parser2.fatal(Parser2.java:3333)
        at org.apache.crimson.parser.Parser2.surrogatesToCharTmp(Parser2.java:2640)
        at org.apache.crimson.parser.Parser2.maybeReferenceInContent(Parser2.java:2574)
        at org.apache.crimson.parser.Parser2.content(Parser2.java:1980)
        at org.apache.crimson.parser.Parser2.maybeElement(Parser2.java:1654)
        at org.apache.crimson.parser.Parser2.parseInternal(Parser2.java:634)
        at org.apache.crimson.parser.Parser2.parse(Parser2.java:333)
        at org.apache.crimson.parser.XMLReaderImpl.parse(XMLReaderImpl.java:448)
        at org.apache.crimson.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:185)
        at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:76)
        at XMLStuff.<init>(XMLStuff.java:45)
        at XMLStuff.main(XMLStuff.java:117)
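
For what it's worth, the exception is consistent with the XML 1.0 "Char" production, which allows only #x9, #xA, #xD and the ranges #x20-#xD7FF, #xE000-#xFFFD and #x10000-#x10FFFF. Form feed (#xC) is outside that set. Below is a minimal sketch of a check that could be run on text before it goes into createTextNode. The class and method names (XmlChars, isLegalXml10, stripIllegal) are made up for illustration and are not part of any JAXP or DOM API, and dropping the offending characters is only one possible policy.

```java
class XmlChars {
    /** True if the code point matches the XML 1.0 "Char" production. */
    static boolean isLegalXml10(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns a copy of s with every code point outside the Char production dropped. */
    static String stripIllegal(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (isLegalXml10(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp); // step over surrogate pairs as one unit
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String raw = "hello \f"; // same content as the text node in XMLStuff
        System.out.println("form feed legal? " + isLegalXml10('\f'));
        System.out.println("cleaned length: " + stripIllegal(raw).length());
    }
}
```

Calling doc.createTextNode(XmlChars.stripIllegal(...)) would keep the character out of the serialized document, at the cost of silently losing it. Rejecting the input or escaping it in some application-specific way are alternatives.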




import org.w3c.dom.Document;
import org.w3c.dom.Text;
import org.w3c.dom.Element;
import org.xml.sax.SAXException;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.FactoryConfigurationError;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.*;
import java.io.*;

/**
 * Created by IntelliJ IDEA.
 * User: rmg1
 * Date: Nov 18, 2004
 * Time: 3:12:27 PM
 * To change this template use Options | File Templates.
 */
public class XMLStuff
{
    public XMLStuff()
    {
        try
        {
            DocumentBuilder docbuilder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = docbuilder.newDocument();

            Element elm = doc.createElement("mytag");
            // Build a one-character string holding a form feed (0x0C).
            byte[] funnybyte = new byte[1];
            funnybyte[0] = 0x0c;
            String funnystring = new String(funnybyte, "ASCII");
            Text text = doc.createTextNode("hello " + funnystring);
            elm.appendChild(text);
            doc.appendChild(elm);
            System.out.println("Here is a doc ");
            printXML(doc, System.out);

            // Serialize to a byte array, then try to parse the result back in.
            ByteArrayOutputStream ostream = new ByteArrayOutputStream();
            printXML(doc, ostream);
            DocumentBuilder docbuilder2 =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc2 = docbuilder2.parse(
                    new ByteArrayInputStream(ostream.toByteArray()));
            printXML(doc2, System.out);
        } catch (ParserConfigurationException e)
        {
            e.printStackTrace();
        } catch (UnsupportedEncodingException e)
        {
            e.printStackTrace();
        } catch (SAXException e)
        {
            e.printStackTrace();
        } catch (IOException e)
        {
            e.printStackTrace();
        }

    }

    public void printXML ( Document doc, OutputStream stream)
    {
        Transformer tf;
        try
        {
            tf = TransformerFactory.newInstance().newTransformer();
        } catch (TransformerConfigurationException e)
        {
            e.printStackTrace();
            return; // no transformer available, nothing to print
        } catch (TransformerFactoryConfigurationError e)
        {
            e.printStackTrace();
            return;
        }
        tf.setOutputProperty(OutputKeys.INDENT, "yes");
        // Xalan-specific property; other transformers may ignore it.
        tf.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
        try
        {
            tf.transform(new DOMSource(doc), new StreamResult(stream));
        } catch (TransformerException e)
        {
            e.printStackTrace();
        }

    }



    public static void main(String[] argv)
    {
        XMLStuff stt = new XMLStuff();
    }
}

-----Original Message-----
From: Joseph Kesselman [mailto:[EMAIL PROTECTED] 
Sent: Monday, November 22, 2004 9:55 AM
To: [EMAIL PROTECTED]
Subject: Re: Java cannot parse what the DOM creates??





I presume you mean &#12; -- the trailing semicolon is important.

See the XML Specification's description of "numeric character
references". This is absolutely standard XML. Any parser *should* accept
it unless it appears in a place where that character is not legal (in the
middle of an element name, for example).
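
To make the point about numeric character references concrete, here is a small sketch that feeds two one-element documents to the JAXP parser: one with a reference to a legal character (&#65;, the letter A) and one with the &#12; reference from this thread. The class name CharRefDemo and the helper rootText are invented for this illustration, and the outcome for &#12; assumes an XML 1.0 parser that enforces the character rules, as the Crimson stack trace above suggests.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class CharRefDemo {
    /** Returns the root element's text, or null if the parser rejects the document. */
    static String rootText(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors and keeps them off stderr.
            builder.setErrorHandler(new DefaultHandler());
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getTextContent();
        } catch (SAXException e) {
            return null; // not well-formed as far as this parser is concerned
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rootText("<mytag>hello &#65;</mytag>"));
        System.out.println(rootText("<mytag>hello &#12;</mytag>"));
    }
}
```

The reference syntax itself is fine in both documents. What differs is whether the character being referenced is one that XML 1.0 permits in content.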

______________________________________
Joe Kesselman, IBM Next-Generation Web Technologies: XML, XSL and more.
"The world changed profoundly and unpredictably the day Tim Berners Lee
got bitten by a radioactive spider." -- Rafe Culpin, in r.m.filk


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

