andreas writes:
> when constructing an XHTML document with Xerces-J, how to insert
> special characters like "&nbsp;"? The problem is that the "&"
> character is replaced by the "&amp;" sequence, thus "&nbsp;" is
> expanded to "&amp;nbsp;".
>
> There must be a way to insert special characters, but I have not
> been able to figure it out.

Unicode. Java has a special escape sequence that lets the author insert Unicode characters directly into the source file:

http://mindprod.com/jgloss/unicode.html

> import java.io.IOException;
> import org.apache.html.dom.HTMLDocumentImpl;
> import org.apache.html.dom.HTMLHeadingElementImpl;
> import org.apache.html.dom.HTMLParagraphElementImpl;
> import org.apache.xerces.dom.TextImpl;
> import org.apache.xml.serialize.OutputFormat;
> import org.apache.xml.serialize.XHTMLSerializer;
> import org.w3c.dom.html.HTMLElement;
> import org.w3c.dom.html.HTMLHeadingElement;
>
> class XHtmlProblem
> {
>   static public void main( String[] args )
>     throws IOException
>   {
>     HTMLDocumentImpl document = new HTMLDocumentImpl();
>
>     document.setTitle( "XHTML ampersand problem" );
>
>     HTMLElement body = document.getBody();
>
>     HTMLHeadingElement heading = (HTMLHeadingElement) new
> HTMLHeadingElementImpl( document, "h1" );
>
>     body.appendChild( heading );
>
>     heading.appendChild( new TextImpl( document, "XHTML ampersand problem" )
> );
>
>     HTMLElement paragraph = new HTMLParagraphElementImpl( document, "p" );
>
>     body.appendChild( paragraph );
>
>     paragraph.appendChild( new TextImpl( document, "Belongs&nbsp;together." )
> );

&nbsp; is somewhat special in that it is a named character entity (or something like that - please correct me, someone). It is defined in some SGML definition file that essentially says "if you see &nbsp;, replace it internally with SOME_CHARACTER", where SOME_CHARACTER is the numeric value of a character. It's just an easy way to represent an encoded character.

I've used this page in the past to map named character entities to their Unicode equivalents:

http://www.w3.org/TR/MathML2/bycodes.html

And your code should work with:

    paragraph.appendChild( new TextImpl( document, "Belongs\u00A0together." ) );

where &nbsp; is replaced with \u00a0.

>     paragraph = new HTMLParagraphElementImpl( document, "p" );
>
>     body.appendChild( paragraph );
>
>     paragraph.appendChild( new TextImpl( document,
> "Quoting\\&nbsp;doesn't\\\\&help." ) );

    paragraph.appendChild( new TextImpl( document, "Quoting\u00a0doesn't\\\\&help." ) );

>     XHTMLSerializer serializer = new XHTMLSerializer( System.out, new
> OutputFormat() );
>
>     serializer.serialize( document );
>   }
> }

Elizabeth
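
For anyone who wants to see the difference without the Xerces-specific classes, here is a minimal sketch that uses only the JAXP classes bundled with the JDK. This is not the XHTMLSerializer from the code above, and the class name NbspDemo and the helper serializeParagraph are made up for illustration. It builds a single <p> element, once with the literal text "&nbsp;" and once with the \u00A0 escape, and serializes both. The exact form of the no-break space in the second result (raw character or numeric reference) may depend on the serializer and the output encoding.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class, not part of the original post.
class NbspDemo {

    // Builds <p>text</p> as a DOM tree and returns it serialized as a string.
    static String serializeParagraph(String text) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            Element p = doc.createElement("p");
            // The text node holds characters, not markup, so any '&' in it
            // must be escaped by the serializer.
            p.appendChild(doc.createTextNode(text));
            doc.appendChild(p);

            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            t.setOutputProperty(OutputKeys.ENCODING, "UTF-8");

            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Literal entity text: the '&' is escaped, giving "&amp;nbsp;".
        System.out.println(serializeParagraph("Belongs&nbsp;together."));
        // Unicode escape: the text node contains U+00A0 itself.
        System.out.println(serializeParagraph("Belongs\u00A0together."));
    }
}
```

The first call reproduces the problem from the original question; the second puts the no-break space character itself into the text node, so there is no "&" left for the serializer to escape.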
