andreas writes:

> when constructing an XHTML document with Xerces-J, how to insert
> special characters like "&nbsp;"? The problem is that the "&"
> character is replaced by the "&amp;" sequence, thus "&nbsp;" is
> expanded to "&amp;nbsp;".
> 
> There must be a way to insert special characters, but I have not
> been able to figure it out.

Use Unicode. Java has an escape sequence (\u followed by four hex
digits) that lets you put any Unicode character directly into the
source file.

   http://mindprod.com/jgloss/unicode.html
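
To see what the escape actually produces, here is a minimal sketch. The
class name UnicodeEscapeDemo and the constant TEXT are made up for
illustration; only standard JDK calls are used.

```java
public class UnicodeEscapeDemo {
    // The compiler turns the backslash-u escape into one char (U+00A0,
    // NO-BREAK SPACE) before the string ever reaches the DOM.
    public static final String TEXT = "Belongs\u00A0together.";

    public static void main(String[] args) {
        // "Belongs" (7) + one no-break space + "together." (9)
        System.out.println(TEXT.length());                         // 17
        // The character at index 7 is the no-break space itself.
        System.out.println(Integer.toHexString(TEXT.charAt(7)));   // a0
        // It counts as a Unicode space character ...
        System.out.println(Character.isSpaceChar(TEXT.charAt(7)));  // true
        // ... but Java deliberately excludes it from isWhitespace().
        System.out.println(Character.isWhitespace(TEXT.charAt(7))); // false
    }
}
```

So the string holds a real character, not markup, and there is no "&"
left for the serializer to escape.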

> import java.io.IOException;
> import org.apache.html.dom.HTMLDocumentImpl;
> import org.apache.html.dom.HTMLHeadingElementImpl;
> import org.apache.html.dom.HTMLParagraphElementImpl;
> import org.apache.xerces.dom.TextImpl;
> import org.apache.xml.serialize.OutputFormat;
> import org.apache.xml.serialize.XHTMLSerializer;
> import org.w3c.dom.html.HTMLElement;
> import org.w3c.dom.html.HTMLHeadingElement;
> 
> class XHtmlProblem
> {
>   static public void main( String[] args )
>     throws IOException
>   {
>     HTMLDocumentImpl document = new HTMLDocumentImpl();
> 
>     document.setTitle( "XHTML ampersand problem" );
> 
>     HTMLElement body = document.getBody();
> 
>     HTMLHeadingElement heading = (HTMLHeadingElement) new 
> HTMLHeadingElementImpl( document, "h1" );
> 
>     body.appendChild( heading );
> 
>     heading.appendChild( new TextImpl( document, "XHTML ampersand problem" ) 
> );
> 
>     HTMLElement paragraph = new HTMLParagraphElementImpl( document, "p" );
> 
>     body.appendChild( paragraph );
> 
>     paragraph.appendChild( new TextImpl( document, "Belongs&nbsp;together." ) 
> );

&nbsp; is somewhat special in that it is a named character entity (or
something like that - please correct me someone). It is defined in a
DTD entity file that essentially says "if you see &nbsp;, replace it
internally with SOME_CHARACTER", where SOME_CHARACTER is the numeric
value of a character. It's just an easy way to write an encoded
character. A DOM text node holds plain characters rather than markup,
which is why the serializer escapes the "&" you put in.

I've used this page in the past to map named character entities to
their Unicode equivalents:

          http://www.w3.org/TR/MathML2/bycodes.html
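
If you have a lot of existing strings with entity names in them, the
mapping can be done in code before the text goes into the DOM. This is
only a rough sketch: EntityExpander and expand() are hypothetical names,
and the table holds just three entries (nbsp, copy, eacute) as examples,
not the full entity set.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EntityExpander {
    // Tiny illustrative table: entity name -> Unicode character.
    private static final Map<String, Character> ENTITIES = new HashMap<>();
    static {
        ENTITIES.put("nbsp", '\u00A0');   // NO-BREAK SPACE
        ENTITIES.put("copy", '\u00A9');   // COPYRIGHT SIGN
        ENTITIES.put("eacute", '\u00E9'); // LATIN SMALL LETTER E WITH ACUTE
    }

    // Matches references of the form &name;
    private static final Pattern REFERENCE =
        Pattern.compile("&([A-Za-z][A-Za-z0-9]*);");

    // Replaces known named entities with their characters and leaves
    // unknown names untouched.
    public static String expand(String text) {
        Matcher matcher = REFERENCE.matcher(text);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            Character c = ENTITIES.get(matcher.group(1));
            String replacement = (c != null) ? String.valueOf(c) : matcher.group();
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String expanded = expand("Belongs&nbsp;together.");
        System.out.println(expanded.equals("Belongs\u00A0together."));
    }
}
```

The result of expand() can then be handed to the text node constructor
in place of the original string.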

And your code should work with:

  paragraph.appendChild( new TextImpl( document, "Belongs\u00A0together." ) );

where &nbsp; is replaced with \u00a0.

>     paragraph = new HTMLParagraphElementImpl( document, "p" );
> 
>     body.appendChild( paragraph );
> 
>     paragraph.appendChild( new TextImpl( document, 
> "Quoting\\&nbsp;doesn't\\\\&help." ) );

  paragraph.appendChild( new TextImpl( document, 
"Quoting\u00a0doesn't\\\\&help." ) );

>     XHTMLSerializer serializer = new XHTMLSerializer( System.out, new 
> OutputFormat() );
> 
>     serializer.serialize( document );
>   }
> }
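
If you want to check the behaviour without Xerces on the classpath, the
same thing can be seen with the DOM and serializer that ship with the
JDK. Note this is a stand-in under that assumption, using the JAXP
classes rather than HTMLDocumentImpl and XHTMLSerializer, and the class
and method names (TextNodeRoundTrip, serialize, parseText) are invented
for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class TextNodeRoundTrip {
    // Builds <p>text</p> with the JDK DOM and serializes it to a string.
    public static String serialize(String text) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document document = builder.newDocument();
            Element paragraph = document.createElement("p");
            paragraph.appendChild(document.createTextNode(text));
            document.appendChild(paragraph);

            Transformer transformer =
                TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(document), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Parses the markup again and returns the text inside the root element.
    public static String parseText(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document document =
                builder.parse(new InputSource(new StringReader(xml)));
            return document.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String text = "Belongs\u00A0together & apart.";
        String xml = serialize(text);
        System.out.println(xml);
        // The no-break space survives the round trip as one character.
        System.out.println(parseText(xml).equals(text));
    }
}
```

A literal "&" in the text is still written out as "&amp;", which is
correct, while the no-break space goes in as a character and comes back
as the same character.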

Elizabeth
