http://nagoya.apache.org/bugzilla/show_bug.cgi?id=12105

UTF Encoding is not preserved

------- Additional Comments From [EMAIL PROTECTED]  2003-07-25 13:20 -------
Hi Denis.  The correctly encoded value for é in an HTML URI attribute value 
really is %C3%A9.

According to Section 16.2 of XSLT 1.0 [1]:

  The html output method should escape non-ASCII characters in URI attribute
  values using the method recommended in Section B.2.1 of the HTML 4.0
  Recommendation.

And according to Section B.2.1 of HTML 4.0 [2]:

  We recommend that user agents adopt the following convention for handling
  non-ASCII characters in such cases:

  Represent each character in UTF-8 (see [RFC2279]) as one or more bytes.

  Escape these bytes with the URI escaping mechanism (i.e., by converting each
  byte to %HH, where HH is the hexadecimal notation of the byte value). 

The UTF-8 encoding of é is the two-byte sequence 0xC3 0xA9, which is escaped 
as %C3%A9.  If your browser is not able to resolve that URI reference, it may 
be a bug in your browser.
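
As a rough illustration of the two steps quoted from B.2.1, here is a minimal
Python sketch.  The function name escape_non_ascii is made up for this example,
and this is not how Xalan implements the escaping.  It only touches non-ASCII
characters and leaves every ASCII character as it is.

```python
def escape_non_ascii(uri: str) -> str:
    """Escape non-ASCII characters as described in HTML 4.0, B.2.1.

    Hypothetical helper for illustration only: ASCII characters are
    copied through unchanged, including reserved ones such as '/' and '?'.
    """
    out = []
    for ch in uri:
        if ord(ch) < 128:
            # ASCII: leave untouched.
            out.append(ch)
        else:
            # Step 1: represent the character in UTF-8 as one or more bytes.
            # Step 2: convert each byte to %HH.
            out.extend(f"%{b:02X}" for b in ch.encode("utf-8"))
    return "".join(out)


# U+00E9 (e with acute accent) is the bytes 0xC3 0xA9 in UTF-8.
print(escape_non_ascii("caf\u00e9.html"))  # prints caf%C3%A9.html
```

A browser that follows the same convention should turn %C3%A9 back into é when
it resolves the reference.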

You can completely disable the escaping of URIs by specifying the
xalan:use-url-escaping attribute with the value "no" on xsl:output, but I'm not 
sure whether that will have the effect you want.

[1] http://www.w3.org/TR/xslt
[2] http://www.w3.org/TR/REC-html40/appendix/notes.html#h-B.2.1
