http://nagoya.apache.org/bugzilla/show_bug.cgi?id=1950
*** shadow/1950 Sat Jul 21 18:46:54 2001
--- shadow/1950.tmp.20340 Sun Jul 22 00:37:50 2001
***************
*** 138,141 ****
in = new InputSource(fis);
and all should be fine.
! Gary
--- 138,183 ----
in = new InputSource(fis);
and all should be fine.
! Gary
!
! ------- Additional Comments From [EMAIL PROTECTED] 2001-07-22 00:37 -------
! Hi Gary,
!
! Thanks for the advice. However, I have already tried FileInputStream, and the
! BIG5 characters are lost, as shown below (I added a few lines to print out the
! byte values of the characters as well).
!
! $ java JaxpTest2 big5.xml testing.xsl
! Transformation ....
!
! And the value of bbb is: ? ? ? ?
! Result in bytes:
! 10 32 32 32 32 32 65 110 100 32 116 104 101 32 118 97 108 117 101 32 111 102 32
! 98 98 98 32 105 115 58 32 63 32 63 32 63 32 63
!
! Whereas if I use an InputStreamReader and specify the attribute
! disable-output-escaping="yes" in the XSL, I get the following.
!
! $ java JaxpTest2 big5.xml testing2.xsl
! Transformation ....
!
! And the value of bbb is: » ´ä ©~ ¥Á
! Result in bytes:
! 10 32 32 32 32 32 65 110 100 32 116 104 101 32 118 97 108 117 101 32 111 102 32
! 98 98 98 32 105 115 58 32 -83 -69 32 -76 -28 32 -87 126 32 -91 -63
!
! Notice that each BIG5 Chinese character in the output here consists of two
! bytes, most of which print as negative values. This is because Java bytes are
! signed and the byte values are larger than 127. I think the HTML serializer
! (and the XHTML serializer as well) of Xalan treated these characters as
! special characters and escaped them.
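!
! To make the negative values easier to read: a minimal sketch (the class and
! method names are made up for illustration) that masks each byte with 0xFF to
! recover the unsigned value. The byte array is copied from the BIG5 part of the
! "Result in bytes" listing.

```java
final class SignedByteDemo {
    // Java's byte is signed (-128..127); masking with 0xFF yields 0..255.
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        // The eight bytes of the four BIG5 characters from the output listing.
        byte[] result = { -83, -69, -76, -28, -87, 126, -91, -63 };
        StringBuilder sb = new StringBuilder();
        for (byte b : result) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(toUnsigned(b));
        }
        // prints: 173 187 180 228 169 126 165 193
        System.out.println(sb);
    }
}
```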
!
! Using InputStreamReader to read in the XML file doesn't seem to be a problem,
! at least in a Linux environment. I can serialize the DOM document or use
! the XPathAPI and get the result back without any problem.
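!
! For reference, a rough sketch of the InputStreamReader approach, not the exact
! code from JaxpTest2. The class name, the parse helper and the sample document
! are assumptions for illustration, and ISO-8859-1 stands in for BIG5 so the
! sketch runs on any JVM. When an InputSource wraps a Reader, the parser uses
! the characters as decoded and ignores any encoding declaration in the file.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class ExplicitEncodingParse {
    // Decode the raw bytes with the given charset name, then hand the
    // resulting character stream to the XML parser.
    static Document parse(InputStream bytes, String encoding) throws Exception {
        Reader reader = new InputStreamReader(bytes, encoding);
        InputSource in = new InputSource(reader);
        return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document containing one non-ASCII character.
        byte[] xml = "<aaa><bbb>caf\u00e9</bbb></aaa>".getBytes("ISO-8859-1");
        Document doc = parse(new ByteArrayInputStream(xml), "ISO-8859-1");
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

! For a file such as big5.xml the call would look like
! parse(new FileInputStream("big5.xml"), "Big5"), assuming the Java runtime
! ships the Big5 charset.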
!
! Come to think of it, maybe this is not a problem but a feature of Xalan:
! escaping the characters above 127. The obvious solution is for us in this part
! of the world to use Unicode instead of the double-byte character sets (DBCS)
! that are currently popular for CJK (Chinese, Japanese and Korean) languages.
! But for various, sometimes non-technical, reasons, this is unlikely to happen
! in the near future.