http://nagoya.apache.org/bugzilla/show_bug.cgi?id=1950
*** shadow/1950 Sun Jul 22 00:37:50 2001
--- shadow/1950.tmp.20424 Sun Jul 22 01:34:10 2001
***************
*** 180,183 ****
the world to use Unicode, instead of double byte character sets (DBCS) as
currently popular for CJK (Chinese, Japanese and Korean) languages. But for
various, sometimes non-technical reasons, this is unlikely to happen in the
! future.
--- 180,237 ----
the world to use Unicode, instead of double byte character sets (DBCS) as
currently popular for CJK (Chinese, Japanese and Korean) languages. But for
various, sometimes non-technical reasons, this is unlikely to happen in the
! future.
!
! ------- Additional Comments From [EMAIL PROTECTED] 2001-07-22 01:34 -------
! Terence -
! First, try executing this from the command line and you should see that
! everything is okay:
! java org.apache.xalan.xslt.Process -in big5.xml -xsl testing.xsl -out
! testing.out
! (this is all on one line).
!
! The problem in your example was twofold. First, you used an input Reader instead
! of an input stream. Second, and this is the problem you're still having, you're
! using an output Writer instead of an output stream.
!
! Internally, everything is carried in Unicode, since these are Java character
! strings. There are three conversions going on:
! input XML encoding -> Unicode
! input XSL encoding -> Unicode
! Unicode -> output encoding
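!
! As a rough illustration of those conversions, here is a minimal sketch using
! plain Java strings. The class and method names are made up for this example,
! and it assumes your JRE includes the Big5 charset.
!

```java
import java.nio.charset.Charset;

public class ConversionSketch {
    // input encoding -> Unicode: decode raw bytes into a Java String
    public static String toUnicode(byte[] raw, String encoding) {
        return new String(raw, Charset.forName(encoding));
    }

    // Unicode -> output encoding: encode a Java String into raw bytes
    public static byte[] fromUnicode(String text, String encoding) {
        return text.getBytes(Charset.forName(encoding));
    }

    public static void main(String[] args) {
        // Two CJK characters, written as escapes so the source file's own
        // encoding does not matter.
        byte[] big5 = fromUnicode("\u4e2d\u6587", "Big5");
        String chars = toUnicode(big5, "Big5");
        System.out.println(chars.length() + " chars, " + big5.length + " bytes");
    }
}
```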
!
! By specifying an output Writer in your transform Result, you override the
! encoding attribute of the xsl:output element and cause the result string to be
! handled in Unicode. I don't know how you're converting the result string to a
! byte array for your debug printing, but that is where the output conversion is
! actually taking place.
!
! Your original example appeared to work because the decoding into Unicode and the
! encoding from Unicode were both handled by the Java Reader/Writer mechanism, so
! the decoding and encoding errors compensated. However, unless your platform
! default encoding is Big5, it is unlikely that the conversion into Unicode was
! accurate. This means that you would have problems if your input document and
! stylesheet were in different encodings.
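!
! To see how such errors can cancel out, here is a small sketch. It is only an
! illustration: the class name is invented, ISO-8859-1 stands in for a platform
! default that is not Big5 (chosen because it maps every byte to exactly one
! character), and it assumes your JRE includes the Big5 charset.
!

```java
import java.nio.charset.Charset;
import java.util.Arrays;

public class CompensatingErrors {
    // Decode with one charset and re-encode with the same one, roughly what a
    // Reader/Writer pair built on the platform default encoding does.
    public static byte[] roundTrip(byte[] raw, Charset assumed) {
        String decoded = new String(raw, assumed);
        return decoded.getBytes(assumed);
    }

    public static void main(String[] args) {
        Charset big5 = Charset.forName("Big5");
        Charset latin1 = Charset.forName("ISO-8859-1"); // stand-in for a non-Big5 default
        String original = "\u4e2d\u6587";
        byte[] raw = original.getBytes(big5);

        // The intermediate String is wrong when the assumed charset is wrong...
        System.out.println(original.equals(new String(raw, latin1)));
        // ...yet the bytes survive the round trip, which hides the mistake.
        System.out.println(Arrays.equals(raw, roundTrip(raw, latin1)));
    }
}
```

! The intermediate string is the part the XSLT processor actually works on, so a
! stylesheet in a different encoding would be matched against the wrong characters.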
!
! If you want to return a string from your transform method, your best bet is to
! wrap a ByteArrayOutputStream in the StreamResult, like this:
!
! transformer.transform(source, new StreamResult(baos));
! baos.close();
! result = baos.toString();
!
! Now, XalanJ is writing out the characters using the Big5 encoding. Assuming
! your platform default encoding is Big5, the baos.toString() call will convert
! the Big5 encoded characters in the byte array to Unicode. Then, your
! System.out.println() call will convert the Unicode back into Big5 encoding.
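!
! If you would rather not depend on the platform default, one possible variation
! is to name the charset when decoding the byte array. The sketch below is an
! assumption-laden example, not your code: the class, the helper method and its
! parameters are hypothetical, it uses whatever JAXP processor the JRE provides,
! and outputEncoding has to match the encoding attribute of xsl:output.
!

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class TransformToString {
    // Hypothetical helper: feeds byte streams in and takes a byte stream out,
    // so the processor performs all three encoding conversions itself.
    public static String transform(byte[] xml, byte[] xsl, String outputEncoding) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new ByteArrayInputStream(xsl)));
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            transformer.transform(new StreamSource(new ByteArrayInputStream(xml)),
                                  new StreamResult(baos));
            // Decode with an explicit charset instead of relying on file.encoding.
            return new String(baos.toByteArray(), Charset.forName(outputEncoding));
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Charset utf8 = Charset.forName("UTF-8");
        String xsl = "<xsl:stylesheet version='1.0'"
                + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
                + "<xsl:output method='xml' encoding='Big5'/>"
                + "<xsl:template match='/'><xsl:copy-of select='.'/></xsl:template>"
                + "</xsl:stylesheet>";
        String xml = "<doc>\u4e2d\u6587</doc>";
        System.out.println(transform(xml.getBytes(utf8), xsl.getBytes(utf8), "Big5"));
    }
}
```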
!
! There is no reason that you should have to use Unicode as you mentioned. Big5
! is perfectly fine, but you need to understand when the conversions from Big5 to
! Unicode and back are being performed. This is -very- tricky and confusing, so
! please come back with more questions if you have them. If you do respond,
! please let me know your platform default encoding (the value of the Java
! file.encoding System property) and how you're converting your string to bytes
! for printing out your debugging information. Sending the actual code for this
! would be clearest.
!
! Gary