I assume that when you say you are replacing the bytes for ®, you are
working with the bytes from rawfileOutputStream.toByteArray(), which means the
bytes are still good at that point. Next I would look for the ® character in
the InputStreamReader that is created. I suspect that you will not find it.
If that is the case, try instantiating the InputStreamReader with a
specific character encoding:

new InputStreamReader(rawfileInputStream, "ISO-8859-1")

Try calling getEncoding() on the created InputStreamReader to determine what 
the default encoding is. I believe you will find that it is different between 
the machine that works and the one that doesn't.

Whenever you are converting bytes to characters, or characters to bytes, it is 
a good idea to specify the character encoding that is being used. 
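
As a rough illustration of why the explicit encoding matters, here is a
minimal sketch. The class and method names (ExplicitCharsetDemo, decode) are
made up for this example and are not from the original code. It decodes the
single byte 0xAE, which is ® in both ISO-8859-1 and windows-1252, once with a
charset that contains it and once with one that does not:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

public class ExplicitCharsetDemo {

    // Hypothetical helper: decodes the bytes with the named charset and returns the text.
    public static String decode(byte[] bytes, String charsetName) throws IOException {
        Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charsetName);
        StringBuilder text = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) {
            text.append((char) c);
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // 0xAE is the byte for the registered sign in ISO-8859-1 and windows-1252.
        byte[] raw = { (byte) 0xAE };

        // The platform default is what a one-argument InputStreamReader would use.
        System.out.println("Default charset: " + Charset.defaultCharset());

        // A charset that contains the character decodes it intact.
        System.out.println(Integer.toHexString(decode(raw, "ISO-8859-1").charAt(0)));

        // A charset that lacks it yields the replacement character instead.
        System.out.println(Integer.toHexString(decode(raw, "US-ASCII").charAt(0)));
    }
}
```

If the two machines report different default charsets, that alone could
explain why the same bytes decode correctly on one and not on the other.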

Good luck,
Josh

-----Original Message-----
From: Pramodh Peddi [mailto:[EMAIL PROTECTED]
Sent: Monday, November 10, 2003 12:20 PM
To: Christopher Ebert; [EMAIL PROTECTED]; Kumar, Suraj
Subject: Re: Directly referenced special characters as "?"


I still couldn't get this solved. We did a hack for that deadline:
replacing the bytes for ® with the bytes for the String "(R)", which is
really a nasty hack.
We are still trying to figure out why it isn't working.

Chris,
I think ® is part of windows-1252 encoding.
http://www.juha.karvonen.name/hyoty/char/ says that. Not sure how genuine
that site is. ® is part of both windows-1252 and iso-8859-1 encodings.
What do you mean by "If you can, check to see what the ® and © characters
are in the Java system"?

Suraj,
I opened the source xml file in XMLSpy and I am able to view the ® as is.

We still couldn't figure out what the problem is or how to solve it.
I really wonder if this is so tough, or whether I am missing something basic.
I am really doing simple stuff: either reading the source into a String and
passing it into the Transformer, or passing the InputStream into the
Transformer.

Please let me know if there are any solutions for this.

Thanks in advance,

Pramodh.


----- Original Message -----
From: "Christopher Ebert" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, November 06, 2003 8:15 AM
Subject: RE: Directly referenced special characters as "?"



If you can, check to see what the ® and © characters are in the Java system.
You have to be careful, because nearly anything you do to print them out may
serialize them to a character set that doesn't have them (and so print a ?).
The surest way is to print out the characters as bytes along with a '?' and
see if they match. This will tell you if you're losing the characters
because they're not in the input character set or not the correct encoding
for the character set (e.g. not in ISO-8859-1). This often happens with
Windows: Windows uses Cp1252 as the standard encoding, which is very
similar to ISO-8859-1, but not the same, so it can look like it's working
for a long time*. If so, fix the input encoding, or change all special
characters to entities.
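
A small sketch of that check, assuming the text has already been read into a
String. The class and method names (CodePointDump, dumpChars) are invented
for this example. It prints each char as a hex value in plain ASCII, so the
output cannot itself be mangled by the console encoding:

```java
public class CodePointDump {

    // Hypothetical helper: renders each UTF-16 char of the text as U+XXXX.
    public static String dumpChars(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            if (i > 0) {
                out.append(' ');
            }
            out.append(String.format("U+%04X", (int) text.charAt(i)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Characters that survived decoding keep their own code points.
        System.out.println(dumpChars("\u00AE\u00A9")); // U+00AE U+00A9

        // Characters that were lost show up as a literal question mark.
        System.out.println(dumpChars("??"));           // U+003F U+003F
    }
}
```

If the dump shows U+003F (or U+FFFD) where ® should be, the character was
already lost before the transformer saw it.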

Chris



* See 'Dogg's Hamlet' for further discussion of the nature of this problem:
http://buedg.daig-kastura.de/stoppard/stopp2.htm
-----Original Message-----
From: Pramodh Peddi [mailto:[EMAIL PROTECTED]
Sent: Thursday, November 06, 2003 12:06 AM
To: [EMAIL PROTECTED]
Subject: Directly referenced special characters as "?"


Hi,
I couldn't solve my problem fully yet. I posted a request a couple of days
ago and the responses helped me a bit, but not entirely.

I have an XML (source) file which has different special characters,
some of which are referenced through entities (like &#8482;) and others are
referenced directly (like ® and ©). The entity-referenced characters are
coming up fine while transforming, but the directly referenced chars are
coming up as "?" chars.

I am using Java 1.4.2's Transformer for transforming.

This is what I am doing in the Java code:
*********************************************************

if (filePath != null) {
    sftp.get(filePath, rawfileOutputStream);
    rawfileOutputStream.close();
}

ByteArrayInputStream rawfileInputStream =
    new ByteArrayInputStream(rawfileOutputStream.toByteArray());
ByteArrayOutputStream transformedFileOutputStream = new ByteArrayOutputStream();

File transformedFile =
    new File("../server/ic/deploy/data.war/" + this.taxXSLTResult);
FileOutputStream out = new FileOutputStream(transformedFile);

transformer.transform(
    new StreamSource(new InputStreamReader(rawfileInputStream), this.dtdURL),
    new StreamResult(out));

rawfileInputStream.close();
transformedFileOutputStream.close();

*********************************************************

The source file has a "windows-1252" encoding header. In the XSL, I tried
xsl:output encoding="iso-8859-1" and xsl:output encoding="windows-1252".
Neither of these worked. I even tried to change the bytes into a String and
back into bytes. Nothing works.
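
For comparison, here is a minimal sketch of one arrangement in which the
encoding header is honored: the source is handed over as a byte stream
rather than a Reader, so the XML parser reads the declaration itself, and the
output encoding is set on the transformer. This is an illustration under
assumptions, not the original code: the class name, the transform helper, the
sample document, and the use of an identity transformer in place of the real
stylesheet are all made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class ByteStreamTransformDemo {

    // Hypothetical helper: copies the document through an identity transformer,
    // letting the parser decode the bytes from the XML declaration.
    public static byte[] transform(byte[] xmlBytes, String outputEncoding)
            throws TransformerException {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.ENCODING, outputEncoding);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // A byte stream source: no Reader, so no platform-default decoding happens here.
        identity.transform(
            new StreamSource(new ByteArrayInputStream(xmlBytes)),
            new StreamResult(out));
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        // Sample document (an assumption) declaring windows-1252 and containing ®.
        String xml = "<?xml version=\"1.0\" encoding=\"windows-1252\"?>"
                   + "<name>Acme\u00AE</name>";
        byte[] source = xml.getBytes("windows-1252");

        byte[] result = transform(source, "UTF-8");

        // Decode with the same encoding that was requested for the output.
        System.out.println(new String(result, "UTF-8"));
    }
}
```

The result is written to a byte stream and decoded with the encoding that was
requested, so the platform default charset is not involved at any step.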

I would really appreciate any help!


Thanks in advance,

Pramodh.

