Hi Suraj,
I absolutely agree with you. I am printing the output String to the shell/terminal/console, and I see the ? symbols there. I also see the ? symbols when I open the output file in the vi editor.
 
I think I am messing up while reading the input, but I can't be sure, because I tried almost every possibility and nothing worked. The surprising thing is that it works fine on one Solaris machine but fails on another Solaris machine, which complicates the issue.
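
One possible explanation for a difference between two otherwise similar machines is the JVM's default charset, which is derived from the locale each machine runs under. The sketch below is not part of the original code; it only prints that default so the two Solaris machines can be compared. The class name DefaultCharsetCheck is made up for illustration, and Charset.defaultCharset() needs Java 5 or later (on 1.4.2 the file.encoding system property carries the same information).

```java
import java.nio.charset.Charset;

public class DefaultCharsetCheck {
    /** Returns the name of the charset the JVM uses when no encoding is given explicitly. */
    static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        // Run this on both machines and compare the two lines of output.
        System.out.println("default charset: " + defaultCharsetName());
        System.out.println("file.encoding:   " + System.getProperty("file.encoding"));
    }
}
```

If the two machines print different values, any code that converts bytes to characters without naming an encoding will behave differently on each.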
 
Hope I find the solution very soon!
 
Thanks,
 
Pramodh.
----- Original Message -----
Sent: Monday, November 10, 2003 4:21 PM
Subject: RE: Directly referenced special characters as "?"

Did you try opening the output xml in XMLSpy?
I ran the transformation with the xalan command line and generated the attached html
file. IE is able to open this file in UTF-8 encoding (View->Encoding menu).
However, if you change the encoding to something else, the html does not display
correctly and shows gibberish.
I got the same result running the transformation on both Windows and Unix.

How do you intend to consume the output xml? I guess the output would be
read correctly if the editor/consumer supports UTF-8; otherwise it would
display the data however it sees fit.



-----Original Message-----
From: Pramodh Peddi [mailto:[EMAIL PROTECTED]
Sent: Monday, November 10, 2003 3:20 PM
To: Christopher Ebert; [EMAIL PROTECTED]; Kumar, Suraj
Subject: Re: Directly referenced special characters as "?"


I still couldn't get this solved. We did a hack for that deadline -
replacing the bytes for ® by the bytes for the String "(R)".....which is
really a nasty hack.
We are still trying to figure out why it isn't working.

Chris,
I think ® is part of the windows-1252 encoding.
http://www.juha.karvonen.name/hyoty/char/ says so, though I am not sure how
reliable that site is. ® is part of both the windows-1252 and iso-8859-1 encodings.
What do you mean by "If you can, check to see what the ® and © characters
are in the Java system"?

Suraj,
I opened the source xml file in XMLSpy and I am able to view the ® as is.

We still couldn't figure out what the problem is or how to solve it.
I really wonder if this is so tough, or if I am missing something basic.
I am really doing simple stuff: either reading the source into a String and
passing it to the Transformer, or passing the InputStream to the
Transformer.

Please let me know if there are any solutions for this.

Thanks in advance,

Pramodh.


----- Original Message -----
From: "Christopher Ebert" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, November 06, 2003 8:15 AM
Subject: RE: Directly referenced special characters as "?"



If you can, check to see what the ® and © characters are in the Java system.
You have to be careful, because nearly anything you do to print them out may
serialize them to a character set that doesn't have them (and so print a ?).
The surest way is to print out the characters as bytes along with a '?' and
see if they match. This will tell you if you're losing the characters
because they're not in the input character set, or because the input is not
correctly encoded for that character set (e.g. not in ISO-8859-1). This often
happens with Windows: Windows uses Cp1252 as the standard encoding, which is
very similar to ISO-8859-1, but not the same, so it can look like it's working
for a long time*. If so, fix the input encoding, or change all special
characters to entities.
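
The check described above could be sketched roughly as follows. This is only an illustration: the class CharDump and its two helper methods are hypothetical names, and the two bytes 0xAE and 0xA9 stand in for ® and © as they would appear in an ISO-8859-1 or Cp1252 file. Printing numeric values instead of the characters themselves keeps the console encoding out of the picture.

```java
import java.nio.charset.StandardCharsets;

public class CharDump {
    /** Formats each UTF-16 code unit of s as U+XXXX, independent of console encoding. */
    static String codeUnits(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) sb.append(' ');
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    /** Formats each byte as two hex digits. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) sb.append(' ');
            sb.append(String.format("%02X", bytes[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] raw = { (byte) 0xAE, (byte) 0xA9 };
        String decoded = new String(raw, StandardCharsets.ISO_8859_1);
        System.out.println(codeUnits(decoded));                               // U+00AE U+00A9
        // Encoding to a charset that lacks the characters substitutes '?' (0x3F).
        System.out.println(hex(decoded.getBytes(StandardCharsets.US_ASCII))); // 3F 3F
        System.out.println(codeUnits("?"));                                   // U+003F
    }
}
```

If the String inside the program already shows U+003F (or U+FFFD) where ® should be, the character was lost while reading; if it shows U+00AE and only the printed output has 3F, it was lost while writing.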

Chris



* See 'Dogg's Hamlet' for further discussion of the nature of this problem:
http://buedg.daig-kastura.de/stoppard/stopp2.htm
-----Original Message-----
From: Pramodh Peddi [mailto:[EMAIL PROTECTED]
Sent: Thursday, November 06, 2003 12:06 AM
To: [EMAIL PROTECTED]
Subject: Directly referenced special characters as "?"


Hi,
I haven't fully solved my problem yet. I posted a request a couple of days
ago and the responses helped me a bit, but not entirely.

I have a source xml file that contains various special characters,
some of which are referenced through entities (like &#8482;) and others that
are referenced directly (like ® and ©). The entity-referenced characters
come out fine after transforming, but the directly referenced chars
come out as "?" chars.

I am using Java 1.4.2's Transformer for transforming.

This is what I am doing in the Java code:
*********************************************************

if (filePath != null) {
    sftp.get(filePath, rawfileOutputStream);
    rawfileOutputStream.close();
}

ByteArrayInputStream rawfileInputStream =
    new ByteArrayInputStream(rawfileOutputStream.toByteArray());
ByteArrayOutputStream transformedFileOutputStream = new ByteArrayOutputStream();

File transformedFile =
    new File("../server/ic/deploy/data.war/" + this.taxXSLTResult);
FileOutputStream out = new FileOutputStream(transformedFile);

transformer.transform(
    new StreamSource(new InputStreamReader(rawfileInputStream), this.dtdURL),
    new StreamResult(out));

rawfileInputStream.close();
transformedFileOutputStream.close();

*********************************************************

The source file has a "windows-1252" encoding header. In the xsl, I tried xsl:
encoding="iso-8859-1" and xsl: encoding="windows-1252". Neither of these
worked. I even tried to change the bytes into a String and back into bytes.
Nothing works.
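
One thing that may be worth checking in the code above: new InputStreamReader(rawfileInputStream) names no encoding, so it decodes the bytes with the JVM's platform default charset, and the parser then receives characters and never gets to apply the encoding declared in the xml header. Handing the raw InputStream to StreamSource leaves the decoding to the parser. The sketch below is a minimal, self-contained illustration of that idea, not the original application: the class EncodingSafeTransform is a hypothetical name, an identity transform stands in for the real stylesheet, and it assumes the parser in use recognizes the windows-1252 label.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class EncodingSafeTransform {
    /** Runs an identity transform over raw xml bytes and returns the result as UTF-8 bytes. */
    static byte[] transform(byte[] xmlBytes) throws Exception {
        // No stylesheet given, so this is the identity transform (placeholder for the real xsl).
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Pass the byte stream itself, not a Reader: the parser then decodes
        // the bytes according to the encoding named in the xml declaration.
        t.transform(new StreamSource(new ByteArrayInputStream(xmlBytes)),
                    new StreamResult(out));
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"windows-1252\"?><p>\u00AE \u00A9</p>";
        byte[] result = transform(xml.getBytes(Charset.forName("windows-1252")));
        String text = new String(result, "UTF-8");
        // Report whether both characters survived, without printing them to the console.
        System.out.println(text.indexOf('\u00AE') >= 0 && text.indexOf('\u00A9') >= 0);
    }
}
```

StreamSource also has an (InputStream, String systemId) constructor, so the DTD URL can still be supplied. If a Reader is really needed, new InputStreamReader(in, "windows-1252") at least makes the decoding explicit. Note too that UTF-8 output viewed in a terminal or editor that is not set to UTF-8 can still look wrong even when the file itself is correct.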

I would really appreciate any help!


Thanks in advance,

Pramodh.


