[ 
http://issues.apache.org/jira/browse/XALANJ-2184?page=comments#action_12318257 
] 

Brian Minchau commented on XALANJ-2184:
---------------------------------------

The value of the encoding attribute of an <xsl:output> element should be 
an IANA or MIME name.  The IANA registry at
http://www.iana.org/assignments/character-sets lists each
registered encoding together with all of its equivalent
aliases.  For example:

Name: IBM278                                              [RFC1345,KXS2]
MIBenum: 2034
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990
Alias: CP278
Alias: ebcdic-cp-fi
Alias: ebcdic-cp-se
Alias: csIBM278

One should be able to use any of the aliases of a given encoding,
so all of these xsl:output elements should result in the same encoding 
being used:
<xsl:output encoding="CP278" />
<xsl:output encoding="ebcdic-cp-fi" />
<xsl:output encoding="ebcdic-cp-se" />
<xsl:output encoding="csIBM278" />



However, Xalan-J is written in Java, so ultimately such names
must be mapped to a corresponding name that the Java runtime recognizes.
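
As a rough illustration of what "a name the Java runtime recognizes" means,
here is a minimal sketch that asks the standard java.nio.charset.Charset API
which canonical name, if any, it associates with each alias. The class and
method names (CharsetAliasDemo, canonicalName) are made up for this example,
and this is not a claim about how the Xalan-J serializer does its lookup.
Which names are supported depends on the JRE in use.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not part of Xalan-J.
final class CharsetAliasDemo {

    /**
     * Returns the canonical charset name the running JRE uses for the
     * given name or alias, or null if the JRE does not support it.
     */
    static String canonicalName(String alias) {
        try {
            return Charset.isSupported(alias)
                    ? Charset.forName(alias).name()
                    : null;
        } catch (IllegalArgumentException e) {
            // Syntactically illegal charset names end up here.
            return null;
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "IBM278", "CP278", "ebcdic-cp-fi", "ebcdic-cp-se", "csIBM278"
        };
        for (String name : names) {
            // The result varies by JRE; null means "not supported here".
            System.out.println(name + " -> " + canonicalName(name));
        }
    }
}
```

Any alias that prints null on a given JRE is one that would have to be
mapped to a supported name before being handed to the runtime.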

**************
**** THE INFORMATION IN THE Serializer.properties FILE
**** AND ITS FORMAT IS NOT A PUBLIC API.
**************

The corresponding line in the Serializer.properties file for the same encoding 
is this:
  Cp278 EBCDIC-CP-FI,EBCDIC-CP-SE 0x00FF

Cp278 is the particular IANA alias recognized by the Java runtime for the 
encoding.
The comma-separated list contains the other IANA aliases. Our implementation
first maps any of the other aliases to the first name, and presents the Java 
runtime only with that first name.
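
The entry format described above can be sketched as a small alias table.
This is only an illustration under stated assumptions: the class and method
names (AliasTableSketch, addEntry, javaNameFor) are invented here, lookups
are assumed to be case-insensitive, and the real serializer code may differ.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch, not the actual Xalan-J implementation.
final class AliasTableSketch {

    // Upper-cased name or alias -> name handed to the Java runtime.
    private final Map<String, String> javaNameByAlias = new HashMap<>();

    /**
     * Parses one entry of the form
     *   <javaName> <alias>,<alias>,... [<largestCodePoint>]
     * and registers the Java name and every alias.
     */
    void addEntry(String line) {
        String[] fields = line.trim().split("\\s+");
        if (fields.length < 2) {
            throw new IllegalArgumentException("Malformed entry: " + line);
        }
        String javaName = fields[0];
        javaNameByAlias.put(key(javaName), javaName);
        for (String alias : fields[1].split(",")) {
            if (!alias.isEmpty()) {
                javaNameByAlias.put(key(alias), javaName);
            }
        }
        // A third field (e.g. 0x00FF), if present, is ignored.
    }

    /** Returns the Java name for an alias, or null if none is registered. */
    String javaNameFor(String alias) {
        return javaNameByAlias.get(key(alias));
    }

    // Assumption: encoding names are compared case-insensitively.
    private static String key(String name) {
        return name.toUpperCase(Locale.ROOT);
    }
}
```

With only the Cp278 entry shown above loaded, javaNameFor("ebcdic-cp-fi")
returns "Cp278", while javaNameFor("csIBM278") returns null because that
alias is not in the entry's list.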

Last on the line is 0x00FF, which is supposed to indicate the largest Unicode 
value in the encoding,
but this field is no longer used in Xalan-J 2.7.

So we see a bug in the corresponding Serializer.properties entry for this 
encoding:
csIBM278 is missing from the alias list.

------------------------------------------------------------------
On the IANA web page there is this information about cp850 and cp860:

Name: IBM850                                              [RFC1345,KXS2]
MIBenum: 2009
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990
Alias: cp850
Alias: 850
Alias: csPC850Multilingual

Name: IBM860                                              [RFC1345,KXS2]
MIBenum: 2048
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990
Alias: cp860
Alias: 860
Alias: csIBM860
 
There are no entries for Cp850 and Cp860 in the Serializer.properties file, 
but if there were, they should look like this:
Cp850 850,csPC850Multilingual 0xFFFF
Cp860 860,csIBM860 0xFFFF

The code in the serializer is written with the intent that if no entry appears 
in the Encodings.properties file,
the encoding name is used as-is, as a Java name. So Cp850 ought to work, 
but the serializer gives an error message that
the encoding is not supported.
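
The intended fallback described above could look roughly like the following
sketch. The names (EncodingLookupSketch, resolveJavaName) and the use of
java.nio.charset.Charset.isSupported for the final check are assumptions made
for illustration; the actual serializer code is not shown in this comment.

```java
import java.nio.charset.Charset;
import java.util.Map;

// Hypothetical sketch of the intended behavior, not the Xalan-J code.
final class EncodingLookupSketch {

    /**
     * Looks the requested encoding up in the alias table. If there is
     * no entry, the requested name itself is tried as a Java name.
     * Returns the name to give the Java runtime, or null if the
     * runtime does not support it.
     */
    static String resolveJavaName(Map<String, String> javaNameByAlias,
                                  String requested) {
        String mapped = javaNameByAlias.get(requested);
        // Fallback: no table entry means "use the name as-is".
        String candidate = (mapped != null) ? mapped : requested;
        try {
            return Charset.isSupported(candidate) ? candidate : null;
        } catch (IllegalArgumentException e) {
            // Syntactically illegal charset name.
            return null;
        }
    }
}
```

Under this logic an empty table would still let a name through whenever the
running JRE supports it directly, which is the behavior the serializer was
meant to have for Cp850.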

Out of curiosity I added the suggested lines to Serializer.properties and 
suddenly the encodings were recognized.

So there seem to be a few things (bugs) to change here:

1) Cp850 and Cp860 should be recognized as they are, with no changes to 
Serializer.properties,
because they are IANA names that happen to also be the names 
recognized by the Java runtime.

2) The entries in Serializer.properties need to be updated with the information 
from IANA: whole encodings are missing, and some encodings
are missing aliases.

3) A little cleanup might be needed: we can drop the largest Unicode code 
point value (the 0xFFFF field) from each entry.



> cp850, cp860 serialization fails
> --------------------------------
>
>          Key: XALANJ-2184
>          URL: http://issues.apache.org/jira/browse/XALANJ-2184
>      Project: XalanJ2
>         Type: Bug
>   Components: Serialization
>     Versions: CurrentCVS
>  Environment: Windows and Linux, jdk1.5
>     Reporter: Pedro Alves
>     Assignee: Brian Minchau
>     Priority: Critical
>  Attachments: cp-850-encoding.tgz, testcase-kaapa-20050804.zip
>
> Xalan fails serialization for some encodings. Tested with cp850 and cp860. 
> Testcases are included


