[ http://issues.apache.org/jira/browse/XALANJ-2184?page=comments#action_12318257 ]
Brian Minchau commented on XALANJ-2184:
---------------------------------------

The value of the encoding attribute of an <xsl:output> element should be an IANA or MIME name. The IANA registry at http://www.iana.org/assignments/character-sets lists each encoding together with all of its equivalent aliases. For example:

    Name: IBM278 [RFC1345,KXS2]
    MIBenum: 2034
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990
    Alias: CP278
    Alias: ebcdic-cp-fi
    Alias: ebcdic-cp-se
    Alias: csIBM278

One should be able to use any of the aliases of a given encoding, so all of these xsl:output elements should result in the same encoding being used:

    <xsl:output encoding="CP278" />
    <xsl:output encoding="ebcdic-cp-fi" />
    <xsl:output encoding="ebcdic-cp-se" />
    <xsl:output encoding="csIBM278" />

However, Xalan-J is written in Java, so ultimately such names must be mapped to a corresponding name that the Java runtime understands.

**************
**** THE INFORMATION IN THE Serializer.properties FILE
**** AND ITS FORMAT IS NOT A PUBLIC API.
**************

The corresponding line in the Serializer.properties file for the same encoding is this:

    Cp278 EBCDIC-CP-FI,EBCDIC-CP-SE 0x00FF

Cp278 is the particular IANA alias recognized by the Java runtime for the encoding. The comma-separated list holds the other IANA aliases. Our implementation first maps any of the other aliases to the first name, and presents only that first name to the Java runtime. The last field on the line, 0x00FF, is supposed to indicate the largest Unicode value in the encoding, but this field is no longer used in Xalan-J 2.7.

So we see a bug in the Serializer.properties entry for this encoding: csIBM278 is missing from the alias list.
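To make the alias mapping described above concrete, here is a minimal sketch in Java. It is not the Xalan-J implementation: the class EncodingAliasSketch and its methods are hypothetical names, and the case-insensitive matching is an assumption (IANA charset names are case-insensitive, but the comment does not say how the serializer compares them). The sketch parses one entry of the form "JavaName alias1,alias2 0xNNNN", ignores the unused third field, and returns an unknown name unchanged.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical illustration only; not part of Xalan-J. */
final class EncodingAliasSketch {
    // Upper-cased requested name -> name handed to the Java runtime.
    private final Map<String, String> aliasToJavaName = new HashMap<>();

    /** Parses one entry of the form: JavaName alias1,alias2 0xNNNN */
    void addLine(String line) {
        String[] fields = line.trim().split("\\s+");
        if (fields[0].isEmpty()) {
            return; // blank line
        }
        String javaName = fields[0];
        register(javaName, javaName);
        if (fields.length >= 2) {
            for (String alias : fields[1].split(",")) {
                if (!alias.isEmpty()) {
                    register(alias, javaName);
                }
            }
        }
        // Any third field (the largest code point) is ignored.
    }

    private void register(String name, String javaName) {
        // Assumption: names are matched without regard to case.
        aliasToJavaName.put(name.toUpperCase(Locale.ROOT), javaName);
    }

    /** Returns the mapped Java name, or the requested name unchanged. */
    String toJavaName(String requested) {
        String key = requested.toUpperCase(Locale.ROOT);
        return aliasToJavaName.getOrDefault(key, requested);
    }

    public static void main(String[] args) {
        EncodingAliasSketch table = new EncodingAliasSketch();
        table.addLine("Cp278 EBCDIC-CP-FI,EBCDIC-CP-SE 0x00FF");
        System.out.println(table.toJavaName("ebcdic-cp-fi")); // Cp278
        System.out.println(table.toJavaName("csIBM278"));     // csIBM278 (alias missing)
    }
}
```

With the entry exactly as quoted, csIBM278 comes back unmapped, which is the missing-alias bug; an entry that also listed csIBM278 would map it to Cp278 like the other aliases.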
------------------------------------------------------------------

On the IANA web page there is this information about cp850 and cp860:

    Name: IBM850 [RFC1345,KXS2]
    MIBenum: 2009
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990
    Alias: cp850
    Alias: 850
    Alias: csPC850Multilingual

    Name: IBM860 [RFC1345,KXS2]
    MIBenum: 2048
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990
    Alias: cp860
    Alias: 860
    Alias: csIBM860

There are no entries for Cp850 and Cp860 in the Serializer.properties file, but if there were, they should look like this:

    Cp850 850,csPC850Multilingual 0xFFFF
    Cp860 860,csIBM860 0xFFFF

The code in the serializer is written with the intent that if no entry appears in the Encodings.properties file, the encoding name is used as-is, as a Java name. Cp850 therefore ought to work, but the serializer gives an error message that the encoding is not supported. Out of curiosity I added the suggested lines to Serializer.properties, and suddenly the encodings were recognized.

So there seem to be a few things (bugs) to change here:

1) Cp850 and Cp860 should be recognized as they are, with no changes to Serializer.properties, because they are IANA names that happen to also be the names recognized by the Java runtime.

2) The entries in Serializer.properties need to be updated with the information from IANA: whole encodings are missing, and some encodings are missing aliases.

3) A little cleanup might be needed: we can drop the code point of the largest Unicode value (the 0xFFFF sort of field) from each entry.
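The intended fallback in item 1 can be sketched as follows. This is an illustration under stated assumptions, not the serializer's actual code: EncodingFallbackSketch, resolve and runtimeSupports are hypothetical names, the table is a plain Map, and java.nio.charset.Charset.isSupported stands in for whatever check the serializer really performs. Whether a name such as Cp850 is supported depends on the charsets shipped with the particular Java runtime, so the sketch makes no claim about the result for it.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.util.Map;

/** Hypothetical illustration only; not part of Xalan-J. */
final class EncodingFallbackSketch {

    /** Uses the table entry if there is one, else the requested name as-is. */
    static String resolve(Map<String, String> table, String requested) {
        String mapped = table.get(requested);
        return mapped != null ? mapped : requested;
    }

    /** Asks the Java runtime whether it knows the given charset name. */
    static boolean runtimeSupports(String javaName) {
        try {
            return Charset.isSupported(javaName);
        } catch (IllegalCharsetNameException e) {
            return false; // syntactically invalid charset name
        }
    }

    public static void main(String[] args) {
        // No entry for Cp850: the name should be passed through unchanged.
        String name = resolve(Map.of(), "Cp850");
        System.out.println(name + " supported by this runtime: "
                + runtimeSupports(name));
    }
}
```

The point of the design is that a missing table entry is not an error by itself; only the runtime's answer for the passed-through name should decide whether the encoding is rejected.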
> cp850, cp860 serialization fails
> --------------------------------
>
>          Key: XALANJ-2184
>          URL: http://issues.apache.org/jira/browse/XALANJ-2184
>      Project: XalanJ2
>         Type: Bug
>   Components: Serialization
>     Versions: CurrentCVS
>  Environment: Windows and Linux, jdk1.5
>     Reporter: Pedro Alves
>     Assignee: Brian Minchau
>     Priority: Critical
>  Attachments: cp-850-encoding.tgz, testcase-kaapa-20050804.zip
>
> Xalan fails serialization for some encodings. Tested with cp850 and cp860.
> Testcases are included