Yes, David, your second guess is correct. I didn't check if the numeric sequences are correct - I'm just worried about the fact that old browsers would not recognise them at all...
Anyways, old browsers understand KOI8. Thanks!

Best,
Dimitry

-----Original Message-----
From: David N Bertoni/CAM/Lotus [mailto:[EMAIL PROTECTED]]
Sent: Thursday, January 24, 2002 9:24 PM
To: [email protected]
Subject: RE: Xalan-C + ICU: windows-1251 encoding troubles

Hi Dimitry,

Just to be clear, are you saying there's something wrong with the file? What I mean is, are we writing incorrect numeric references? If so, then we need to make this a higher priority, and you should file a bug report.

If not, then I don't think it's accurate to say we don't "support" Win-1251. An HTML or XML file with correct numeric character references is logically equivalent to a file which contains the actual characters. Perhaps you're struggling with browsers that don't behave correctly with numeric character references?

Dave

From: "Dimitry Chernyshov" <[EMAIL PROTECTED]>
To: <[email protected]>
Date: 01/24/2002 10:19 AM
Subject: RE: Xalan-C + ICU: windows-1251 encoding troubles

I see... Well, it's not so urgent - fortunately we can still use KOI8... However, it would be very nice to have win-1251 support, 'cause it's the most popular encoding for the Russian language.

Thanks for the explanation though!

Best,
Dimitry Chernyshov,
Technology Group Managing Director,
Polar Design
--------------------------
[EMAIL PROTECTED]
http://www.polardesign.com
phone/fax: +7 (095) 363 0708

-----Original Message-----
From: David N Bertoni/CAM/Lotus [mailto:[EMAIL PROTECTED]]
Sent: Thursday, January 24, 2002 8:59 PM
To: [email protected]
Subject: Re: Xalan-C + ICU: windows-1251 encoding troubles

This is not surprising. Currently, Xalan-C has a pretty brain-dead algorithm for determining whether to write the actual character or a numeric character reference. The problem is that checking each character is horribly expensive, so we just punt on things > 256 in many cases. I'd like to do something about it, but it's not the highest priority right now.

Unless you can actually determine that we're not emitting the correct numeric character reference, there's nothing wrong with doing it the way we're doing it. You can always post-process the file yourself if you object to the references.

I'll bump this up on the list of things to work on for the next release.

Dave

From: "Dimitry Chernyshov" <[EMAIL PROTECTED]>
To: <[email protected]>
cc: (bcc: David N Bertoni/CAM/Lotus)
Date: 01/24/2002 08:32 AM
Subject: Xalan-C + ICU: windows-1251 encoding troubles

Hi!

After I re-built Xalan-c1_3 + Xerces-c1_6_0 + ICU 2.0, Xalan works well with different encodings. However, I've encountered one pretty strange problem.

If an XSL file has the xsl:output encoding set to "windows-1251" (<xsl:output method="html" encoding="windows-1251"/>), then when transforming some XML the result contains character codes instead of the characters themselves. E.g.:

    <html>
    <head>
    <META http-equiv="Content-Type" content="text/html; charset=windows-1251">
    <title>Типа винды блин!</title>

However, if the source XML has "windows-1251" encoding and the XSL file has the encoding set to, say, KOI8-R, everything works just fine: Xalan (ICU, I guess) transforms win-1251 to KOI8-R correctly...

Any thoughts?

Thanks in advance,
Dimitry Chernyshov,
Technology Group Managing Director,
Polar Design
--------------------------
[EMAIL PROTECTED]
http://www.polardesign.com
phone/fax: +7 (095) 363 0708
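
For anyone who wants to try the post-processing route Dave mentions, here is a minimal sketch in Python. It is not part of Xalan-C and nobody in this thread wrote it; the function name `expand_references` and the choice to leave markup-significant and control characters escaped are assumptions made for illustration. It rewrites a numeric character reference as the literal character only when the target encoding (windows-1251 here) can represent it, and leaves every other reference untouched.

```python
import re

# Characters that must stay escaped so the markup remains well-formed.
MARKUP_CHARS = set("<>&\"'")

# Matches decimal (&#1058;) and hexadecimal (&#x422;) character references.
NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));")


def expand_references(text: str, encoding: str = "windows-1251") -> str:
    """Hypothetical helper: replace numeric character references with
    literal characters wherever `encoding` can represent them."""

    def replace(match: re.Match) -> str:
        hex_digits, dec_digits = match.groups()
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        try:
            char = chr(code_point)
            char.encode(encoding)  # raises if the encoding lacks this character
        except (ValueError, OverflowError):
            return match.group(0)  # not representable: keep the reference
        if char in MARKUP_CHARS or code_point < 0x20:
            return match.group(0)  # keep markup and control characters escaped
        return char

    return NUMERIC_REF.sub(replace, text)


if __name__ == "__main__":
    sample = "<title>&#1058;&#1080;&#1087;&#1072; &#x4E2D; &#38;</title>"
    print(expand_references(sample))
    # prints: <title>Типа &#x4E2D; &#38;</title>
```

To apply this to a real file, one would read the transformed output using the encoding declared in xsl:output, pass the text through the function, and write it back with the same encoding. The references themselves are plain ASCII, so decoding the file as windows-1251 does not disturb them.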
