On 11/01/06, Lachlan Hunt <[EMAIL PROTECTED]> wrote:
> liorean wrote:
> > Character references refer to Unicode code points independent of the
> > document encoding and character set. At least for HTML4 and XML, if
> > not for HTML3.2.
>
> As far as character references in HTML are concerned, they have always
> referred to the Unicode code points since HTML 2.0.
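
To make the quoted point concrete, here is a minimal Python sketch. It assumes the standard library's html.unescape is an acceptable stand-in for a browser's character reference handling; the helper name resolve and the sample markup are made up for illustration. The same markup is stored under two different document encodings, and the references resolve to the same code points in both. That includes U+20AC, which ISO-8859-1 cannot encode directly at all.

```python
import html

# Hypothetical sample markup: it contains only ASCII bytes, with the
# non-ASCII characters written as numeric character references.
MARKUP = "Price: &#8364;5, caf&#233;"


def resolve(markup: str, encoding: str) -> list[str]:
    """Store the markup in the given encoding, read it back, resolve the
    character references, and list the non-ASCII code points found."""
    raw = markup.encode(encoding)      # bytes as they would sit in the file
    decoded = raw.decode(encoding)     # what a parser sees after decoding
    text = html.unescape(decoded)      # references become code points
    return [f"U+{ord(ch):04X}" for ch in text if ord(ch) > 127]


latin1_points = resolve(MARKUP, "iso-8859-1")
utf8_points = resolve(MARKUP, "utf-8")

print(latin1_points)
print(utf8_points)
```

Both calls report U+20AC and U+00E9, so the number in the reference is a position in the document character set, not a byte value in whatever encoding the file happens to use.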
Ah. I just saw

    BASESET "ISO 646:1983//CHARSET International Reference Version (IRV)//ESC 2/5 4/0"
    BASESET "ISO Registration Number 100//CHARSET ECMA-94 Right Part of Latin Alphabet Nr. 1//ESC 2/13 4/1"

in the HTML3.2 SGML declaration and

    BASESET "ISO Registration Number 177//CHARSET ISO/IEC 10646-1:1993 UCS-4 with implementation level 3//ESC 2/5 2/15 4/6"

in the HTML4.01 SGML declaration, and assumed that the first one (ISO-646) was ANSI, the second one (ECMA-94) was the extended 8-bit characters (Latin-1), and the third one (ISO-10646) was Unicode. Was this assumption wrong?

> See my article:
> http://lachy.id.au/log/2005/10/char-refs
> (take note of the comments too, which contain a few corrections)

I read it months ago :)
--
David "liorean" Andersson
<uri:http://liorean.web-graphics.com/>