Andrew Cunningham wrote:
Lachlan Hunt writes:
Andrew Cunningham wrote:
In theory the document should only be in one of the Unicode encodings, so without a BOM, the browser should try to render it as UTF-8.

No, because when it's served as text/html, HTML rules apply, not XML rules. So without the encoding declared in the HTTP headers or the meta element, the default of ISO-8859-1 should be used (if served over HTTP, technically US-ASCII otherwise). However, browsers will actually interpret ISO-8859-1 as the Windows-1252 superset and will also attempt to use unspecified heuristics to guess the encoding, before falling back to the default.
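A rough sketch of that lookup order in Python, for illustration only. The function and parameter names are made up here, and the heuristic guess is passed in as a plain argument because browsers do not specify how they arrive at it:

```python
from typing import Optional

# Default for text/html over HTTP, as described above.
DEFAULT_HTML_ENCODING = "iso-8859-1"


def pick_html_encoding(http_charset: Optional[str] = None,
                       meta_charset: Optional[str] = None,
                       guessed: Optional[str] = None) -> str:
    """Return the encoding a browser would likely use for text/html.

    Hypothetical helper: precedence is HTTP header, then meta element,
    then whatever the browser's heuristics guessed, then the default.
    """
    for candidate in (http_charset, meta_charset, guessed):
        name = (candidate or "").strip().lower()
        if name:
            break
    else:
        name = DEFAULT_HTML_ENCODING

    # Browsers treat ISO-8859-1 as its Windows-1252 superset.
    if name in ("iso-8859-1", "latin-1", "latin1"):
        return "windows-1252"
    return name
```

With nothing declared and no guess, pick_html_encoding() falls through to the default and comes back as windows-1252.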

If you're going by the HTTP specs.

Yes, of course, as well as the relevant RFCs for the MIME types.

If you go by the XHTML 1.0 recommendation, appendix C

Appendix C is non-normative.

would indicate that "... that when the XML declaration is not included in a document, the document can only use the default character encodings UTF-8 or UTF-16."

That is only true for XML on the condition that the encoding has not been specified by a higher level protocol. The relevant *normative* section of the XML rec. states in 4.3.3 Character Encoding in Entities:

In the absence of information provided by an external transport protocol (e.g. HTTP or MIME), it is a fatal error for an entity including an encoding declaration to be presented to the XML processor in an encoding other than that named in the declaration, or for an entity which begins with neither a Byte Order Mark nor an encoding declaration to use an encoding other than UTF-8. Note that since ASCII is a subset of UTF-8, ordinary ASCII entities do not strictly need an encoding declaration.

http://www.w3.org/TR/REC-xml/#charencoding
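To make the rule in 4.3.3 concrete, here is a simplified Python sketch. The function name and the regular expression are my own, and it assumes the entity is either UTF-16 with a BOM or in an ASCII-compatible encoding. It does not attempt the fuller autodetection from the XML rec's non-normative appendix, and it does not tell UTF-32 apart from UTF-16:

```python
import codecs
import re
from typing import Optional

# Matches an encoding declaration at the very start of an ASCII-compatible entity.
_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def xml_entity_encoding(data: bytes,
                        transport_charset: Optional[str] = None) -> str:
    """Hypothetical helper: which encoding an XML processor should assume."""
    # 1. Information from the transport protocol (e.g. HTTP) comes first.
    if transport_charset:
        return transport_charset.lower()
    # 2. A UTF-16 Byte Order Mark (a UTF-32 LE BOM would also match here).
    if data.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)):
        return "utf-16"
    # A UTF-8 BOM is skipped so the declaration can still be read.
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    # 3. The encoding declaration, if there is one.
    match = _DECL.match(data)
    if match:
        return match.group(1).decode("ascii").lower()
    # 4. Neither BOM nor declaration: anything but UTF-8 is a fatal error.
    return "utf-8"
```

So an entity with no HTTP charset, no BOM and no declaration is treated as UTF-8, while the same bytes served with a charset parameter take that charset instead.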

The point I wanted to make is that there is another way to declare the encoding for documents in UTF-16 or UTF-32, and that's the BOM; and that the test should also include BOM detection as an option,

Not according to the HTML 4 Rec, but...

i.e. do various web browsers use the BOM as part of their heuristics?

In reality, yes.
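For what it's worth, BOM sniffing itself is simple. A minimal Python sketch follows; the function name is hypothetical, and this is not taken from any particular browser's source:

```python
import codecs
from typing import Optional

# UTF-32 is checked first: its little-endian BOM (FF FE 00 00) begins
# with the UTF-16 little-endian BOM (FF FE).
_BOMS = (
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def sniff_bom(data: bytes) -> Optional[str]:
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None
```

A test page for this would just need the same markup saved with and without each BOM, and no charset in the HTTP headers or meta element.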

--
Lachlan Hunt
http://lachy.id.au/
******************************************************
The discussion list for  http://webstandardsgroup.org/

See http://webstandardsgroup.org/mail/guidelines.cfm
for some hints on posting to the list & getting help
******************************************************
