Andrew Cunningham wrote:
I was wondering whether you should have another test in there: an XHTML document with no encoding declared in the HTTP header or in a meta tag, and no XML declaration, sent as html/text.

That's text/html, and an XHTML document served as text/html is HTML, regardless of any lies the DOCTYPE tells you.

In theory the document should only be in one of the Unicode encodings, so without a BOM, the browser should try to render it as UTF-8.

No, because when it's served as text/html, HTML rules apply, not XML rules. So without an encoding declared in the HTTP headers or in a meta element, the default of ISO-8859-1 should be used (if served over HTTP; technically US-ASCII otherwise). In practice, however, browsers treat ISO-8859-1 as its Windows-1252 superset, and they also apply unspecified heuristics to guess the encoding before falling back to the default.
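
To make that order of precedence concrete, here is a rough Python sketch of how a browser-like client might pick an encoding for a text/html response. The function names, the 1024-byte scan window, and the "does it decode cleanly as UTF-8" check are all my own assumptions for illustration; real browser heuristics are unspecified and differ between implementations. Note that the XML declaration is never consulted, because the document is being handled as HTML.

```python
import re

def normalize(label):
    # Browsers treat ISO-8859-1 as its Windows-1252 superset.
    label = label.strip().lower()
    return "windows-1252" if label == "iso-8859-1" else label

def sniff_html_encoding(content_type, body):
    """Return (encoding, source) for a text/html response body (bytes).

    Hypothetical helper: a sketch of the precedence described above,
    not any browser's actual algorithm.
    """
    # 1. charset parameter in the HTTP Content-Type header.
    if content_type:
        m = re.search(r'charset\s*=\s*["\']?([\w.:-]+)', content_type, re.I)
        if m:
            return normalize(m.group(1)), "http"

    # 2. meta element near the start of the document
    #    (the 1024-byte window is an assumption).
    m = re.search(rb'<meta[^>]+charset\s*=\s*["\']?([\w.:-]+)', body[:1024], re.I)
    if m:
        return normalize(m.group(1).decode("ascii")), "meta"

    # 3. Stand-in heuristic (assumption): if there are non-ASCII bytes
    #    and the whole body is valid UTF-8, guess UTF-8.
    if any(b > 0x7F for b in body):
        try:
            body.decode("utf-8")
            return "utf-8", "heuristic"
        except UnicodeDecodeError:
            pass

    # 4. Fall back to the default, ISO-8859-1 read as Windows-1252.
    return "windows-1252", "default"
```

With this sketch, the test case proposed above (XHTML sent as text/html with no header charset, no meta, and no XML declaration) reaches UTF-8 only if the guessing step happens to succeed; otherwise it ends up as Windows-1252, and an encoding named only in an XML declaration is ignored entirely.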

Lachlan Hunt
