On 10/22/11 6:09 AM, Daniel Glazman wrote:
text/plain; charset=iso-8859-1

This is wrong. Nothing in the MIME or the HTTP specs says such a
whitespace is mandatory. Whitespace is explicitly forbidden between
type and subtype, between parameter-name and parameter-value, but that's
all. AFAIC, |text/plain;charset=iso-8859-1| is perfectly valid and
|text/plain ; charset=iso-8859-1| is perfectly valid too.

We do not want to sniff text/plain more than strictly necessary.

Sorry, I don't understand that answer. What do you mean exactly?

Normally, when a browser receives a header of the form "text/plain ..." where ... is anything, it should treat the page as text/plain.

However, there is a known bug in old Apache installations where Apache defaulted to sending a type of "text/plain" or "text/plain; charset=iso-8859-1" or "text/plain; charset=ISO-8859-1" or "text/plain; charset=UTF8" (depending on the installation) any time it didn't know what type of data the file was.

Therefore, it is fairly common for random binary files to be served with those 4 exact header values. Thus, if those _exact_ strings are encountered the UA needs to sniff to make sure it's not actually binary.

If I read the document correctly, UAs are going to fall back to complex
type detection, with its performance and time cost, just because the
content-type detection did not honour the potential presence of whitespace?
Really?

You read it wrong. If the whitespace doesn't match the exact values in the table, the UA will just treat the page as text/plain. It's only when the header value is exactly one of the 4 in the table that the UA will go into http://mimesniff.spec.whatwg.org/#text-or-binary
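
As a rough illustration of that rule, here is a minimal Python sketch. It is not taken from the spec or from any browser's code: the names APACHE_DEFAULT_TYPES and needs_text_or_binary_check are made up for this example, and the four strings are copied as written earlier in this message, so check them against the table in the spec before relying on them.

```python
# Hypothetical sketch: the four header values quoted earlier in this message,
# copied as written there (verify against the table in the spec).
APACHE_DEFAULT_TYPES = frozenset({
    "text/plain",
    "text/plain; charset=iso-8859-1",
    "text/plain; charset=ISO-8859-1",
    "text/plain; charset=UTF8",
})


def needs_text_or_binary_check(content_type_header: str) -> bool:
    """Return True only when the header is exactly one of the four values.

    The comparison is deliberately an exact string match: no trimming,
    no case folding, no whitespace normalization.
    """
    return content_type_header in APACHE_DEFAULT_TYPES


if __name__ == "__main__":
    for header in (
        "text/plain; charset=iso-8859-1",    # exact match: run the check
        "text/plain;charset=iso-8859-1",     # no space: plain text/plain
        "text/plain ; charset=iso-8859-1",   # extra space: plain text/plain
    ):
        print(repr(header), needs_text_or_binary_check(header))
```

Both whitespace variants from the message above fall outside the set, so they would be handled as ordinary text/plain with no extra sniffing.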

-Boris
