On Tue, 13 May 2014, Michal Zalewski wrote:
>
> We probably can't support a well-defined algorithm for detecting
> documents that have distinctive signatures while safely supporting
> formats that don't have them (because there is always a possibility that
> the non-structured format with user-controlled data could be used to
> forge a signature).

Right. You'd have to check the Content-Type header first.


On Tue, 13 May 2014, Michal Zalewski wrote:
>
> In general, in the past, in pretty much every single instance where
> browsers tried to second-guess Content-Type or Content-Disposition
> headers - be it through sketchy proprietary content-sniffing heuristics
> or through well-defined algorithms - this ended up creating tons of
> hard-to-fix security problems and introduced new burdens for web
> developers. It looks elegant, but it's almost always a huge liability.

I disagree. Much of the Web actually relies on this today, and for the
most part it works. For example, when you do:

   <img src="foo" ...>

...the Content-Type is ignored except for SVG.

> I think that most or all browsers are moving pretty firmly in the other
> direction, enforcing C-T checking even in situations that historically
> were off-limits (<script>, <style>, <object>, etc), based on strong
> evidence of previous mishaps; to the extent that the spec diverges from
> this, I suspect that it will be only a source of confusion and
> incompatibility.

Actually as far as I can tell we're converging on a hybrid model, more or
less the one specified here:

   http://mimesniff.spec.whatwg.org/

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'