Re: [whatwg] Encoding Sniffing

Alexey Proskuryakov Mon, 23 Apr 2012 10:59:14 -0700

On 21.04.2012, at 3:21, Anne van Kesteren wrote:

> 1) Is this something we want to define and eventually implement the same way?


I think that the general direction should be getting rid of encoding sniffing.
It is rarely helpful, if ever, and implementations differ wildly.

WebKit can optionally use ICU for charset detection. We also have custom
built-in heuristics that switch only between Japanese encodings (think of
rendering unlabeled EUC-JP pages when the default browser encoding is set to
Shift-JIS). Safari doesn't enable ICU-based detection, and users have not
visibly complained about that. I don't know whether the Japanese heuristics
are still important.
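
To illustrate the kind of Japanese-only switching described above, here is a
minimal sketch in Python. It is not WebKit's actual heuristic; the function
name `pick_japanese_encoding` and the "first strict decode wins" rule are
assumptions made purely for illustration. It tries the default encoding first
and moves to the other candidate only if the bytes are invalid in the default.

```python
# Hypothetical sketch, NOT WebKit's implementation: choose between Japanese
# encodings by checking which candidate decodes the bytes without errors.

def pick_japanese_encoding(data: bytes,
                           default: str = "shift_jis",
                           candidates: tuple = ("shift_jis", "euc_jp")) -> str:
    """Return the first encoding that strictly decodes `data`.

    The default is tried first, so content that is valid in the default
    encoding is never switched away from it.
    """
    ordered = [default] + [c for c in candidates if c != default]
    for name in ordered:
        try:
            data.decode(name, errors="strict")
            return name
        except UnicodeDecodeError:
            continue
    # Nothing decoded cleanly; keep the default rather than guess further.
    return default


# An unlabeled EUC-JP page viewed with a Shift-JIS default.
euc_page = "日本語".encode("euc_jp")
print(pick_japanese_encoding(euc_page, default="shift_jis"))
```

A strict-decode check like this is deliberately crude: short inputs can be
valid in both encodings, in which case the default simply wins.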

> 2) Does this need to apply outside HTML? For JavaScript it is forbidden per the
> HTML standard at the moment. CSS and XML do not allow it either. Is it used
> for decoding text/plain at the moment?
> 3) Is there a limit to how many bytes we should look at?

Related to the last question, WebKit doesn't implement re-navigation (neither 
for charset sniffing, nor for <meta charset>), and I don't think that we ever 
should.

- WBR, Alexey Proskuryakov
