On 12/5/11 12:42 PM, Leif Halvard Silli wrote:
Last I checked, some of those locales defaulted to UTF-8. (And HTML5
defines it the same.) So how is that possible?

Because the authors of the pages that users of those locales tend to visit use UTF-8 more than anything else?

Don't users of those locales travel as much as you do?

People on average travel less than David does, yes.  In all locales.

But that's not the point. I think you completely misunderstood his comments about travel and locales. Keep reading.

What kind of trouble are you actually describing here? You are
describing a problem with using UTF-8 for *your locale*.

No. He's describing a problem using UTF-8 to view pages that are not written in English.

Now what language are the non-English pages you look at written in? Well, it depends. In western Europe they tend to be in languages that can be encoded in ISO-8859-1, so authors sometimes use that encoding (without even realizing it). If you set your browser to default to UTF-8, those pages will be broken.

In Japan, a number of pages are authored in Shift_JIS. Those will similarly be broken in a browser defaulting to UTF-8.
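
A minimal sketch of that failure mode, assuming the browser's UTF-8 decoder substitutes U+FFFD for each invalid byte sequence (as Python's `errors="replace"` does). The sample strings and the `view_as_utf8` helper are hypothetical and only there for illustration:

```python
# Hypothetical page contents; these strings are illustrative, not from any real site.
latin1_page = "Musée d'Orsay".encode("iso-8859-1")
sjis_page = "こんにちは".encode("shift_jis")


def view_as_utf8(raw: bytes) -> str:
    """Decode bytes as a UTF-8-defaulting client would, replacing invalid bytes with U+FFFD."""
    return raw.decode("utf-8", errors="replace")


broken_latin1 = view_as_utf8(latin1_page)
broken_sjis = view_as_utf8(sjis_page)

# ascii() keeps the output readable regardless of the terminal's own encoding.
print(ascii(broken_latin1))  # the byte for the accented letter becomes U+FFFD
print(ascii(broken_sjis))    # mostly replacement characters and stray code points

# Decoding with the encoding the author actually used recovers the text.
print(ascii(latin1_page.decode("iso-8859-1")))
print(ascii(sjis_page.decode("shift_jis")))
```

The bytes are the same in both cases; only the decoder the reader's browser picks differs.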

What is your locale?

Why does it matter? David's default locale is almost certainly en-US, which defaults to ISO-8859-1 (or whatever Windows-??? encoding that actually means on the web) in his browser. But again, he's changed the default encoding from the locale default, so the locale is irrelevant.

(Quite often it sounds as
if some see Latin-1 - or Windows-1252 as we now should say - as a
'super default' rather than a locale default. If that is the case, that
it is a super default, then we should also spec it like that! Until
further notice, I'll treat Latin-1 as it is specced: As a default for certain
locales.)

That's exactly what it is.

Since it is a locale problem, we need to understand which locale you
have - and/or which locale you - and other debaters - think they have.

Again, doesn't matter if you change your settings from the default.

However, you also say that your problem is not so much related to pages
written for *your* locale as it is related to pages written for users
of *other* locales. So how many times per year do Dutch, Spanish or
Norwegian pages - and other non-English pages - create trouble for
you, as an English locale user? I am making an assumption: Almost never.
You don't read those languages, do you?

Did you miss the "travel" part? Want to look up web pages for museums, airports, etc in a non-English speaking country? There's a good chance they're not in English!

This is also an expectation thing: If you visit a Russian page in a
legacy Cyrillic encoding, and get mojibake because your browser
defaults to Latin-1, then what does it matter to you whether your
browser defaults to Latin-1 or UTF-8? Answer: Nothing.

Yes.  So?

I think we should 'attack' the dominating locale first: The English
locale, in its different incarnations (Australian, American, UK). Thus,
we should turn things on the head: English users should start to expect
UTF-8 to be used. Because, as English users, you are more used to
'mojibake' than the rest of us are: Whenever you see it, you 'know'
that it is because it is a foreign language you are reading.

Modulo smart quotes (and, recently, Unicode ellipsis characters). These are actually pretty common in English text on the web nowadays, and have a tendency to be in "ISO-8859-1".

Or, please, explain to us when and where
English language users, living in their own native
lands so to speak, need their browser to default to Latin-1 so that
they can correctly read English language pages?

See above.

See? We would have a plan. Or what do you think?

Try it in your browser. When I set UTF-8 as my default, there were broken quotation marks all over the web for me. And I'm talking about pages in English.
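
For anyone who would rather see it outside a browser, here is a rough sketch. It assumes the page bytes are Windows-1252 (which is what browsers treat an "ISO-8859-1" label as) and that the UTF-8 decoder replaces invalid bytes with U+FFFD; the sample text is made up:

```python
# Made-up English text using curly quotes and a Unicode ellipsis.
text = "\u201cHello\u2026\u201d"

# Assumption: the page was saved as Windows-1252 with no encoding declared.
raw = text.encode("cp1252")

# What a client defaulting to UTF-8 would show: each punctuation byte is invalid UTF-8.
as_utf8 = raw.decode("utf-8", errors="replace")

# What a client defaulting to Windows-1252 would show: the original text.
as_cp1252 = raw.decode("cp1252")

print(raw)             # the three punctuation marks are single bytes >= 0x80
print(ascii(as_utf8))  # ASCII letters survive, the quotes and ellipsis do not
print(as_cp1252 == text)
```

The letters come through fine, which is why the page still looks like English with broken punctuation.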

-Boris
