- From: Boris Zbarsky <bzbarsky@MIT.EDU>
- Date: Mon, 05 Dec 2011 16:49:45 -0500
On 12/5/11 12:42 PM, Leif Halvard Silli wrote:
> Last I checked, some of those locales defaulted to UTF-8. (And HTML5
> defines it the same.) So how is that possible?

Because authors authoring pages that users of those locales tend to use use UTF-8 more than anything else?

> Don't users of those locales travel as much as you do?

People on average travel less than David does, yes. In all locales.

But that's not the point. I think you completely misunderstood his comments about travel and locales. Keep reading.

> What kind of trouble are you actually describing here? You are
> describing a problem with using UTF-8 for *your locale*.

No. He's describing a problem using UTF-8 to view pages that are not written in English.

Now what language are the non-English pages you look at written in? Well, it depends. In western Europe they tend to be in languages that can be encoded in ISO-8859-1, so authors sometimes use that encoding (without even realizing it). If you set your browser to default to UTF-8, those pages will be broken.

In Japan, a number of pages are authored in Shift_JIS. Those will similarly be broken in a browser defaulting to UTF-8.

> What is your locale?

Why does it matter? David's default locale is almost certainly en-US, which defaults to ISO-8859-1 (or whatever Windows-??? encoding that actually means on the web) in his browser. But again, he's changed the default encoding from the locale default, so the locale is irrelevant.

> (Quite often it sounds as
> if some see Latin-1 - or Windows-1251 as we now should say - as a
> 'super default' rather than a locale default. If that is the case, that
> it is a super default, then we should also spec it like that! Until
> further, I'll treat Latin-1 as it is specced: As a default for certain
> locales.)

That's exactly what it is.

> Since it is a locale problem, we need to understand which locale you
> have - and/or which locale you - and other debaters - think they have.

Again, doesn't matter if you change your settings from the default.

> However, you also say that your problem is not so much related to pages
> written for *your* locale as it is related for pages written for users
> of *other* locales. So how many times per year do Dutch, Spanish or
> Norwegian - and other non-English pages - are creating troubles for
> you, as a English locale user? I am making an assumption: Almost never.
> You don't read those languages, do you?

Did you miss the "travel" part? Want to look up web pages for museums, airports, etc in a non-English speaking country? There's a good chance they're not in English!

> This is also an expectation thing: If you visit a Russian page in a
> legacy Cyrillic encoding, and gets mojibake because your browser
> defaults to Latin-1, then what does it matter to you whether your
> browser defaults to Latin-1 or UTF-8? Answer: Nothing.

Yes. So?

> I think we should 'attack' the dominating locale first: The English
> locale, in its different incarnations (Australian, American, UK). Thus,
> we should turn things on the head: English users should start to expect
> UTF-8 to be used. Because, as English users, you are more used to
> 'mojibake' than the rest of us are: Whenever you see it, you 'know'
> that it is because it is a foreign language you are reading.

Modulo smart quotes (and recently Unicode ellipsis characters). These are actually pretty common in English text on the web nowadays, and have a tendency to be in "ISO-8859-1".

> Or, please, explain to us when and where it
> is important that English language users living in their own, native
> lands so to speak, need that their browser default to Latin-1 so that
> they can correctly read English language pages?

See above.

> See? We would have a plan. Or what do you think?

Try it in your browser. When I set UTF-8 as my default, there were broken quotation marks all over the web for me. And I'm talking pages in English.

-Boris
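
For readers who want to see the smart-quote breakage concretely, here is a minimal Python sketch. It is not from the original message. It assumes an English page saved as windows-1252 (what "ISO-8859-1" usually means on the web) and served without a charset label; the variable names and sample string are purely illustrative, and Python's `errors="replace"` only approximates what a browser does with bytes it cannot decode.

```python
# Hypothetical page text: English with smart quotes and an ellipsis.
text = "\u201cquoted\u201d\u2026"

# An author's editor saving as windows-1252 writes one byte per character:
# 0x93 and 0x94 for the quotes, 0x85 for the ellipsis.
legacy_bytes = text.encode("cp1252")

# A UTF-8 default cannot interpret those bytes; each invalid byte becomes
# U+FFFD (the replacement character), so the punctuation is lost.
as_utf8 = legacy_bytes.decode("utf-8", errors="replace")

# A windows-1252 default recovers the original text.
as_cp1252 = legacy_bytes.decode("cp1252")

print(repr(as_utf8))
print(repr(as_cp1252))
```

The plain ASCII letters survive either way, which is why an otherwise readable English page ends up with only its quotation marks and ellipses broken.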
Received on Monday, 5 December 2011 13:49:45 UTC