
Re: Locale/default encoding table

From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
Date: Wed, 14 Oct 2009 06:03:03 +0200
Message-ID: <4AD54D77.30502@xn--mlform-iua.no>
To: Ian Hickson <ian@hixie.ch>
CC: Geoffrey Sneddon <gsneddon@opera.com>, HTML WG <public-html@w3.org>, "public-i18n-core@w3.org" <public-i18n-core@w3.org>
Leif Halvard Silli On 09-10-14 05.40:

> Ian Hickson On 09-10-14 05.04:


Explanation: the table below was constructed by comparing Henri's 
"Mozilla corpus" with your table, Ian. So, except for Welsh, which 
is mentioned in your table, the rest fall under what you label 
"all other locales". I honestly do not believe that everyone 
reading this algorithm will trust it when they find out that, 
e.g., Mongolian should default to Windows-1252. So even if the 
defaults remain as they are, they should be mentioned explicitly 
in the locale table (if the locale table should be there at all, 
which, I agree with Addison, is very questionable). Leif

> I cannot see that you have refuted my claim that you need to 
> specify those exact locales for which you think Win 1252 should be 
> the default.
> 
> Also: Above you talked about legacy surrogate locales that are 
> similar but not identical. By "similar" you of course at least 
> have in mind "same script". So, could you explain to me why 
> browsers must have the following defaults?
> 
> Ian's   -        -
> default - Locale - Script
> --------|--------|----------------
> win1252 - bn-BD  - Not Latin: Bengali (Bangladesh)
> win1252 - bn-IN  - Not Latin: Bengali (India)
> win1252 - el     - Not Latin: Greek
> win1252 - eo     - Win1252 doesn't fully cover Esperanto
> win1252 - mn     - Not Latin: 90% Cyrillic users
> win1252 - mr     - Not Latin: Deva script
> win1252 - or     - Not Latin: Orya script
> win1252 - ta     - Not Latin: Tamil script
> win1252 - ta-LK  - Not Latin: (Tamil script?)
> UTF-8   - cy     - Win1252 doesn't fully cover Welsh
> 
> Note that in this nice bunch of "Western demographics", last 
> but not least comes Welsh! With UTF-8. Situated as it is in the 
> midst of the United Kingdom, which spread the English alphabet 
> around the world more than any other. Also note that Esperanto 
> users default to win1252 ...
> 
> Why is it safer for Welsh to use UTF-8 as the default, but not for 
> those languages that don't use Latin at all? Also see Andrew's 
> letter [1].
> 
> As I have understood it, market share is the bible here. So what 
> could go wrong if one started to have what look like more 
> reasonable defaults for the encodings in that table?
> 
> Also, again: I took up Belarusian. Why does it have ISO-8859-5 as 
> default? Do you just trust whatever comes out of Mozilla?
> 
> Or perhaps we aren't supposed to take that table very seriously.
> 
> [1] http://www.w3.org/mid/4AD4D3F4.5010708@xn--mlform-iua.no
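
The coverage remarks in the quoted table can be checked mechanically. What follows is only a minimal sketch in Python, assuming that Python's `cp1252` codec is an adequate stand-in for Windows-1252. The helper name `fits_windows_1252` and the sample words are hypothetical, chosen for illustration, and are not taken from the thread or from any browser's code.

```python
def fits_windows_1252(text: str) -> bool:
    """Return True if every character in text can be encoded as Windows-1252."""
    try:
        # Assumption: Python's "cp1252" codec approximates Windows-1252.
        text.encode("cp1252")
        return True
    except UnicodeEncodeError:
        return False


# Illustrative sample words per locale (assumed for this sketch only).
SAMPLES = {
    "en": "hello",   # English, plain ASCII
    "cy": "dŵr",     # Welsh, uses w with circumflex
    "eo": "ĉu",      # Esperanto, uses c with circumflex
    "el": "αβγ",     # Greek script
    "mn": "Монгол",  # Mongolian in Cyrillic script
}

for locale, sample in SAMPLES.items():
    print(locale, fits_windows_1252(sample))
```

Of these samples, only the English one fits. This shows only that the characters are absent from Windows-1252; it says nothing about which legacy encodings real pages in those locales actually use.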
Received on Wednesday, 14 October 2009 04:03:40 GMT
