
RE: Locale/default encoding table

From: Ian Hickson <ian@hixie.ch>
Date: Thu, 15 Oct 2009 09:26:34 +0000 (UTC)
To: "public-i18n-core@w3.org" <public-i18n-core@w3.org>, HTML WG <public-html@w3.org>
Message-ID: <Pine.LNX.4.62.0910150913440.3716@hixie.dreamhostps.com>
On Wed, 14 Oct 2009, Phillips, Addison wrote:
> > > 
> > > The text the I18N WG proposed allows the current behavior, which is 
> > > all that is necessary on a normative level. It uses examples instead 
> > > of normative language. I'm completely mystified as to why Ian won't 
> > > discuss that text directly.
> >
> > If you mean the text proposed here:
> > 
> >    http://lists.w3.org/Archives/Public/public-html/2009Aug/1040.html
> > 
> > ...then I discussed it here:
> > 
> >    http://lists.w3.org/Archives/Public/public-html/2009Oct/0281.html
> 
> I had not seen your response yet and have not yet digested it fully.

That would explain why you were so completely mystified. :-)


> I take it you don't like our text? :-)

It's not so much that I dislike it as that I don't understand what 
problem it was solving, as discussed in the e-mail cited above.


> I *do* think that browsers should preserve their existing behavior from 
> the point of view that they should have a localizable default encoding 
> and also offer the user the ability to override that default.
> 
> What I'm objecting to is preserving the *particular* localized choices 
> that exist right now today by fiat and effectively "forever".

I don't think anyone is suggesting making the table normative. It's merely 
a suggestion of what is most likely to be compatible with today's content.


> The most common unlabeled encoding tends to follow the most common 
> labeled encodings for a given audience because that is how users' 
> browsers are set up.

I question that conclusion. I would like to see data to back this up. 
Personally I would expect there to have been a divergence between the 
encoding most commonly expected by unlabeled content and the most commonly 
used encoding in labeled content.


On Wed, 14 Oct 2009, Leif Halvard Silli wrote:
> > On Wed, 14 Oct 2009, Leif Halvard Silli wrote:
> > > So where does Windows 1252 as default for Bengali, Tamil etc fit in 
> > > here?
> > 
> > At a guess, pages in those languages are mostly correctly labeled or 
> > correctly autodetected, and so the fallback is unnecessary;
> 
> If "unnecessary", then why default to Windows 1252?

Maybe because that isn't the real reason. I gave two other guesses:

> > or the users use more pages from "Western European" languages (as you 
> > put it) than their own.
> >
> > Or, of course, the default Mozilla uses could be wrong.

...and all three of these are merely guesses. Maybe the reality is a 
fourth reason altogether.

The table in the spec is based on data. I don't intend to change it based 
on guesses. If you have data that contradicts the data that I used to 
generate the table, then please, file a bug to request that the table be 
changed, citing that data.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Thursday, 15 October 2009 09:15:24 GMT
