Re: [whatwg] Guessing the fallback encoding from the top-level domain name before trying to guess from the browser localization

From: Ian Hickson <ian@hixie.ch>
Date: Fri, 7 Feb 2014 22:37:34 +0000 (UTC)
To: Henri Sivonen <hsivonen@hsivonen.fi>
Message-ID: <alpine.DEB.2.00.1402072231370.30855@ps20323.dreamhostps.com>
Cc: WHATWG <whatwg@whatwg.org>

On Thu, 19 Dec 2013, Henri Sivonen wrote:
> 
> Considering that the encoding of the content browsed is not really a 
> function of the UI localization of the browser, though the two are often 
> correlated, I have developed a patch for Firefox to make the guess based 
> on the top-level domain name of the URL of the document when possible.
> 
> Before deciding whether to land that patch, I'd like to get feedback 
> from the broader Web standards community.
> 
> Does this seem like a good idea? Good idea if the mapping details are 
> tweaked? Bad idea? (Why?)

Seems like a reasonable idea to me. The correlation should be at least as 
high, as far as I can tell. But that's just a guess. Data would be good, 
for example instrumenting an existing locale-based browser to see how 
often the guess from the locale disagrees with the guess from the TLD, and 
checking how often the guess from the locale is wrong (via looking at 
people overriding the encoding manually). Or maybe a 50%/50% experiment, 
with the default coming from the UI locale for the first 50% of users and 
from the TLD for the second 50%, with the corresponding instrumentation in 
both groups, to see how the results compare.
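
As a rough sketch of the instrumentation described above, assuming each page load is logged as a (URL, UI locale, manual override or None) tuple: the Python below counts how often the two guesses disagree and how often each one differs from what the user picked by hand. The two lookup tables, the names, and the log format are all hypothetical and deliberately incomplete; they are not the mapping from the actual patch.

```python
from collections import Counter
from urllib.parse import urlsplit

# Illustrative, incomplete tables (assumptions, not the patch's real mapping).
TLD_FALLBACK = {
    "ru": "windows-1251",
    "jp": "Shift_JIS",
    "kr": "EUC-KR",
    "tr": "windows-1254",
    "tw": "Big5",
}
LOCALE_FALLBACK = {
    "ru": "windows-1251",
    "ja": "Shift_JIS",
    "ko": "EUC-KR",
    "tr": "windows-1254",
    "zh-TW": "Big5",
}
DEFAULT_ENCODING = "windows-1252"


def guess_from_locale(ui_locale):
    """Fallback encoding guessed from the browser UI locale."""
    return LOCALE_FALLBACK.get(ui_locale, DEFAULT_ENCODING)


def guess_from_tld(url):
    """Fallback encoding guessed from the TLD, or None if the TLD is unmapped."""
    host = urlsplit(url).hostname or ""
    tld = host.rsplit(".", 1)[-1]
    return TLD_FALLBACK.get(tld)


def tally(page_loads):
    """Count disagreements and wrong guesses over logged page loads.

    page_loads: iterable of (url, ui_locale, manual_override_or_None).
    A manual override is treated as a proxy for the correct encoding.
    """
    stats = Counter()
    for url, ui_locale, override in page_loads:
        locale_guess = guess_from_locale(ui_locale)
        # Use the TLD guess when possible, otherwise fall back to the locale.
        tld_guess = guess_from_tld(url) or locale_guess
        stats["loads"] += 1
        if tld_guess != locale_guess:
            stats["disagree"] += 1
        if override is not None:
            if override != locale_guess:
                stats["locale_wrong"] += 1
            if override != tld_guess:
                stats["tld_wrong"] += 1
    return stats


sample_log = [
    ("http://example.ru/", "en-US", "windows-1251"),
    ("http://example.com/", "en-US", None),
    ("http://example.jp/page", "ja", None),
]
stats = tally(sample_log)
print(dict(stats))
```

In a real browser only one guess is applied per page load, so an override shows that the applied guess was wrong and only suggests what the other guess would have done; that is part of why the 50%/50% split is worth considering.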

Have you tried deploying this? What have you learnt so far?

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Friday, 7 February 2014 22:38:37 UTC
