Re: Guessing the fallback encoding from the top-level domain name before trying to guess from the browser localization from Leif Halvard Silli on 2013-12-23 (www-international@w3.org from October to December 2013)

From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
Date: Mon, 23 Dec 2013 01:00:53 +0100
To: Henri Sivonen <hsivonen@hsivonen.fi>
Cc: www-international@w3.org
Message-ID: <20131223010053274802.601025fd@xn--mlform-iua.no>

Henri Sivonen, Thu, 19 Dec 2013 16:29:37 +0200:
> 
> The list of TLDs that participate in the guessing and are not
> windows-1252-affiliated is currently:
> 
> https://bugzilla.mozilla.org/attachment.cgi?id=8341644&action=diff#a/dom/encoding/domainsfallbacks.properties_sec2
> 
> UTF-8 is never guessed, since it is not a legacy encoding.

But not all domains are “legacy domains” either. Consider, from the 
above list, lines 139 and 140:

 139 ru=windows-1251
 140 xn--p1ai=windows-1251

where xn--p1ai refers to the RF-domain - .рф. Is there really no 
correlation between UTF-8 based domain names and use of the UTF-8 
encoding ... ?
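
To make the two entries concrete, here is a minimal Python sketch of a 
TLD-keyed fallback lookup. It is not the Gecko code: the table holds only 
the two lines quoted above, the names FALLBACKS, tld_of, fallback_for and 
display_tld are made up for illustration, and returning None for an 
unlisted TLD (meaning "guess some other way") is my assumption. It also 
shows that xn--p1ai is simply the ASCII (Punycode) form of .рф.

```python
# Illustrative excerpt only: the two entries quoted in this message.
FALLBACKS = {
    "ru": "windows-1251",
    "xn--p1ai": "windows-1251",
}


def tld_of(host):
    """Return the last label of a host name, lowercased."""
    return host.rstrip(".").rsplit(".", 1)[-1].lower()


def fallback_for(host):
    """Look up the fallback encoding for a host's TLD.

    Returns None when the TLD is not in the table (assumption: the
    caller would then fall back to some other guess).
    """
    tld = tld_of(host)
    if not tld.isascii():
        # Convert a Unicode label such as "рф" to its ASCII form.
        tld = tld.encode("idna").decode("ascii")
    return FALLBACKS.get(tld)


def display_tld(ascii_tld):
    """Decode an ASCII (xn--) label back to its Unicode form."""
    return ascii_tld.encode("ascii").decode("idna")


if __name__ == "__main__":
    print(display_tld("xn--p1ai"))
    print(fallback_for("example.ru"))
    print(fallback_for("пример.рф"))
```

With a table like this, a site under .рф gets the same windows-1251 guess 
as one under .ru, which is the behaviour I am questioning.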
-- 
leif halvard silli

Received on Monday, 23 December 2013 00:01:22 UTC