Re: Guessing the fallback encoding from the top-level domain name before trying to guess from the browser localization

On 2013/12/23 9:00, Leif Halvard Silli wrote:
> Henri Sivonen, Thu, 19 Dec 2013 16:29:37 +0200:
>>
>> The list of TLDs that participate in the guessing and are not
>> windows-1252-affiliated is currently:
>>
> https://bugzilla.mozilla.org/attachment.cgi?id=8341644&action=diff#a/dom/encoding/domainsfallbacks.properties_sec2

> But not all domains are “legacy domains” either. Consider, from the
> above list, line 139 and 140:
>
>  139 ru=windows-1251
>  140 xn--p1ai=windows-1251
>
> where xn--p1ai refers to the RF-domain - .рф. Is there really no
> correlation between UTF-8 based domain names and use of the UTF-8
> encoding ... ?

I don't think non-ASCII domain names should be called UTF-8 based domain 
names, but the general thought that these rather new domains might 
contain considerably less legacy content than the two-letter ASCII 
country domains seems quite attractive.
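
To illustrate why I wouldn't call these "UTF-8 based": on the wire, .рф is the ASCII label xn--p1ai, so a TLD lookup works on plain ASCII either way. A minimal sketch in Python, assuming a table that holds only the two entries quoted above; the function names and the `default` parameter are made up for illustration and are not taken from the patch:

```python
# Hypothetical excerpt of a TLD -> fallback-encoding table. Only the two
# entries quoted in this thread are included; the real list is longer.
FALLBACKS = {
    "ru": "windows-1251",
    "xn--p1ai": "windows-1251",  # ASCII (Punycode) form of .рф
}


def tld_of(hostname):
    """Return the lower-cased ASCII form of the last label of hostname."""
    # Python's built-in "idna" codec converts non-ASCII labels to their
    # ASCII-compatible "xn--" form and leaves ASCII labels unchanged.
    ascii_host = hostname.encode("idna").decode("ascii")
    return ascii_host.rstrip(".").rsplit(".", 1)[-1].lower()


def fallback_encoding(hostname, default=None):
    """Look up a fallback encoding by TLD; return default if unlisted."""
    # What happens for unlisted TLDs (e.g. guessing from the browser
    # localization) is outside this sketch, hence the plain default.
    return FALLBACKS.get(tld_of(hostname), default)


if __name__ == "__main__":
    for host in ("example.ru", "пример.рф", "example.com"):
        print(host, tld_of(host), fallback_encoding(host))
```

Whether xn--p1ai should stay mapped to windows-1251 or be dropped from such a table is exactly the open question; the sketch only shows that the lookup itself is indifferent to the script of the domain name.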

Overall, I agree with the others who have asked what the expected
"ROI" of this is. With UTF-8 becoming more and more popular for Web sites,
the return on changing fallback encodings is definitely diminishing.

Regards,   Martin.

Received on Monday, 23 December 2013 09:17:48 UTC