- From: Asmus Freytag <asmusf@ix.netcom.com>
- Date: Fri, 27 Sep 2019 02:59:36 -0700
- To: www-international@w3.org
- Message-ID: <ea3ee823-8e49-a475-7b5f-8a7208a79c73@ix.netcom.com>
On 9/27/2019 2:03 AM, Martin J. Dürst wrote:
> Hello Albretch,
>
> I cannot imagine what the use for a list like the one you describe would
> actually be. Maybe you can clarify.
>
> I have seen attempts at creating such lists in the 1990s. But these
> days, the best advice, as you mention, is "just use UTF-8".
>
> Regards,   Martin.

+1

Was about to write the same. Martin saved me the trouble,

A./

>
> On 2019/09/26 20:09, Albretch Mueller wrote:
>> I have found lists based on the ISO 639 such as those used by the US
>> Library of Congress, which contain the ISO 639-1, 2 and 3 codes for
>> the representation of names of languages, but I am not able to find a
>> languages to encodings associations list. For example, even if helpful
>> towards my goal, this one:
>>
>> https://docs.python.org/2/library/codecs.html
>>
>> Doesn't really give you a language-encodings association and
>> languages such as the second and fourth most spoken by # of native
>> speakers (Spanish and Hindi) are not listed.
>>
>> UTF-8 could be used to encode any language but that is not so with
>> all other encodings.
>>
>> Basically what I have in mind is some data looking like:
>>
>> ISO-639-3|ISO-639-2|ISO-639-1|Name of language|Name of language as
>> java " \uffff" unicode format|all encodings that can be used with that
>> language.
>>
>> Example, these would be the first 5 fields of three languages:
>>
>> |tur|tur|tr|Türkçe|\u0054\u00fc\u0072\u006b\u00e7\u0065|
>> |rus|rus|ru|Русский|\u0420\u0443\u0441\u0441\u043a\u0438\u0439|
>> |spa|spa|es|Español|\u0045\u0073\u0070\u0061\u00f1\u006f\u006c|
>>
>> and after those initial five fields, all possible specific encodings
>> used for the language
>>
>> I thought such a list should be easy to find out there.
>>
>> Any lists of documentations you would suggest?
>>
>> lbrtchx
>>
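
For readers who want to see what the row format in the quoted message would involve, here is a minimal Python sketch. It is only an illustration under stated assumptions: the helper names (`java_escape`, `encodings_for`, `format_row`) and the short candidate list of codecs are made up for this example and do not come from any published list. It also only tests whether the language *name* can be encoded, which is a weak proxy for the language itself.

```python
# Hypothetical helpers for illustration; not an existing library or data set.

# A small, arbitrary sample of Python codec names (an assumption, not a
# complete or authoritative list of encodings).
CANDIDATE_ENCODINGS = ["utf_8", "latin_1", "iso8859_9", "cp1251", "koi8_r"]


def java_escape(text):
    """Return text as Java-style \\uXXXX escapes, one per UTF-16 code unit."""
    data = text.encode("utf-16-be")
    units = (int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2))
    return "".join("\\u{:04x}".format(unit) for unit in units)


def encodings_for(text, candidates=CANDIDATE_ENCODINGS):
    """Return the candidate codecs that can encode every character of text."""
    usable = []
    for name in candidates:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue
        usable.append(name)
    return usable


def format_row(iso3, iso2, iso1, name):
    """Build one pipe-delimited row in the layout from the quoted message."""
    fields = [iso3, iso2, iso1, name, java_escape(name),
              ",".join(encodings_for(name))]
    return "|" + "|".join(fields) + "|"


ROWS = [
    ("tur", "tur", "tr", "Türkçe"),
    ("rus", "rus", "ru", "Русский"),
    ("spa", "spa", "es", "Español"),
]

if __name__ == "__main__":
    for row in ROWS:
        print(format_row(*row))
```

Note the limitation: `encodings_for("Türkçe")` accepts `latin_1`, yet Turkish text in general needs letters such as `ğ`, `İ`, `ı` and `ş`, which `latin_1` cannot encode and `iso8859_9` can. A usable language-to-encoding table would need a full character repertoire per language, which is one practical reason the advice to "just use UTF-8" holds up.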
Received on Friday, 27 September 2019 09:59:54 UTC