- From: Martin J. Dürst <duerst@it.aoyama.ac.jp>
- Date: Fri, 27 Sep 2019 09:03:38 +0000
- To: Albretch Mueller <lbrtchx@gmail.com>, "www-international@w3.org" <www-international@w3.org>
Hello Albretch,

I cannot imagine what the use for a list like the one you describe would actually be. Maybe you can clarify.

I have seen attempts at creating such lists in the 1990s. But these days, the best advice, as you mention, is "just use UTF-8".

Regards, Martin.

On 2019/09/26 20:09, Albretch Mueller wrote:
> I have found lists based on the ISO 639 such as those used by the US
> Library of Congress, which contain the ISO 639-1, 2 and 3 codes for
> the representation of names of languages, but I am not able to find a
> languages to encodings associations list. For example, even if helpful
> towards my goal, this one:
>
> https://docs.python.org/2/library/codecs.html
>
> Doesn't really give you a language-encodings association and
> languages such as the second and fourth most spoken by # of native
> speakers (Spanish and Hindi) are not listed.
>
> UTF-8 could be used to encode any language but that is not so with
> all other encodings.
>
> Basically what I have in mind is some data looking like:
>
> ISO-639-3|ISO-639-2|ISO-639-1|Name of language|Name of language as
> java " \uffff" unicode format|all encodings that can be used with that
> language.
>
> Example, these would be the first 5 fields of three languages:
>
> |tur|tur|tr|Türkçe|\u0054\u00fc\u0072\u006b\u00e7\u0065|
> |rus|rus|ru|Русский|\u0420\u0443\u0441\u0441\u043a\u0438\u0439|
> |spa|spa|es|Español|\u0045\u0073\u0070\u0061\u00f1\u006f\u006c|
>
> and after those initial four fields, all possible specific encodings
> used for the language
>
> I thought such a list should be easy to find out there.
>
> Any lists of documentations you would suggest?
>
> lbrtchx
>
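As a rough illustration of the record format described in the quoted message, the sketch below builds one pipe-delimited line per language using Python's built-in codecs. The function names (`java_escape`, `codecs_that_encode`, `record`), the short `CANDIDATE_CODECS` list, and the optional `sample` argument are all assumptions made for this example; none of them come from the thread. The last field is derived by simply trying to encode some text, so it only says which codecs cover the characters in that text, not which encodings are actually used for the language.

```python
# Hypothetical candidate list; any codec names known to Python could be used here.
CANDIDATE_CODECS = ["ascii", "latin_1", "iso8859_9", "cp1251", "koi8_r", "utf_8"]


def java_escape(text):
    """Return text as Java-style \\uXXXX escapes, one per UTF-16 code unit."""
    data = text.encode("utf-16-be")
    return "".join(
        "\\u{:02x}{:02x}".format(data[i], data[i + 1])
        for i in range(0, len(data), 2)
    )


def codecs_that_encode(text, codecs=CANDIDATE_CODECS):
    """Return the codec names from `codecs` that can encode every character of text."""
    usable = []
    for name in codecs:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue
        usable.append(name)
    return usable


def record(iso3, iso2, iso1, name, sample=None):
    """Build one |-delimited record; `sample` is text assumed to cover the alphabet."""
    encodings = codecs_that_encode(sample if sample is not None else name)
    fields = [iso3, iso2, iso1, name, java_escape(name), ",".join(encodings)]
    return "|" + "|".join(fields) + "|"


if __name__ == "__main__":
    print(record("tur", "tur", "tr", "Türkçe", sample="Türkçe ğışİ"))
    print(record("rus", "rus", "ru", "Русский"))
    print(record("spa", "spa", "es", "Español"))
```

Testing only the language name can be misleading: "Türkçe" itself fits in Latin-1, but Turkish text containing ğ, ş or ı does not, which is why the sketch accepts a separate sample string.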
Received on Friday, 27 September 2019 09:04:04 UTC