Re: languages to encodings associations . . .

From: Martin J. Dürst <duerst@it.aoyama.ac.jp>
Date: Fri, 27 Sep 2019 09:03:38 +0000
To: Albretch Mueller <lbrtchx@gmail.com>, "www-international@w3.org" <www-international@w3.org>
Message-ID: <55b1f10c-3458-41e4-7a1b-6007b50043ac@it.aoyama.ac.jp>
Hello Albretch,

I cannot imagine what the use for a list like the one you describe would 
actually be. Maybe you can clarify.

I have seen attempts at creating such lists in the 1990s. But these 
days, the best advice, as you mention, is "just use UTF-8".

Regards,   Martin.

On 2019/09/26 20:09, Albretch Mueller wrote:
>   I have found lists  based on the ISO 639 such as those used by the US
> Library of Congress, which contain the ISO 639-1, 2 and 3 codes for
> the representation of names of languages, but I am not able to find a
> languages to encodings associations list. For example, even if helpful
> towards my goal, this one:
>   https://docs.python.org/2/library/codecs.html

>   Doesn't really give you a language-encodings association and
> languages such as the second and fourth most spoken by # of native
> speakers (Spanish and Hindi) are not listed.
>   UTF-8 could be used to encode any language but that is not so with
> all other encodings.
>   Basically what I have in mind is some data looking like:
>   ISO-639-3|ISO-639-2|ISO-639-1|Name of language|Name of language in
> Java "\uXXXX" Unicode escape format|all encodings that can be used
> with that language.
>   Example, these would be the first 5 fields of three languages:
> |tur|tur|tr|Türkçe|\u0054\u00fc\u0072\u006b\u00e7\u0065|
> |rus|rus|ru|Русский|\u0420\u0443\u0441\u0441\u043a\u0438\u0439|
> |spa|spa|es|Español|\u0045\u0073\u0070\u0061\u00f1\u006f\u006c|
>   and after those initial five fields, all possible specific encodings
> used for the language
>   I thought such a list should be easy to find out there.
>   Any lists or documentation you would suggest?
>   lbrtchx
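
The record layout described in the quoted message can be approximated
with a short script. What follows is only a sketch under stated
assumptions, not an established list or tool: the function names
java_escapes and encodings_for and the CANDIDATES selection are
hypothetical, and "can be used with that language" is approximated by
testing whether one sample string (here the language's own name) encodes
without error.

```python
# Candidate codecs to probe. This short list is an illustrative assumption,
# not a complete inventory of encodings used for any language.
CANDIDATES = ["utf-8", "iso-8859-1", "iso-8859-9", "koi8-r", "cp1251"]


def java_escapes(text: str) -> str:
    """Return text as Java-style \\uXXXX escapes, one per UTF-16 code unit."""
    data = text.encode("utf-16-be")
    units = (int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2))
    return "".join(f"\\u{unit:04x}" for unit in units)


def encodings_for(sample: str, candidates=CANDIDATES) -> list:
    """Return the candidate codecs that can encode the sample string."""
    usable = []
    for name in candidates:
        try:
            sample.encode(name)
        except UnicodeEncodeError:
            continue  # at least one character has no mapping in this codec
        usable.append(name)
    return usable


# The three example languages from the message: (ISO 639-3, 639-2, 639-1, name).
LANGUAGES = [
    ("tur", "tur", "tr", "Türkçe"),
    ("rus", "rus", "ru", "Русский"),
    ("spa", "spa", "es", "Español"),
]

for iso3, iso2, iso1, name in LANGUAGES:
    fields = [iso3, iso2, iso1, name, java_escapes(name), ",".join(encodings_for(name))]
    print("|" + "|".join(fields) + "|")
```

A single sample string is a weak proxy for a whole language. "Türkçe"
happens to fit in ISO-8859-1, yet Turkish text in general needs letters
such as ş, ğ, ı and İ that ISO-8859-1 lacks, so a real list would have to
test against each language's full character repertoire.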
Received on Friday, 27 September 2019 09:04:04 UTC