Re: languages to encodings associations . . .

On 9/27/2019 2:03 AM, Martin J. Dürst wrote:
> Hello Albretch,
>
> I cannot imagine what the use for a list like the one you describe would
> actually be. Maybe you can clarify.
>
> I have seen attempts at creating such lists in the 1990s. But these
> days, the best advice, as you mention, is "just use UTF-8".
>
> Regards,   Martin.

+1

Was about to write the same. Martin saved me the trouble.

A./

>
> On 2019/09/26 20:09, Albretch Mueller wrote:
>>    I have found lists based on ISO 639, such as those used by the US
>> Library of Congress, which contain the ISO 639-1, 2 and 3 codes for
>> the representation of names of languages, but I am not able to find a
>> list associating languages with encodings. For example, although helpful
>> towards my goal, this one:
>>
>>    https://docs.python.org/2/library/codecs.html
>>
>>    doesn't really give you a language-to-encoding association, and
>> languages such as the second and fourth most spoken by number of native
>> speakers (Spanish and Hindi) are not listed.
>>
>>    UTF-8 could be used to encode any language but that is not so with
>> all other encodings.
>>
>>    Basically what I have in mind is some data looking like:
>>
>>    ISO-639-3|ISO-639-2|ISO-639-1|Name of language|Name of language as
>> Java "\uXXXX" Unicode escapes|all encodings that can be used with that
>> language.
>>
>>    Example, these would be the first 5 fields of three languages:
>>
>> |tur|tur|tr|Türkçe|\u0054\u00fc\u0072\u006b\u00e7\u0065|
>> |rus|rus|ru|Русский|\u0420\u0443\u0441\u0441\u043a\u0438\u0439|
>> |spa|spa|es|Español|\u0045\u0073\u0070\u0061\u00f1\u006f\u006c|
>>
>>    and after those initial five fields, all possible specific encodings
>> used for the language
>>
>>    I thought such a list should be easy to find out there.
>>
>>    Any lists or documentation you would suggest?
>>
>>    lbrtchx
>>
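
For anyone who still wants to approximate the requested record, here is a minimal sketch in Python, using only the standard codecs machinery behind the page linked above. It is not an authoritative language-to-encoding table: the function names (java_escape, encodings_for) and the CANDIDATES list are hypothetical, chosen for illustration, and the result depends entirely on the sample text supplied for each language.

```python
# Hypothetical shortlist of codec names to probe; extend it as needed.
CANDIDATES = ["ascii", "latin-1", "iso8859-9", "cp1251", "koi8-r", "utf-8"]


def java_escape(text):
    """Return text as Java-style \\uXXXX escapes, one per UTF-16 code unit."""
    data = text.encode("utf-16-be")
    units = (int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2))
    return "".join("\\u%04x" % unit for unit in units)


def encodings_for(sample, candidates=CANDIDATES):
    """Return the candidate codecs that can encode every character of sample."""
    usable = []
    for name in candidates:
        try:
            sample.encode(name)
        except UnicodeEncodeError:
            continue  # at least one character is missing from this codec
        usable.append(name)
    return usable


def record(iso3, iso2, iso1, name, sample):
    """Build one pipe-delimited record in the layout proposed in the message."""
    fields = [iso3, iso2, iso1, name, java_escape(name), ",".join(encodings_for(sample))]
    return "|" + "|".join(fields) + "|"


if __name__ == "__main__":
    print(record("tur", "tur", "tr", "Türkçe", "Türkçe ığşİ"))
    print(record("rus", "rus", "ru", "Русский", "Русский"))
```

The sample text matters more than the language code: the word "Türkçe" alone fits in Latin-1, but Turkish text in general also needs letters such as ı, ğ and ş, which Latin-1 lacks. Only an encoding that covers all of Unicode, such as UTF-8, is safe regardless of the sample, which is the point of the advice above.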

Received on Friday, 27 September 2019 09:59:54 UTC