- From: Albretch Mueller <lbrtchx@gmail.com>
- Date: Thu, 26 Sep 2019 13:09:50 +0200
- To: www-international@w3.org
I have found lists based on ISO 639, such as those used by the US Library of Congress, which contain the ISO 639-1, 639-2 and 639-3 codes for the representation of names of languages, but I am not able to find a list associating languages with encodings.

For example, helpful as it is towards my goal, this one:

https://docs.python.org/2/library/codecs.html

doesn't really give you a language-encodings association, and languages such as the second and fourth most spoken by number of native speakers (Spanish and Hindi) are not listed. UTF-8 can be used to encode any language, but that is not so with all other encodings.

Basically, what I have in mind is some data looking like:

ISO-639-3|ISO-639-2|ISO-639-1|Name of language|Name of language in Java "\uffff" Unicode escape format|all encodings that can be used with that language

For example, these would be the first five fields for three languages:

|tur|tur|tr|Türkçe|\u0054\u00fc\u0072\u006b\u00e7\u0065|
|rus|rus|ru|Русский|\u0420\u0443\u0441\u0441\u043a\u0438\u0439|
|spa|spa|es|Español|\u0045\u0073\u0070\u0061\u00f1\u006f\u006c|

and, after those initial five fields, all possible specific encodings used for the language.

I thought such a list should be easy to find out there. Any lists or documentation you would suggest?

lbrtchx
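In the absence of such a list, the last two fields of the record described above could be approximated with a short Python sketch. It produces the Java-style escape of a name and tests which codecs from a candidate list can encode a sample of the language's characters. The candidate list and the helper names (java_escape, encodings_for) are hypothetical choices made for illustration, and the sample strings are assumptions rather than authoritative alphabets. Note that being able to encode a sample only shows an encoding is capable of it, not that it is conventionally used for that language.

```python
# Candidate codec names, taken from Python's standard codecs module.
# This selection is an illustrative assumption, not a complete list.
CANDIDATES = [
    "utf_8", "latin_1", "iso8859_5", "iso8859_9",
    "cp1251", "cp1252", "cp1254", "koi8_r",
]


def java_escape(text):
    """Return text as Java-style \\uXXXX escapes, one per UTF-16 code unit."""
    data = text.encode("utf-16-be")
    return "".join(
        "\\u%02x%02x" % (data[i], data[i + 1]) for i in range(0, len(data), 2)
    )


def encodings_for(sample, candidates=CANDIDATES):
    """Return the candidate codecs that can encode every character in sample."""
    usable = []
    for name in candidates:
        try:
            sample.encode(name)
        except (UnicodeEncodeError, LookupError):
            # Either a character is missing from the codec, or the codec
            # name is unknown to this Python installation.
            continue
        usable.append(name)
    return usable


if __name__ == "__main__":
    # The sample should cover the language's alphabet, not just its name:
    # "Türkçe" alone fits in latin_1, but Turkish also needs ğ, ş, ı and İ.
    print(java_escape("Español"))
    print(encodings_for("ğüşıöçİ"))   # Turkish-specific letters (assumed sample)
    print(encodings_for("Русский"))   # Cyrillic sample
```

Here java_escape("Español") yields \u0045\u0073\u0070\u0061\u00f1\u006f\u006c, matching the fifth field in the example records.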
Received on Thursday, 26 September 2019 11:10:14 UTC