- From: Gil Francopoulo <gil.francopoulo@wanadoo.fr>
- Date: Thu, 30 Jan 2014 09:12:04 +0100
- To: public-ontolex@w3.org
- Message-ID: <52EA0954.8010004@wanadoo.fr>
Dear Philip and Lars,

I agree with Lars.
I suggest taking a look at (and following) IETF BCP 47 in the examples, where:

* a language code is never in upper case but always in lower case,
* a country code is always in upper case and follows ISO 3166-1,
* this allows a bare code like eng (when no further detail is needed) but also permits more precise tags like eng-US or eng-UK.

I also suggest:

* following ISO 639-3, which gives access to a larger range of values than ISO 639-1,
* IMHO nobody follows ISO 639-2 nowadays (it was a sort of failed attempt),
* ISO 639-6 is not used.

Hoping that helps,
Gil

On 30/01/2014 08:44, Lars Borin wrote:
> Dear all,
>
>> Other than that I wanted to clarify one issue regarding language
>> codes in the example.
>>
>> I have seen that some people (John?) have started to use the ISO
>> 639-2 codes (e.g. "ENG" for English, "SPA" for Spanish etc.).
>> I would propose we stick to the two-letter ISO 639-1
>> codes (e.g. "EN", "ES") etc. There is no particular reason for
>> this other than the fact that most people know these codes.
>>
>> If the argument is recency and reusing the newest standard, then
>> we would have to go anyway for four letter codes according to ISO
>> 639-6.
>>
>> In the open multilingual wordnet we use the three letter codes
>> because there are people working on languages which do not have two
>> letter codes, such as Abui (abz), Minangkabau (min) or Cantonese
>> (yue). Note that some of these are large language communities,
>> Minangkabau has around 6 million speakers. I think this is a strong
>> argument for not going back to the two letter codes.
>
> I suspect that the three-letter codes in question are intended to be
> ISO 639-3 (and not 639-2), the use of which is pretty much best
> practice in linguistics today (even if there is quite a bit of
> discussion about how well it reflects linguistic descriptive practice
> and actual reality; see, e.g., <http://dlc.hypotheses.org/610>),
> because of coverage (not even all the languages of Europe are covered
> by 639-1, e.g. the two Sorbian languages) and because of granularity:
> The "language" level of ISO 639-3 (basically that of the Ethnologue)
> will not be included in 639-6, so there won't be a way of saying
> "English", since 639-3 already provides one, but you will be able to
> say (or, rather, propose codes for), e.g., "Elizabethan English",
> "Modern Australian English", etc.
>
> Best
> Lars
>
> --
> «Null hull,» sa Harry    | – Bögga? sagði Erlendur. Er það orð? |
> (Jo Nesbø: Kakerlakkene) | (Arnaldur Indriðason: Mýrin)         |
> --
> Se aikainen matohan nokitaan!
> (Reijo Mäki: Uhkapelimerkki)
> ----
> Lars Borin
> Språkbanken • Centre for Language Technology
> Institutionen för svenska språket
> Göteborgs universitet
> Box 200
> SE-405 30 Göteborg
> Sweden
>
> office +46 (0)31 786 4544
> mobile +46 (0)70 747 8386
>
> <http://språkbanken.gu.se/personal/lars/>
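
For anyone who wants to apply the case conventions from the top of this message to existing examples, here is a minimal sketch in Python. It assumes simple tags of the form language[-script][-region]. The helper name normalize_tag and the regular expression are illustrative only, not part of any library, and the function only fixes case: it does not check codes against ISO 639-3, ISO 3166-1 or the IANA subtag registry.

```python
import re

# Hypothetical helper (an assumption for illustration, not a library API).
# Matches language[-script][-region]; "_" is tolerated as a separator
# because it is a common deviation in hand-written data.
_TAG = re.compile(
    r"^(?P<lang>[A-Za-z]{2,3})"
    r"(?:[-_](?P<script>[A-Za-z]{4}))?"
    r"(?:[-_](?P<region>[A-Za-z]{2}|[0-9]{3}))?$"
)


def normalize_tag(tag: str) -> str:
    """Apply BCP 47 case conventions: language lower case,
    script title case, region upper case."""
    m = _TAG.match(tag.strip())
    if m is None:
        raise ValueError(f"unsupported language tag: {tag!r}")
    parts = [m.group("lang").lower()]
    if m.group("script"):
        parts.append(m.group("script").title())
    if m.group("region"):
        parts.append(m.group("region").upper())
    return "-".join(parts)


if __name__ == "__main__":
    for raw in ["ENG", "eng-us", "yue", "ZH_hant_tw"]:
        print(raw, "->", normalize_tag(raw))
```

Two caveats if strict BCP 47 conformance matters: BCP 47 uses the shortest ISO 639 code where one exists (en rather than eng, with three-letter codes such as yue or abz only where there is no two-letter code), and the ISO 3166-1 code for the United Kingdom is GB rather than UK. The sketch above does not rewrite either case.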
Received on Thursday, 30 January 2014 08:12:33 UTC