Re: telco this Friday

Dear Philip and Lars,

I agree with Lars.

I suggest that we look at (and follow) IETF BCP 47 in the examples. In short:

* a language code is written in lower case, never in upper case;
* a country code is written in upper case and follows ISO 3166-1;
* this allows a bare code such as eng (when no further detail is needed)
but permits more precise tags such as eng-US or eng-GB;
* we should follow ISO 639-3, which gives access to a larger range of values
than ISO 639-1;
* IMHO nobody follows ISO 639-2 nowadays (it was something of a false start);
* ISO 639-6 is not used.
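
To make the casing conventions above concrete, here is a minimal sketch in Python. The function name normalize_tag is only an illustration of mine, not part of any library, and it handles just the simple language[-Script][-REGION] shape; it is not a full BCP 47 parser and does not check codes against the ISO registries.

```python
def normalize_tag(tag: str) -> str:
    """Apply conventional casing to a simple language[-Script][-REGION] tag.

    Hypothetical helper for illustration only: it does not validate the
    subtags and ignores BCP 47 extensions and private-use sequences.
    """
    # Accept underscores as separators, since they are common in practice.
    parts = tag.replace("_", "-").split("-")
    # The language code is written in lower case.
    out = [parts[0].lower()]
    for sub in parts[1:]:
        if len(sub) == 2 and sub.isalpha():
            out.append(sub.upper())   # two-letter country code: upper case
        elif len(sub) == 4 and sub.isalpha():
            out.append(sub.title())   # four-letter script code: title case
        else:
            out.append(sub.lower())   # anything else: lower case
    return "-".join(out)


if __name__ == "__main__":
    for raw in ["ENG", "eng-us", "YUE_hant_hk"]:
        print(raw, "->", normalize_tag(raw))
```

Under these assumptions, "ENG" becomes eng and "eng-us" becomes eng-US, which is the form I would like to see in the examples.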

Hoping that helps,
Gil


On 30/01/2014 08:44, Lars Borin wrote:
> Dear all,
>
>>
>>
>>     Other than that I wanted to clarify one issue regarding language
>>     codes in the example.
>>
>>     I have seen that some people (John?) have started to use the ISO
>>     639-2 codes (e.g. "ENG" for English, "SPA" for Spanish etc.).
>>     I would propose we stick to the two-letter ISO 639-1
>>     codes (e.g. "EN", "ES") etc. There is no particular reason for
>>     this other than the fact that most people know these codes.
>>
>>     If the argument is recency and reusing the newest standard, then
>>     we would have to go anyway for four letter codes according to ISO
>>     639-6.
>>
>>
>> In the Open Multilingual Wordnet we use the three-letter codes 
>> because there are people working on languages which do not have two 
>> letter codes, such as Abui (abz),  Minangkabau (min) or Cantonese 
>> (yue).  Note that some of these are large language communities: 
>> Minangkabau has around 6 million speakers. I think this is a strong 
>> argument for not going back to the two letter codes.
>
> I suspect that the three-letter codes in question are intended to be 
> ISO 639-3 (and not 639-2), the use of which is pretty much best 
> practice in linguistics today (even if there is quite a bit of 
> discussion about how well it reflects linguistic descriptive practice 
> and actual reality; see, e.g., <http://dlc.hypotheses.org/610>), 
> because of coverage (not even all the languages of Europe are covered 
> by 639-1, e.g. the two Sorbian languages) and because of granularity: 
> The "language" level of ISO 639-3 (basically that of the Ethnologue) 
> will not be included in 639-6, so there won't be a way of saying 
> "English", since 639-3 already provides one, but you will be able to 
> say (or, rather, propose codes for), e.g., "Elizabethan English", 
> "Modern Australian English", etc.
>
> Best
> Lars
>
> -- 
> «Null hull,» sa Harry    | – Bögga? sagði Erlendur. Er það orð? |
> (Jo Nesbø: Kakerlakkene) | (Arnaldur Indriðason: Mýrin)         |
> --
> Se aikainen matohan nokitaan!
> (Reijo Mäki: Uhkapelimerkki)
> ----
> Lars Borin
> Språkbanken • Centre for Language Technology
> Institutionen för svenska språket
> Göteborgs universitet
> Box 200
> SE-405 30 Göteborg
> Sweden
>
> office +46 (0)31 786 4544
> mobile +46 (0)70 747 8386
>
> <http://språkbanken.gu.se/personal/lars/>

Received on Thursday, 30 January 2014 08:12:33 UTC