- From: Christian Chiarcos <chiarcos@informatik.uni-frankfurt.de>
- Date: Tue, 20 Dec 2016 18:18:34 +0100
- To: "semantic-web@w3.org Web" <semantic-web@w3.org>, "Mario Valle" <mvalle@cscs.ch>
- Cc: "christian.chiarcos@web.de" <christian.chiarcos@web.de>
Dear Mario,

> In Turtle syntax the @lang tag syntax refers to BCP47 that states:
>
> language = 2*3ALPHA ; shortest ISO 639 code
>
> That is, the language code (I ignore all the variants here) should be 2
> or 3 characters.

This means you should use the two-letter code for a language that has one (@en), even if it also has a three-letter code (@eng). Not every language has a two-letter code.

> Indeed ISO 639 (http://www.loc.gov/standards/iso639-2/php/code_list.php)
> lists both 2 and 3 chars codes (e.g., English: 'en' and 'eng').
>
> But in all Turtle examples I have found the language code has 2 chars.
> Is it a requirement or is simply a tradition? This means, could I write
> "Pancake"@eng?
>
> The question arises because WordNet contains 3 chars codes, so to
> transform into triples, should/shouldn't I convert it to 2 characters?

The reason WordNet uses 3-character codes is that the 2-character codes are insufficient from the perspective of multilingual NLP or linguistics, where ISO 639-3 is much more established (and somewhat better defined) than the ISO 639-1 2-letter codes. Therefore, people developing language resources (like WordNet) sometimes tend to neglect ISO 639-1 codes altogether. I also went that way at times. In terms of BCP47, however, this is a mistake and should be fixed.

As long as you work with modern-day major languages only and you don't see issues with the 2-letter codes for your task/resource, you should definitely follow BCP47 and use 2-letter codes wherever possible.

Best,
Christian

> Thanks for your patience
>
> mario

-- 
Prof. Dr. Christian Chiarcos
Applied Computational Linguistics
Johann Wolfgang Goethe Universität Frankfurt a. M.
60054 Frankfurt am Main, Germany

office: Robert-Mayer-Str. 10, #401b
mail: chiarcos@informatik.uni-frankfurt.de
web: http://acoli.cs.uni-frankfurt.de
tel: +49-(0)69-798-22463
fax: +49-(0)69-798-28931
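
For illustration, the conversion discussed above could look roughly like the following. This is only a minimal sketch: the mapping table is a small hand-written sample, and the names `ISO639_3_TO_1`, `shortest_lang_tag` and `turtle_literal` are hypothetical, not taken from WordNet or any library. A real conversion would need a complete ISO 639 table.

```python
# Hypothetical, hand-picked sample of ISO 639-3 -> ISO 639-1 mappings.
# A real conversion would load the full table from an ISO 639 code list.
ISO639_3_TO_1 = {
    "eng": "en",
    "deu": "de",
    "fra": "fr",
    "ita": "it",
}


def shortest_lang_tag(code: str) -> str:
    """Return the 2-letter code if the sample table has one, else the code unchanged."""
    code = code.strip().lower()
    return ISO639_3_TO_1.get(code, code)


def turtle_literal(text: str, code: str) -> str:
    """Build a language-tagged Turtle literal.

    Only backslashes and double quotes are escaped here; newlines and
    other special characters are not handled in this sketch.
    """
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{shortest_lang_tag(code)}'


if __name__ == "__main__":
    print(turtle_literal("Pancake", "eng"))
    print(turtle_literal("logos", "grc"))
```

Codes without a 2-letter equivalent (for example `grc`, Ancient Greek) are passed through unchanged, which matches the "shortest ISO 639 code" rule quoted from BCP47.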
Received on Tuesday, 20 December 2016 17:19:15 UTC