Re: telco this Friday

Hi Philipp, all,

a small rewrite suggestion is below. It covers three items:
1) Language subtags can contain codes from ISO 639-1, 2, 3 and 5; see 
http://tools.ietf.org/html/bcp47#section-2.2.1 and the list following
  "Three-character primary language subtags in the IANA registry were 
defined according to the assignments found in one of these additional 
ISO 639 parts or assignments subsequently made by the relevant ISO 639 
registration authorities or governing standardization bodies:"
2) There are more subtags than language and country; see 
http://tools.ietf.org/html/bcp47#section-2.1 : script, region, variant, 
extension, private use.
3) I added a link to 
http://www.w3.org/International/articles/language-tags/ which gives some 
guidance on how to work with language tags.
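
To make item 2 concrete, here is a minimal Python sketch of how a tag 
splits into those subtag types. The function name split_language_tag and 
the layout of the returned dictionary are my own assumptions, not part of 
BCP 47 or of any library. The sketch only checks the shape of each subtag: 
it does no lookup against the IANA registry, and it rejects extended 
language subtags, grandfathered tags and tags that are entirely private use.

```python
import re

# Subtag shapes, simplified from the syntax in BCP 47 (RFC 5646, section 2.1).
LANGUAGE = re.compile(r"[A-Za-z]{2,8}")
SCRIPT = re.compile(r"[A-Za-z]{4}")
REGION = re.compile(r"[A-Za-z]{2}|[0-9]{3}")
VARIANT = re.compile(r"[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}")
SINGLETON = re.compile(r"[0-9A-WY-Za-wy-z]")  # any single character except x
EXT_SUBTAG = re.compile(r"[A-Za-z0-9]{2,8}")
PRIVATE_SUBTAG = re.compile(r"[A-Za-z0-9]{1,8}")


def split_language_tag(tag):
    """Split a tag into typed subtags (hypothetical helper, shape check only)."""
    parts = tag.split("-")
    if not LANGUAGE.fullmatch(parts[0]):
        raise ValueError("unsupported primary language subtag: %r" % parts[0])
    result = {"language": parts[0], "script": None, "region": None,
              "variants": [], "extensions": [], "private_use": None}
    i = 1
    if i < len(parts) and SCRIPT.fullmatch(parts[i]):
        result["script"] = parts[i]
        i += 1
    if i < len(parts) and REGION.fullmatch(parts[i]):
        result["region"] = parts[i]
        i += 1
    while i < len(parts) and VARIANT.fullmatch(parts[i]):
        result["variants"].append(parts[i])
        i += 1
    # An extension is a singleton followed by one or more subtags.
    while i < len(parts) and SINGLETON.fullmatch(parts[i]):
        j = i + 1
        while j < len(parts) and EXT_SUBTAG.fullmatch(parts[j]):
            j += 1
        if j == i + 1:
            raise ValueError("extension %r has no subtags" % parts[i])
        result["extensions"].append("-".join(parts[i:j]))
        i = j
    # Private use: "x" followed by one or more subtags, up to the end of the tag.
    if i < len(parts) and parts[i].lower() == "x":
        rest = parts[i + 1:]
        if not rest or not all(PRIVATE_SUBTAG.fullmatch(p) for p in rest):
            raise ValueError("malformed private use sequence")
        result["private_use"] = "-".join(parts[i:])
        i = len(parts)
    if i != len(parts):
        raise ValueError("cannot classify subtag: %r" % parts[i])
    return result
```

For example, split_language_tag("zh-Hant-TW") reports "zh" as the language, 
"Hant" as the script and "TW" as the region.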

So here is the re-write suggestion.

When specifying the language of a literal, in this document we adhere to 
Best Current Practice 47 (http://www.rfc-editor.org/rfc/bcp/bcp47.txt). 
According to BCP 47, tags are made up of a language code (based on ISO 
639 parts 1, 2, 3 or 5, see 
http://www.iso.org/iso/home/standards/language_codes.htm) optionally 
followed by a hyphen and an ISO 3166-1 country code 
(http://www.iso.org/iso/iso-3166-1_decoding_table.html). Language tags 
may also contain further subtags expressing e.g. the region, script or 
further variants. For an overview of BCP 47 language tags, see 
http://www.w3.org/International/articles/language-tags/
We follow the convention of writing the language codes in lower case and 
the country codes in upper case.
However, this is not part of the specification of this document; users 
of the lexicon-ontology model can adopt any strategy to specify the 
language, though we strongly recommend following BCP 47.
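
As an illustration of the case convention, here is a small Python sketch. 
The helper name format_language_tag is hypothetical. It assumes the input 
is already a well-formed tag and applies the formatting recommendation of 
RFC 5646, section 2.1.1: all subtags in lower case, except that two-letter 
subtags become upper case and four-letter subtags become title case when 
they are neither at the start of the tag nor after a singleton. BCP 47 
itself treats tags as case-insensitive, so this only affects presentation.

```python
def format_language_tag(tag):
    """Apply the conventional casing to a well-formed tag (hypothetical helper)."""
    formatted = []
    after_singleton = False
    for index, subtag in enumerate(tag.split("-")):
        lowered = subtag.lower()
        if index == 0 or after_singleton:
            # Language code, and everything after a singleton: lower case.
            formatted.append(lowered)
        elif len(lowered) == 2:
            # Two-letter subtag, e.g. an ISO 3166-1 country code: upper case.
            formatted.append(lowered.upper())
        elif len(lowered) == 4:
            # Four-letter subtag, e.g. an ISO 15924 script code: title case.
            formatted.append(lowered.capitalize())
        else:
            formatted.append(lowered)
        if len(lowered) == 1:
            after_singleton = True
    return "-".join(formatted)


print(format_language_tag("EN-us"))       # en-US
print(format_language_tag("zh-hant-tw"))  # zh-Hant-TW
```

A tool that compares tags would still need to do so case-insensitively; 
the function only normalizes how they are written.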

Best,

Felix

On 06.02.14 20:30, Philipp Cimiano wrote:
> Dear all,
>
> thanks for all your input to the language coding issue.
>
> I have now written the following in the document:
>
> When specifying the language of a literal, in this document we adhere 
> to Best Common Practice 5646 
> (http://www.rfc-editor.org/rfc/bcp/bcp47.txt). According to BCP 5646, 
> tags are made up of a language code (a three letter ISO 639-3 code or 
> a two letter ISO 639-1 code if available, see 
> http://www.iso.org/iso/home/standards/language_codes.htm) followed by 
> a hyphen and an ISO 3166-1 country code 
> (http://www.iso.org/iso/iso-3166-1_decoding_table.html).
> We follow the convention of writing the language codes in lower case 
> and the country codes in upper case.
> However, this is not part of the specification of this document; users 
> of the lexicon-ontology model can adopt any strategy to specify the 
> language, though we strongly recommend to follow BCP 5646.
>
> I think this is in line with all your contributions.
>
> Let me know otherwise.
>
> Philipp.
>
> On 30.01.14 12:23, Felix Sasaki wrote:
>> On 30.01.14 12:09, John P. McCrae wrote:
>>>
>>>
>>>
>>> On Thu, Jan 30, 2014 at 7:47 AM, Philipp Cimiano 
>>> <cimiano@cit-ec.uni-bielefeld.de> wrote:
>>>
>>>     Dear all,
>>>
>>>      I am afraid I will not be able to attend the ontolex telco this
>>>     Friday. I will now work on the document, so please provide your
>>>     feedback by email.
>>>
>>>     I would kindly ask you all to work on the sections in the
>>>     document assigned to you ;-)
>>>
>>>     Other than that I wanted to clarify one issue regarding language
>>>     codes in the example.
>>>
>>>     I have seen that some people (John?) have started to use the ISO
>>>     639-2 codes (e.g. "ENG" for English, "SPA" for Spanish etc.).
>>>     I would propose we stick to the two-letter ISO 639-1
>>>     codes (e.g. "EN", "ES") etc. There is no particular reason for
>>>     this other than the fact that most people know these codes.
>>>
>>> Yes, that would be me. I use the ISO 639-3 codes as they represent 
>>> the most complete and usable list of codes. At any rate, this is not 
>>> part of our standardization efforts and applications must support 
>>> well-formatted codes using any ISO standard.
>>>
>>>
>>>     If the argument is recency and reusing the newest standard, then
>>>     we would have to go anyway for four letter codes according to
>>>     ISO 639-6.
>>>
>>> Erm 639-6 has a different purpose... it is not really appropriate 
>>> here (and is equal to 639-3 for standard languages anyway)
>>>
>>>
>>>     Regarding the particular versions of a language spoken in a
>>>     particular country, I recommend we follow the principle of IETF
>>>     tags which consists of the ISO code followed (if applicable) by
>>>     a hyphen and the ISO 3166-1 code of the country. Thus the
>>>     variation of English spoken
>>>     in the United States would be: "en-us" while the version of
>>>     English spoken in Great Britain would be "en-gb".
>>>
>>> There is a standard for this, namely RFC 5646
>>
>> Hi John, all,
>>
>> just to be picky, there is BCP 47 ("Best Current Practice") that 
>> defines language tags and matching of language tags. Various RFCs 
>> have been published about language tags, but the stable reference, 
>> that is, the "latest version" identifier for this, is always
>> http://www.rfc-editor.org/rfc/bcp/bcp47.txt
>> or in HTML http://tools.ietf.org/html/bcp47
>> Currently it says "Request for Comments: 5646" at the top (the 
>> language tag part) and RFC 4647 later (the matching part). You can 
>> find the previous RFCs by clicking on the "obsoletes" links, e.g. 
>> "Obsoletes: 4646 <http://tools.ietf.org/html/rfc4646> "
>>
>> - Felix
>>
>>> , and we should follow that as with all RDF. (It does agree with 
>>> your proposal here though)
>>>
>>> Regards,
>>> John
>>>
>>>
>>>     I hope this is fine for everyone. I will add this information to
>>>     the document.
>>>
>>>     Regards,
>>>
>>>     Philipp.
>>>
>>>     -- 
>>>
>>>     Prof. Dr. Philipp Cimiano
>>>
>>>     Phone: +49 521 106 12249
>>>     Fax: +49 521 106 12412
>>>     Mail: cimiano@cit-ec.uni-bielefeld.de
>>>
>>>     Forschungsbau Intelligente Systeme (FBIIS)
>>>     Raum 2.307
>>>     Universität Bielefeld
>>>     Inspiration 1
>>>     33619 Bielefeld
>>>
>>>
>>>
>>
>
>

Received on Friday, 7 February 2014 06:32:48 UTC