Re: telco this Friday from Philipp Cimiano on 2014-02-07 (public-ontolex@w3.org from February 2014)

From: Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>
Date: Fri, 07 Feb 2014 07:42:31 +0100
To: public-ontolex@w3.org
Message-ID: <52F48057.6040202@cit-ec.uni-bielefeld.de>
Hi Felix,

  thanks for the contribution.

Philipp.

Am 07.02.14 07:32, schrieb Felix Sasaki:
> Hi Philipp, all,
>
> a small re-write suggestion below. It covers three items:
> 1) language subtags can contain codes from ISO 639-1, 2, 3 and 5; see 
> http://tools.ietf.org/html/bcp47#section-2.2.1, the list following
>  "Three-character primary language subtags in the IANA registry were 
> defined according to the assignments found in one of these additional 
> ISO 639 parts or assignments subsequently made by the relevant ISO 639 
> registration authorities or governing standardization bodies:"
> 2) There are more subtags than language and country, see 
> http://tools.ietf.org/html/bcp47#section-2.1 : script, region, 
> variant, extension, private use.
> 3) I added a link to 
> http://www.w3.org/International/articles/language-tags/ which gives 
> some guidance on how to work with language tags.
>
> So here is the re-write suggestion.
>
> When specifying the language of a literal, in this document we adhere 
> to Best Current Practice 47 
> (http://www.rfc-editor.org/rfc/bcp/bcp47.txt). According to BCP 47, 
> tags are made up of a language code (based on ISO 639 codes part 1, 2, 
> 3 or 5, see http://www.iso.org/iso/home/standards/language_codes.htm) 
> optionally followed by a hyphen and an ISO 3166-1 country code 
> (http://www.iso.org/iso/iso-3166-1_decoding_table.html). Language tags 
> may also contain further subtags expressing e.g. the region, script or 
> further variants. For an overview of BCP 47 language tags, see 
> http://www.w3.org/International/articles/language-tags/
> We follow the convention of writing the language codes in lower case 
> and the country codes in upper case.
> However, this is not part of the specification of this document; users 
> of the lexicon-ontology model can adopt any strategy to specify the 
> language, though we strongly recommend following BCP 47.
>
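The casing convention in the rewrite suggestion above (lower-case language code, optional upper-case country code after a hyphen) could be sketched roughly as follows. This is only an illustration in Python, not a full BCP 47 parser: the function name `normalize_language_tag` is made up for this example, and it assumes a tag consisting of a language subtag plus optional script and region subtags. Variants, extensions, private-use and grandfathered tags are deliberately left out.

```python
import re

# Simplified pattern (an assumption for this sketch, not the full BCP 47 grammar):
# a 2-3 letter language subtag, an optional 4-letter script subtag, and an
# optional region subtag (2 letters or 3 digits), separated by hyphens.
_TAG = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?$"
)


def normalize_language_tag(tag):
    """Return the tag with conventional casing, or None if it does not match."""
    match = _TAG.match(tag.strip())
    if match is None:
        return None
    # Language subtag in lower case, e.g. "EN" becomes "en".
    parts = [match.group("language").lower()]
    if match.group("script"):
        # Script subtag in title case, e.g. "hant" becomes "Hant".
        parts.append(match.group("script").title())
    if match.group("region"):
        # Region subtag in upper case, e.g. "us" becomes "US".
        parts.append(match.group("region").upper())
    return "-".join(parts)
```

With this sketch, `normalize_language_tag("EN-us")` gives `"en-US"` and `normalize_language_tag("ENG")` gives `"eng"`. Language tags are compared case-insensitively, so the casing is only a convention and does not change which language a tag denotes.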
> Best,
>
> Felix
>
> Am 06.02.14 20:30, schrieb Philipp Cimiano:
>> Dear all,
>>
>> thanks for all your input to the language coding issue.
>>
>> I have now written the following in the document:
>>
>> When specifying the language of a literal, in this document we adhere 
>> to Best Current Practice 5646 
>> (http://www.rfc-editor.org/rfc/bcp/bcp47.txt). According to BCP 5646, 
>> tags are made up of a language code (a three letter ISO 639-3 code or 
>> a two letter ISO 639-1 code if available, see 
>> http://www.iso.org/iso/home/standards/language_codes.htm) followed by 
>> a hyphen and an ISO 3166-1 country code 
>> (http://www.iso.org/iso/iso-3166-1_decoding_table.html).
>> We follow the convention of writing the language codes in lower case 
>> and the country codes in upper case.
>> However, this is not part of the specification of this document; 
>> users of the lexicon-ontology model can adopt any strategy to specify 
>> the language, though we strongly recommend following BCP 5646.
>>
>> I think this is in line with all your contributions.
>>
>> Let me know otherwise.
>>
>> Philipp.
>>
>> Am 30.01.14 12:23, schrieb Felix Sasaki:
>>> Am 30.01.14 12:09, schrieb John P. McCrae:
>>>>
>>>>
>>>>
>>>> On Thu, Jan 30, 2014 at 7:47 AM, Philipp Cimiano 
>>>> <cimiano@cit-ec.uni-bielefeld.de> wrote:
>>>>
>>>>     Dear all,
>>>>
>>>>      I am afraid I will not be able to attend the ontolex telco
>>>>     this Friday. I will now work on the document, so please provide
>>>>     your feedback by email.
>>>>
>>>>     I would kindly ask you all to work on the sections in the
>>>>     document assigned to you ;-)
>>>>
>>>>     Other than that, I wanted to clarify one issue regarding
>>>>     language codes in the example.
>>>>
>>>>     I have seen that some people (John?) have started to use the
>>>>     ISO 639-2 codes (e.g. "ENG" for English, "SPA" for Spanish etc.).
>>>>     I would propose we stick to the two-letter ISO 639-1
>>>>     codes (e.g. "EN", "ES"). There is no particular reason for
>>>>     this other than the fact that most people know these codes.
>>>>
>>>> Yes, that would be me. I use the ISO 639-3 codes as they represent 
>>>> the most complete and usable list of codes. At any rate, this is 
>>>> not part of our standardization efforts and applications must 
>>>> support well-formatted codes using any ISO standard.
>>>>
>>>>
>>>>     If the argument is recency and reusing the newest standard,
>>>>     then we would have to go anyway for four letter codes according
>>>>     to ISO 639-6.
>>>>
>>>> Erm 639-6 has a different purpose... it is not really appropriate 
>>>> here (and is equal to 639-3 for standard languages anyway)
>>>>
>>>>
>>>>     Regarding the particular versions of a language spoken in a
>>>>     particular country, I recommend we follow the principle of IETF
>>>>     tags which consists of the ISO code followed (if applicable) by
>>>>     a hyphen and the ISO 3166-1 code of the country. Thus the
>>>>     variation of English spoken
>>>>     in the United States would be: "en-us" while the version of
>>>>     English spoken in Great Britain would be "en-gb".
>>>>
>>>> There is a standard for this, namely RFC 5646
>>>
>>> Hi John, all,
>>>
>>> just to be picky, there is BCP 47 ("Best Current Practice") that 
>>> defines language tags and matching of language tags. Various RFCs 
>>> have been published about language tags, but the stable reference, 
>>> that is, the "latest version" identifier for this, is always
>>> http://www.rfc-editor.org/rfc/bcp/bcp47.txt
>>> or in HTML http://tools.ietf.org/html/bcp47
>>> currently it says "Request for Comments: 5646" at the top (the 
>>> language tag part) and RFC 4647 later (the matching part). You can 
>>> find the previous RFCs by clicking on the "obsoletes" links, e.g. 
>>> "Obsoletes: 4646 <http://tools.ietf.org/html/rfc4646> "
>>>
>>> - Felix
>>>
>>>> , and we should follow that as with all RDF. (It does agree with 
>>>> your proposal here though)
>>>>
>>>> Regards,
>>>> John
>>>>
>>>>
>>>>     I hope this is fine for everyone. I will add this information
>>>>     to the document.
>>>>
>>>>     Regards,
>>>>
>>>>     Philipp.
>>>>
>>>>     -- 
>>>>
>>>>     Prof. Dr. Philipp Cimiano
>>>>
>>>>     Phone: +49 521 106 12249
>>>>     Fax: +49 521 106 12412
>>>>     Mail: cimiano@cit-ec.uni-bielefeld.de
>>>>
>>>>     Forschungsbau Intelligente Systeme (FBIIS)
>>>>     Raum 2.307
>>>>     Universität Bielefeld
>>>>     Inspiration 1
>>>>     33619 Bielefeld
>>>>
>>>>
>>>>
>>>
>>
>>
>


-- 

Prof. Dr. Philipp Cimiano

Phone: +49 521 106 12249
Fax: +49 521 106 12412
Mail: cimiano@cit-ec.uni-bielefeld.de

Forschungsbau Intelligente Systeme (FBIIS)
Raum 2.307
Universität Bielefeld
Inspiration 1
33619 Bielefeld
Received on Friday, 7 February 2014 06:43:01 UTC