- From: Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>
- Date: Fri, 07 Feb 2014 07:42:31 +0100
- To: public-ontolex@w3.org
- Message-ID: <52F48057.6040202@cit-ec.uni-bielefeld.de>
Hi Felix,
thanks for the contribution.
Philipp.
On 07.02.14 07:32, Felix Sasaki wrote:
> Hi Philipp, all,
>
> a small re-write suggestion below. It covers three items:
> 1) language sub tags can contain codes from ISO 639-1, 2, 3 and 5, see
> http://tools.ietf.org/html/bcp47#section-2.2.1 the list following
> "Three-character primary language subtags in the IANA registry were
> defined according to the assignments found in one of these additional
> ISO 639 parts or assignments subsequently made by the relevant ISO 639
> registration authorities or governing standardization bodies:"
> 2) There are more sub tags than language and country, see
> http://tools.ietf.org/html/bcp47#section-2.1 : script, region,
> variant, extension, private use.
> 3) I added a link to
> http://www.w3.org/International/articles/language-tags/ which gives
> some guidance on how to work with language tags.
>
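The subtag kinds listed in item 2 can largely be told apart by their shape. Below is a minimal Python sketch of that idea, assuming a well-formed tag with no extlang subtags and no grandfathered or private-use-only forms. The helper name `split_language_tag` and the result keys are hypothetical, chosen for illustration only; they are not part of BCP 47 or of any library.

```python
import re


def split_language_tag(tag):
    """Label the subtags of a BCP 47 tag by shape (simplified sketch).

    Assumption: the tag is well-formed and has no extlang subtags and
    no grandfathered or private-use-only forms.
    """
    parts = tag.split("-")
    result = {
        "language": parts[0],
        "script": None,
        "region": None,
        "variants": [],
        "extensions": [],
        "private_use": None,
    }
    i = 1
    # Script: exactly four letters.
    if i < len(parts) and re.fullmatch(r"[A-Za-z]{4}", parts[i]):
        result["script"] = parts[i]
        i += 1
    # Region: two letters (ISO 3166-1) or three digits (UN M.49).
    if i < len(parts) and re.fullmatch(r"(?:[A-Za-z]{2}|[0-9]{3})", parts[i]):
        result["region"] = parts[i]
        i += 1
    # Variants: 5-8 alphanumerics, or a digit followed by 3 alphanumerics.
    variant = r"(?:[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3})"
    while i < len(parts) and re.fullmatch(variant, parts[i]):
        result["variants"].append(parts[i])
        i += 1
    # Extensions: a one-character singleton (not "x") plus its subtags.
    while i < len(parts) and len(parts[i]) == 1 and parts[i].lower() != "x":
        ext = [parts[i]]
        i += 1
        while i < len(parts) and len(parts[i]) > 1:
            ext.append(parts[i])
            i += 1
        result["extensions"].append("-".join(ext))
    # Private use: "x" and everything after it.
    if i < len(parts) and parts[i].lower() == "x":
        result["private_use"] = "-".join(parts[i:])
        i = len(parts)
    if i != len(parts):
        raise ValueError("unrecognized subtag: " + parts[i])
    return result
```

For example, `split_language_tag("zh-Hant-TW")` labels `Hant` as the script and `TW` as the region, while `split_language_tag("de-CH-1901")` labels `1901` as a variant. Real applications would also want to check each subtag against the IANA registry, which this sketch does not do.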
> So here is the re-write suggestion.
>
> When specifying the language of a literal, in this document we adhere
> to Best Common Practice 47
> (http://www.rfc-editor.org/rfc/bcp/bcp47.txt). According to BCP 47,
> tags are made up of a language code (based on ISO 639 codes part 1, 2,
> 3 or 5, see http://www.iso.org/iso/home/standards/language_codes.htm)
> optionally followed by a hyphen and an ISO 3166-1 country code
> (http://www.iso.org/iso/iso-3166-1_decoding_table.html). Language tags
> may also contain further subtags expressing e.g. the region, script or
> further variants. For an overview of BCP 47 language tags, see
> http://www.w3.org/International/articles/language-tags/
> We follow the convention of writing the language codes in lower case
> and the country codes in upper case.
> However, this is not part of the specification of this document; users
> of the lexicon-ontology model can adopt any strategy to specify the
> language, though we strongly recommend following BCP 47.
>
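The casing convention in the suggested text (language codes in lower case, country codes in upper case) can be applied mechanically. The sketch below is a hedged illustration, not part of the proposal: the function name `normalize_case` is hypothetical, and the title-case rule for four-letter script subtags comes from the formatting recommendation in RFC 5646, not from this thread. BCP 47 tags are case-insensitive, so this only affects presentation.

```python
def normalize_case(tag):
    """Apply the conventional BCP 47 casing to a language tag.

    Assumption: the tag is already well-formed; no validation is done.
    Rule (RFC 5646 formatting recommendation): everything is lower case,
    except two-letter subtags (upper case) and four-letter subtags
    (title case) that are neither first nor after a singleton.
    """
    subtags = tag.split("-")
    result = [subtags[0].lower()]
    # Once a singleton such as "x" or "u" is seen, the rest stays lower case.
    after_singleton = len(subtags[0]) == 1
    for subtag in subtags[1:]:
        if after_singleton or len(subtag) == 1:
            result.append(subtag.lower())
            after_singleton = True
        elif len(subtag) == 2 and subtag.isalpha():
            result.append(subtag.upper())       # region, e.g. "US"
        elif len(subtag) == 4 and subtag.isalpha():
            result.append(subtag.capitalize())  # script, e.g. "Hant"
        else:
            result.append(subtag.lower())
    return "-".join(result)


print(normalize_case("EN-us"))       # en-US
print(normalize_case("zh-hant-tw"))  # zh-Hant-TW
```

Because the casing carries no meaning, comparing two tags should be done case-insensitively; normalizing both sides with a helper like this one is one way to do that.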
> Best,
>
> Felix
>
> On 06.02.14 20:30, Philipp Cimiano wrote:
>> Dear all,
>>
>> thanks for all your input to the language coding issue.
>>
>> I have now written the following in the document:
>>
>> When specifying the language of a literal, in this document we adhere
>> to Best Common Practice 5646
>> (http://www.rfc-editor.org/rfc/bcp/bcp47.txt). According to BCP 5646,
>> tags are made up of a language code (a three letter ISO 639-3 code or
>> a two letter ISO 639-1 code if available, see
>> http://www.iso.org/iso/home/standards/language_codes.htm) followed by
>> a hyphen and an ISO 3166-1 country code
>> (http://www.iso.org/iso/iso-3166-1_decoding_table.html).
>> We follow the convention of writing the language codes in lower case
>> and the country codes in upper case.
>> However, this is not part of the specification of this document;
>> users of the lexicon-ontology model can adopt any strategy to specify
>> the language, though we strongly recommend following BCP 5646.
>>
>> I think this is in line with all your contributions.
>>
>> Let me know otherwise.
>>
>> Philipp.
>>
>>> On 30.01.14 12:23, Felix Sasaki wrote:
>>> On 30.01.14 12:09, John P. McCrae wrote:
>>>>
>>>>
>>>>
>>>> On Thu, Jan 30, 2014 at 7:47 AM, Philipp Cimiano
>>>> <cimiano@cit-ec.uni-bielefeld.de> wrote:
>>>>
>>>> Dear all,
>>>>
>>>> I am afraid I will not be able to attend the ontolex telco
>>>> this Friday. I will now work on the document, so please provide
>>>> your feedback by email.
>>>>
>>>> I would kindly ask you all to work on the sections in the
>>>> document assigned to you ;-)
>>>>
>>>> Other than that I wanted to clarify one issue regarding
>>>> language codes in the example.
>>>>
>>>> I have seen that some people (John?) have started to use the
>>>> ISO 639-2 codes (e.g. "ENG" for English, "SPA" for Spanish etc.).
>>>> I would propose we stick to the two-letter ISO 639-1
>>>> codes (e.g. "EN", "ES"). There is no particular reason for
>>>> this other than the fact that most people know these codes.
>>>>
>>>> Yes, that would be me. I use the ISO 639-3 codes as they represent
>>>> the most complete and usable list of codes. At any rate, this is
>>>> not part of our standardization efforts and applications must
>>>> support well-formatted codes using any ISO standard.
>>>>
>>>>
>>>> If the argument is recency and reusing the newest standard,
>>>> then we would have to go anyway for four letter codes according
>>>> to ISO 639-6.
>>>>
>>>> Erm 639-6 has a different purpose... it is not really appropriate
>>>> here (and is equal to 639-3 for standard languages anyway)
>>>>
>>>>
>>>> Regarding the particular versions of a language spoken in a
>>>> particular country, I recommend we follow the principle of IETF
>>>> tags which consists of the ISO code followed (if applicable) by
>>>> a hyphen and the ISO 3166-1 code of the country. Thus the
>>>> variation of English spoken
>>>> in the United States would be: "en-us" while the version of
>>>> English spoken in Great Britain would be "en-gb".
>>>>
>>>> There is a standard for this, namely RFC 5646
>>>
>>> Hi John, all,
>>>
>>> just to be picky, there is BCP 47 ("Best Common Practice") that
>>> defines language tags and matching of language tags. Various RFCs
>>> have been published about language tags, but the stable reference,
>>> that is "latest version" identifier for this, is always
>>> http://www.rfc-editor.org/rfc/bcp/bcp47.txt
>>> or in HTML http://tools.ietf.org/html/bcp47
>>> currently it says "Request for Comments: 5646" at the top (the
>>> language tag part) and RFC 4647 later (the matching part). You can
>>> find the previous RFCs by clicking on the "obsoletes" links, e.g.
>>> "Obsoletes: 4646 <http://tools.ietf.org/html/rfc4646> "
>>>
>>> - Felix
>>>
>>>> , and we should follow that as with all RDF. (It does agree with
>>>> your proposal here though)
>>>>
>>>> Regards,
>>>> John
>>>>
>>>>
>>>> I hope this is fine for everyone. I will add this information
>>>> to the document.
>>>>
>>>> Regards,
>>>>
>>>> Philipp.
>>>>
>>>> --
>>>>
>>>> Prof. Dr. Philipp Cimiano
>>>>
>>>> Phone: +49 521 106 12249
>>>> Fax: +49 521 106 12412
>>>> Mail: cimiano@cit-ec.uni-bielefeld.de
>>>>
>>>> Forschungsbau Intelligente Systeme (FBIIS)
>>>> Raum 2.307
>>>> Universität Bielefeld
>>>> Inspiration 1
>>>> 33619 Bielefeld
>>>>
>>>>
>>>>
>>>
>>
>>
>
--
Prof. Dr. Philipp Cimiano
Phone: +49 521 106 12249
Fax: +49 521 106 12412
Mail: cimiano@cit-ec.uni-bielefeld.de
Forschungsbau Intelligente Systeme (FBIIS)
Raum 2.307
Universität Bielefeld
Inspiration 1
33619 Bielefeld
Received on Friday, 7 February 2014 06:43:01 UTC