- From: Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>
- Date: Fri, 07 Feb 2014 07:42:31 +0100
- To: public-ontolex@w3.org
- Message-ID: <52F48057.6040202@cit-ec.uni-bielefeld.de>
Hi Felix,

thanks for the contribution.

Philipp.

On 07.02.14 07:32, Felix Sasaki wrote:
> Hi Philipp, all,
>
> a small re-write suggestion below. It covers three items:
> 1) Language subtags can contain codes from ISO 639-1, 2, 3 and 5; see
> http://tools.ietf.org/html/bcp47#section-2.2.1 , the list following
> "Three-character primary language subtags in the IANA registry were
> defined according to the assignments found in one of these additional
> ISO 639 parts or assignments subsequently made by the relevant ISO 639
> registration authorities or governing standardization bodies:"
> 2) There are more subtags than language and country; see
> http://tools.ietf.org/html/bcp47#section-2.1 : script, region,
> variant, extension, private use.
> 3) I added a link to
> http://www.w3.org/International/articles/language-tags/ which gives
> some guidance on how to work with language tags.
>
> So here is the re-write suggestion.
>
> When specifying the language of a literal, in this document we adhere
> to Best Current Practice 47
> (http://www.rfc-editor.org/rfc/bcp/bcp47.txt). According to BCP 47,
> tags are made up of a language code (based on ISO 639 codes part 1, 2,
> 3 or 5, see http://www.iso.org/iso/home/standards/language_codes.htm)
> optionally followed by a hyphen and an ISO 3166-1 country code
> (http://www.iso.org/iso/iso-3166-1_decoding_table.html). Language tags
> may also contain further subtags expressing e.g. the region, script or
> further variants. For an overview of BCP 47 language tags, see
> http://www.w3.org/International/articles/language-tags/
> We follow the convention of writing the language codes in lower case
> and the country codes in upper case.
> However, this is not part of the specification of this document; users
> of the lexicon-ontology model can adopt any strategy to specify the
> language, though we strongly recommend following BCP 47.
>
> Best,
>
> Felix
>
> On 06.02.14 20:30, Philipp Cimiano wrote:
>> Dear all,
>>
>> thanks for all your input to the language coding issue.
>>
>> I have now written the following in the document:
>>
>> When specifying the language of a literal, in this document we adhere
>> to Best Current Practice 5646
>> (http://www.rfc-editor.org/rfc/bcp/bcp47.txt). According to BCP 5646,
>> tags are made up of a language code (a three letter ISO 639-3 code or
>> a two letter ISO 639-1 code if available, see
>> http://www.iso.org/iso/home/standards/language_codes.htm) followed by
>> a hyphen and an ISO 3166-1 country code
>> (http://www.iso.org/iso/iso-3166-1_decoding_table.html).
>> We follow the convention of writing the language codes in lower case
>> and the country codes in upper case.
>> However, this is not part of the specification of this document;
>> users of the lexicon-ontology model can adopt any strategy to specify
>> the language, though we strongly recommend following BCP 5646.
>>
>> I think this is in line with all your contributions.
>>
>> Let me know otherwise.
>>
>> Philipp.
>>
>> On 30.01.14 12:23, Felix Sasaki wrote:
>>> On 30.01.14 12:09, John P. McCrae wrote:
>>>>
>>>> On Thu, Jan 30, 2014 at 7:47 AM, Philipp Cimiano
>>>> <cimiano@cit-ec.uni-bielefeld.de> wrote:
>>>>
>>>>     Dear all,
>>>>
>>>>     I am afraid I will not be able to attend the ontolex telco
>>>>     this Friday. I will now work on the document, so please provide
>>>>     your feedback by email.
>>>>
>>>>     I would kindly ask you all to work on the sections in the
>>>>     document assigned to you ;-)
>>>>
>>>>     Other than that, I wanted to clarify one issue regarding
>>>>     language codes in the example.
>>>>
>>>>     I have seen that some people (John?) have started to use the
>>>>     ISO 639-2 codes (e.g. "ENG" for English, "SPA" for Spanish etc.).
>>>>     I would propose we stick to the two-letter ISO 639-1
>>>>     codes (e.g. "EN", "ES" etc.). There is no particular reason for
>>>>     this other than the fact that most people know these codes.
>>>>
>>>> Yes, that would be me. I use the ISO 639-3 codes as they represent
>>>> the most complete and usable list of codes. At any rate, this is
>>>> not part of our standardization efforts and applications must
>>>> support well-formatted codes using any ISO standard.
>>>>
>>>>     If the argument is recency and reusing the newest standard,
>>>>     then we would have to go anyway for four letter codes according
>>>>     to ISO 639-6.
>>>>
>>>> Erm, 639-6 has a different purpose... it is not really appropriate
>>>> here (and is equal to 639-3 for standard languages anyway).
>>>>
>>>>     Regarding the particular versions of a language spoken in a
>>>>     particular country, I recommend we follow the principle of IETF
>>>>     tags, which consists of the ISO code followed (if applicable) by
>>>>     a hyphen and the ISO 3166-1 code of the country. Thus the
>>>>     variation of English spoken in the United States would be
>>>>     "en-us", while the version of English spoken in Great Britain
>>>>     would be "en-gb".
>>>>
>>>> There is a standard for this, namely RFC 5646
>>>
>>> Hi John, all,
>>>
>>> just to be picky, there is BCP 47 ("Best Current Practice") that
>>> defines language tags and matching of language tags. Various RFCs
>>> have been published about language tags, but the stable reference,
>>> that is, the "latest version" identifier for this, is always
>>> http://www.rfc-editor.org/rfc/bcp/bcp47.txt
>>> or in HTML http://tools.ietf.org/html/bcp47
>>> Currently it says "Request for Comments: 5646" at the top (the
>>> language tag part) and RFC 4647 later (the matching part). You can
>>> find the previous RFCs by clicking on the "obsoletes" links, e.g.
>>> "Obsoletes: 4646 <http://tools.ietf.org/html/rfc4646>"
>>>
>>> - Felix
>>>
>>>> , and we should follow that as with all RDF. (It does agree with
>>>> your proposal here though.)
>>>>
>>>> Regards,
>>>> John
>>>>
>>>>     I hope this is fine for everyone. I will add this information
>>>>     to the document.
>>>>
>>>>     Regards,
>>>>
>>>>     Philipp.
>>>>
>>>>     --
>>>>     Prof. Dr. Philipp Cimiano
>>>>
>>>>     Phone: +49 521 106 12249
>>>>     Fax: +49 521 106 12412
>>>>     Mail: cimiano@cit-ec.uni-bielefeld.de
>>>>
>>>>     Forschungsbau Intelligente Systeme (FBIIS)
>>>>     Raum 2.307
>>>>     Universität Bielefeld
>>>>     Inspiration 1
>>>>     33619 Bielefeld
>>>>
>>>
>>
>> --
>> Prof. Dr. Philipp Cimiano
>>
>> Phone: +49 521 106 12249
>> Fax: +49 521 106 12412
>> Mail: cimiano@cit-ec.uni-bielefeld.de
>>
>> Forschungsbau Intelligente Systeme (FBIIS)
>> Raum 2.307
>> Universität Bielefeld
>> Inspiration 1
>> 33619 Bielefeld
>

--
Prof. Dr. Philipp Cimiano

Phone: +49 521 106 12249
Fax: +49 521 106 12412
Mail: cimiano@cit-ec.uni-bielefeld.de

Forschungsbau Intelligente Systeme (FBIIS)
Raum 2.307
Universität Bielefeld
Inspiration 1
33619 Bielefeld
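
For readers who want to try the casing convention discussed in this thread (language code in lower case, country code in upper case), here is a minimal Python sketch. It is not part of the lexicon-ontology document and does not implement the full BCP 47 grammar: the pattern only covers language, optional script, optional region and variant subtags, and it deliberately leaves out extensions, private-use subtags, extended language subtags and grandfathered tags. The names `_TAG` and `normalize_tag` are hypothetical, and title-casing the script subtag is an added assumption based on common BCP 47 usage rather than something stated in the thread.

```python
import re

# Simplified shape: language[-script][-region][-variant]*
# Assumption: extensions, private-use subtags, extlang and grandfathered
# tags are out of scope for this sketch.
_TAG = re.compile(
    r"(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?"
    r"(?P<variants>(?:-(?:[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}))*)"
)


def normalize_tag(tag):
    """Return the tag with conventional casing, or None if it does not match."""
    m = _TAG.fullmatch(tag)
    if m is None:
        return None
    # Language code in lower case, as in the convention quoted above.
    parts = [m.group("language").lower()]
    if m.group("script"):
        # Script subtags are conventionally written in title case.
        parts.append(m.group("script").title())
    if m.group("region"):
        # Country/region code in upper case.
        parts.append(m.group("region").upper())
    # Variant subtags are kept in lower case.
    parts.extend(v.lower() for v in m.group("variants").split("-") if v)
    return "-".join(parts)


if __name__ == "__main__":
    for raw in ["EN-us", "eng-gb", "zh-hant-tw", "english"]:
        print(raw, "->", normalize_tag(raw))
```

The sketch only checks the shape of a tag. It does not check whether a subtag is actually registered, which would require the IANA Language Subtag Registry.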
Received on Friday, 7 February 2014 06:43:01 UTC