Re: Language Tag Case Conflict (between RDF1.1 and BCP47)

Hi Hong Sun and Ivan,

Do I understand correctly that these comments apply to Turtle?

Regards,
Dave
--
http://about.me/david_wood



On Mar 29, 2013, at 06:09, Ivan Herman <ivan@w3.org> wrote:

> Hong Sun does not have the right credentials to post to the WG mailing list, so I am forwarding this to the comment list for processing and archiving.
> 
> Hong Sun, thank you.
> 
> Ivan
> 
> Begin forwarded message:
> 
>> From: Hong Sun <hong.sun@agfa.com>
>> Subject: [Moderator Action] Language Tag Case Conflict (between RDF1.1 and BCP47)
>> Date: March 29, 2013 10:43:18 GMT+01:00
>> To: public-rdf-wg@w3.org
>> 
>> Dear All, 
>> 
>> I am processing text with language tags, and while reading the RDF 1.1 specification I found a conflict over which case a language tag should use. 
>> 
>> RDF 1.1 Concepts, at 
>> http://www.w3.org/TR/2013/WD-rdf11-concepts-20130115/#dfn-language-tag 
>> states: 
>> """ 
>> a non-empty language tag as defined by [BCP47]. The language tag must be well-formed according to section 2.2.9 of [BCP47], and must be normalized to lowercase. 
>> """ 
>> It is accompanied by the following example: 
>> show:218 show:localName "Cette Série des Années Septante"@fr-be .  # literal with a region subtag 
>> 
>> 
>> But BCP47, at 
>> http://tools.ietf.org/html/bcp47#section-2.2.9 
>> states: 
>> """ 
>> For example, one might use a tag such as "no-QQ", where 'QQ' 
>>    is one of a range of private use ISO 3166-1 codes to indicate an 
>>    otherwise undefined region. 
>> """ 
>> 
>> An even clearer recommendation is given in the same document at 
>> http://tools.ietf.org/html/bcp47#section-2.1.1 
>> """ 
>> All subtags, including extension and private 
>>    use subtags, use lowercase letters with two exceptions: two-letter 
>>    and four-letter subtags that neither appear at the start of the tag 
>>    nor occur after singletons.  Such two-letter subtags are all 
>>    uppercase (as in the tags "en-CA-x-ca" or "sgn-BE-FR") and four- 
>>    letter subtags are titlecase (as in the tag "az-Latn-x-latn"). 
>> """ 
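
The casing convention quoted above can be sketched in a few lines of Python. This is only an illustration of the quoted paragraph, not code from either specification: the function name `format_bcp47_case` is made up, it does not check well-formedness, and it assumes that every subtag following a singleton (not just the next one) stays lowercase.

```python
def format_bcp47_case(tag: str) -> str:
    """Apply the casing convention quoted from BCP47 section 2.1.1.

    Assumption: once a singleton (one-character subtag such as "x") has
    been seen, all later subtags are kept lowercase.
    """
    formatted = []
    seen_singleton = False
    for position, subtag in enumerate(tag.split("-")):
        if position == 0 or seen_singleton:
            # First subtag, or anything after a singleton: lowercase.
            formatted.append(subtag.lower())
        elif len(subtag) == 2:
            # Two-letter subtag (typically a region): uppercase.
            formatted.append(subtag.upper())
        elif len(subtag) == 4:
            # Four-letter subtag (typically a script): titlecase.
            formatted.append(subtag.capitalize())
        else:
            formatted.append(subtag.lower())
        if len(subtag) == 1:
            seen_singleton = True
    return "-".join(formatted)


print(format_bcp47_case("de-ch"))           # de-CH
print(format_bcp47_case("EN-ca-X-CA"))      # en-CA-x-ca
print(format_bcp47_case("az-latn-x-latn"))  # az-Latn-x-latn
```

Under this reading, a lowercased RDF tag such as fr-be can always be converted back to the recommended BCP47 presentation, so lowercasing loses no information.
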
>> 
>> In short, it seems that: 
>> according to RDF 1.1, we should use de-ch; 
>> BCP47 recommends de-CH; 
>> and RDF 1.1 also states that a language tag must be well-formed according to [BCP47]. 
>> 
>> So there is a conflict: which form exactly should we use? 
>> 
>> 
>> In addition, among the other specifications, Turtle does not care about case, while N3 now also uses lowercase for subtags, e.g. de-ch. 
>> 
>> http://www.w3.org/2000/10/swap/grammar/n3.n3 
>> """ 
>> # was: "[a-zA-Z][a-zA-Z0-9]*(-[a-zA-Z0-9]+)?"; 
>> langcode        cfg:matches          "[a-z]+(-[a-z0-9]+)*"; # http://www.w3.org/TR/rdf-testcases/#language 
>>                 cfg:canStartWith         "a". 
>> """ 
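
To see what the change in the N3 grammar means in practice, the two patterns quoted above can be tried against a few tags. This is a rough sketch: it assumes `cfg:matches` means the whole token must match the pattern (hence `fullmatch`), and the names `OLD_LANGCODE`, `NEW_LANGCODE` and `accepts` are made up for the example.

```python
import re

# Patterns copied from the N3 grammar excerpt quoted above.
OLD_LANGCODE = re.compile(r"[a-zA-Z][a-zA-Z0-9]*(-[a-zA-Z0-9]+)?")
NEW_LANGCODE = re.compile(r"[a-z]+(-[a-z0-9]+)*")


def accepts(pattern: re.Pattern, tag: str) -> bool:
    """Return True if the whole tag matches the pattern (assumed semantics)."""
    return pattern.fullmatch(tag) is not None


for tag in ("de-ch", "de-CH", "en-ca-x-ca"):
    print(tag, accepts(OLD_LANGCODE, tag), accepts(NEW_LANGCODE, tag))
```

With these assumptions the current pattern accepts de-ch and rejects de-CH, while the older pattern accepted mixed case but allowed at most one subtag after the primary one.
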
>> 
>> Is it possible to treat the language tag as case-insensitive, as Andy Seaborne suggested in http://lists.w3.org/Archives/Public/public-rdf-wg/2013Feb/0275.html ? 
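
For illustration, case-insensitive handling could look roughly like the sketch below. It is not taken from any RDF specification or library: the helper names are hypothetical, and it simply compares the lowercased forms while leaving the tag as written untouched.

```python
def same_language_tag(left: str, right: str) -> bool:
    """Compare two language tags ignoring case (hypothetical helper)."""
    return left.lower() == right.lower()


def distinct_tags(tags):
    """Keep the first spelling seen for each case-insensitively distinct tag."""
    first_seen = {}
    for tag in tags:
        # setdefault leaves an existing entry alone, preserving the first spelling.
        first_seen.setdefault(tag.lower(), tag)
    return list(first_seen.values())


print(same_language_tag("de-ch", "de-CH"))        # True
print(distinct_tags(["de-CH", "de-ch", "fr-BE"]))  # ['de-CH', 'fr-BE']
```

Comparing on the lowercased form while storing the original spelling would let data written as de-CH and de-ch be treated as the same language without forcing either spelling on authors.
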
>> 
>> Thanks! 
>> 
>> Kind Regards,
>> 
>> Hong Sun | Agfa HealthCare
>> Researcher | HE/Advanced Clinical Applications Research
>> T  +32 3444 8108
>> 
>> http://www.agfahealthcare.com
>> http://blog.agfahealthcare.com
>> Click on link to read important disclaimer: http://www.agfahealthcare.com/maildisclaimer
> 
> 
> ----
> Ivan Herman, W3C Semantic Web Activity Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> FOAF: http://www.ivan-herman.net/foaf.rdf
> 

Received on Friday, 29 March 2013 15:13:48 UTC