RE: rdf:text, lowercase language tags

Hello,

As far as I know, there were no extensive discussions on this point, so thanks for
starting one.

I would actually prefer option 1. Normalizing the value space to lowercase makes
sense from the OWL point of view (we clearly don't want "abc"@en and "abc"@EN to
be distinct objects that might cause a violation of some cardinality
constraint). I also see no problem with having mixed-case lexical forms.
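
To make option 1 concrete, here is a rough Python sketch of how I read it. The
function names are purely illustrative and are not taken from the rdf:text
document. The sketch assumes a lexical form of the shape string@tag, and it
applies only the BCP-47 casing rule quoted in the message below.

```python
from typing import Tuple


def text_value(lexical: str) -> Tuple[str, str]:
    """Map a lexical form 'string@tag' to a value (string, lowercase tag).

    Assumption: the last '@' separates the string from the language tag.
    """
    string, sep, tag = lexical.rpartition("@")
    if not sep:
        raise ValueError("expected a lexical form of the shape string@tag")
    # The value space holds only lowercase tags, so case never affects identity.
    return (string, tag.lower())


def bcp47_format(tag: str) -> str:
    """Apply the casing recommended in the quoted BCP-47 passage."""
    formatted = []
    for index, subtag in enumerate(tag.split("-")):
        if index > 0 and len(subtag) == 2:
            formatted.append(subtag.upper())       # non-initial two-letter subtag
        elif index > 0 and len(subtag) == 4:
            formatted.append(subtag.capitalize())  # non-initial four-letter subtag
        else:
            formatted.append(subtag.lower())       # all other subtags
    return "-".join(formatted)


def canonical_lexical(lexical: str) -> str:
    """Hypothetical canonical lexical form: string kept as-is, tag reformatted."""
    string, tag = text_value(lexical)
    return string + "@" + bcp47_format(tag)
```

With this reading, "abc@en" and "abc@EN" map to the same value, so they cannot
be counted twice against a cardinality constraint, while the canonical lexical
form still shows the tag the way BCP-47 recommends.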

In fact, the older version of the document followed this approach. I changed
this recently, however, in a desire to be compatible with RDF. After all, we got
some potentially show-stopping comments from RDF people, so I thought I would
preempt such comments this time around.

How shall we go about resolving this? Would it be possible to check with the RDF
people whether they'd be OK with option 1?

Regards,

	Boris

> -----Original Message-----
> From: public-rdf-text-request@w3.org [mailto:public-rdf-text-request@w3.org]
> On Behalf Of Sandro Hawke
> Sent: 06 April 2009 17:46
> To: public-rdf-text@w3.org
> Subject: rdf:text, lowercase language tags
> 
> 
> In reviewing rdf:text today and in today's e-mails, I see that rdf:text
> disallows mixed-case language tags, without explanation.  Given that
> BCP-47 says that folks SHOULD use mixed-case tags:
> 
>    Although case distinctions do not carry meaning in language tags,
>    consistent formatting and presentation of the tags will aid users.
>    The format of the tags and subtags in the registry is RECOMMENDED.
>    In this format, all non-initial two-letter subtags are uppercase, all
>    non-initial four-letter subtags are titlecase, and all other subtags
>    are lowercase.
> 
> ... I find this a little awkward.
> 
> Also, note that the example of xml:lang in RDF/XML Syntax [1] uses a
> mixed-case tag:
> 
>     <dc:title xml:lang="en-US">RDF/XML Syntax Specification (Revised)</dc:title>
> 
> Now, I understand that N-Triples requires lowercase, and RDF says [2]:
> 
>      Plain literals have a lexical form and optionally a language tag as
>      defined by [RFC-3066], normalized to lowercase.
> 
> but I think there's still an awkward conflict here.
> 
> Several options:
> 
>   1.  Change the lexical form for rdf:text to allow mixed case, but have
>   value space for language tags be lower case.  Make the canonical
>   lexical representation follow the rules BCP-47 states, which I quoted
>   above.
> 
>   2.  Change the lexical form and the value space to both allow mixed
>   case, but have comparison ignore case.  Unfortunately, this means
>   "a@en-us" owl:sameAs "a@en-US" would be false (I think), so this seems
>   unacceptable.
> 
>   3.  Stick with the current design, but include a note explaining why
>   we're not allowing people to follow BCP-47, or why it doesn't matter
>   (this is internal or something).
> 
>   4.  No change.
> 
> My apologies if I missed some earlier discussion covering these points.
> 
>      -- Sandro
> 
> 
> [1] http://www.w3.org/TR/2004/REC-rdf-syntax-grammar-20040210/#example8
> [2] http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#dfn-language-identifier

Received on Monday, 6 April 2009 17:54:02 UTC