rdf:text, lowercase language tags

While reviewing rdf:text and today's e-mails, I noticed that rdf:text
disallows mixed-case language tags, without explanation.  Given that
BCP-47 says that folks SHOULD use mixed-case tags:

   Although case distinctions do not carry meaning in language tags,
   consistent formatting and presentation of the tags will aid users.
   The format of the tags and subtags in the registry is RECOMMENDED.
   In this format, all non-initial two-letter subtags are uppercase, all
   non-initial four-letter subtags are titlecase, and all other subtags
   are lowercase.

... I find this a little awkward.
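
To make the quoted rule concrete, here is a rough Python sketch of it.
The function name is my own invention, not anything from BCP-47 or
rdf:text, and it implements only the casing rule as quoted above; it
does not check that the tag is well-formed.

```python
def bcp47_recommended_case(tag):
    """Apply the casing convention quoted above to a language tag.

    Hypothetical helper for illustration only; it does not validate
    the tag against the BCP-47 grammar or the subtag registry.
    """
    formatted = []
    for index, subtag in enumerate(tag.split("-")):
        if index > 0 and len(subtag) == 2:
            # Non-initial two-letter subtags are uppercase (e.g. "US").
            formatted.append(subtag.upper())
        elif index > 0 and len(subtag) == 4:
            # Non-initial four-letter subtags are titlecase (e.g. "Hant").
            formatted.append(subtag.capitalize())
        else:
            # Everything else, including the initial subtag, is lowercase.
            formatted.append(subtag.lower())
    return "-".join(formatted)


print(bcp47_recommended_case("en-us"))       # en-US
print(bcp47_recommended_case("ZH-hant-cn"))  # zh-Hant-CN
```

So "en-us", which is all rdf:text currently permits, is not the form
BCP-47 recommends.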

Also, note that the example of xml:lang in RDF/XML Syntax [1] uses a
mixed-case tag:

    <dc:title xml:lang="en-US">RDF/XML Syntax Specification (Revised)</dc:title>

Now, I understand that N-Triples requires lowercase, and RDF says [2]:

     Plain literals have a lexical form and optionally a language tag as
     defined by [RFC-3066], normalized to lowercase.

but I think there's still an awkward conflict here.

Several options:

  1.  Change the lexical form for rdf:text to allow mixed case, but have
  the value space for language tags be lowercase.  Make the canonical
  lexical representation follow the rules BCP-47 states, which I quoted
  above.

  2.  Change the lexical form and the value space to both allow mixed
  case, but have comparison ignore case.  Unfortunately, this means
  "a@en-us" owl:sameAs "a@en-US" would be false (I think), so this seems
  unacceptable.

  3.  Stick with the current design, but include a note explaining why
  we're not allowing people to follow BCP-47, or why it doesn't matter
  (for example, because this form is only used internally).

  4.  No change.
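
For what it's worth, option 1 could be sketched roughly as follows in
Python.  All three function names are hypothetical, and I am assuming
the lexical form is the text, then "@", then the language tag, with the
last "@" acting as the separator.  This is only meant to show the shape
of the mapping, not to be spec text.

```python
def recommended_case(tag):
    # Hypothetical helper: the BCP-47 casing convention quoted earlier.
    parts = []
    for index, subtag in enumerate(tag.split("-")):
        if index > 0 and len(subtag) == 2:
            parts.append(subtag.upper())
        elif index > 0 and len(subtag) == 4:
            parts.append(subtag.capitalize())
        else:
            parts.append(subtag.lower())
    return "-".join(parts)


def lexical_to_value(lexical):
    """Map a mixed-case lexical form to a (text, lowercase tag) value."""
    # Assumption: the last "@" separates the text from the language tag.
    text, separator, tag = lexical.rpartition("@")
    if not separator:
        raise ValueError("lexical form must contain '@'")
    return (text, tag.lower())


def canonical_lexical(value):
    """Write a value back out using the BCP-47 recommended casing."""
    text, tag = value
    return text + "@" + recommended_case(tag)


# Both spellings denote the same value, so they compare equal.
print(lexical_to_value("a@en-US") == lexical_to_value("a@en-us"))  # True
print(canonical_lexical(lexical_to_value("a@en-us")))              # a@en-US
```

Because the value space only ever holds the lowercase tag, "a@en-us" and
"a@en-US" denote the same value, which avoids the problem in option 2.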

My apologies if I missed some earlier discussion covering these points.

     -- Sandro


[1] http://www.w3.org/TR/2004/REC-rdf-syntax-grammar-20040210/#example8
[2] http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#dfn-language-identifier

Received on Monday, 6 April 2009 16:46:03 UTC