Re: Case of language identifiers

From: Jos De_Roo <jos.deroo.jd@belgium.agfa.com>
Date: Wed, 23 Oct 2002 12:38:55 +0200
To: Jeremy Carroll <jjc@hpl.hp.com>
Cc: w3c-rdfcore-wg@w3.org, w3c-rdfcore-wg-request@w3.org
Message-ID: <OF1DF832E1.66662607-ONC1256C5B.003975C5-C1256C5B.003A7FBC@agfa.be>

I also prefer case normalization of the language tag during parsing, and
that is what I implemented.

-- ,
Jos De Roo, AGFA http://www.agfa.com/w3c/jdroo/

Jeremy Carroll <jjc@hpl.hp.com>
Sent by: w3c-rdfcore-wg-requ
2002-10-23 11:27 AM
To: w3c-rdfcore-wg@w3.org
cc:
Subject: Case of language identifiers

Now that we have the notion of the value of a literal reasonably clear, a
small issue comes into focus.

We have previously agreed that




are the same.

(I think in Cannes).

We can rephrase that in model-theoretic terms as:

<rdf:Description xml:lang="en">

<rdf:Description xml:lang="EN">

entail one another.

The question that comes to mind is when we do the case normalization on the
language tag.
Just to be inconvenient, the convention for language tags is that the first
component is lower case, the second upper case: e.g. en-US
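
A minimal Python sketch of the two forms in play: an all-lower-case form
for comparison and the conventional display form described above. The
helper names `normalize_lang` and `conventional_case` are hypothetical and
only illustrate the idea.

```python
def normalize_lang(tag: str) -> str:
    """Comparison form (hypothetical helper): fold the whole tag to lower case."""
    return tag.lower()


def conventional_case(tag: str) -> str:
    """Display form (hypothetical helper): first component lower case,
    second component upper case; any further components are left as written."""
    parts = tag.split("-")
    parts[0] = parts[0].lower()
    if len(parts) > 1:
        parts[1] = parts[1].upper()
    return "-".join(parts)
```

Folding to lower case is enough to decide whether two tags are the same;
the display form only matters when a tag is written back out.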

Possible answers are:

1: ASAP, during parsing; the abstract syntax is in terms of lower case

2: In the equality function in the abstract syntax, before datatyping and
model theory.
This is the current position. It has the defect that datatyping and the
model theory should then be expressed as operations over equivalence
classes, in some way or other.

3: During the datatype mapping for String and XML Literals
The abstract syntax is then defined in terms of any case identifiers.
But the case is normalized before we get to a value.
This is subtly different in that for unknown datatypes we don't know that
they are insensitive to the case of the language identifier,
i.e. <a:datatype>"foo"-en and <a:datatype>"foo"-EN
might be different; it is just that they are the same for all the ones we

My preference is 1 which would be a change from what we have previously
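
The three options above can be sketched side by side in Python. This is an
illustration under stated assumptions, not the working group's definition:
the `Literal` class, the function names, and the set
`KNOWN_CASE_INSENSITIVE` (with `None` standing in for a plain string) are
all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Literal:
    """Hypothetical abstract-syntax literal: lexical form, language tag, datatype."""
    lexical: str
    lang: Optional[str] = None
    datatype: Optional[str] = None


def parse_option1(lexical: str, lang: Optional[str],
                  datatype: Optional[str] = None) -> Literal:
    """Option 1: fold the tag while parsing, so the abstract syntax only
    ever holds lower-case tags and plain equality suffices."""
    return Literal(lexical, lang.lower() if lang else None, datatype)


def equal_option2(a: Literal, b: Literal) -> bool:
    """Option 2: keep the tag as written; fold it inside the equality function."""
    def key(lit: Literal) -> Tuple[str, Optional[str], Optional[str]]:
        return (lit.lexical, lit.lang.lower() if lit.lang else None, lit.datatype)
    return key(a) == key(b)


# Assumption: the datatypes whose mapping is known to ignore tag case.
# None stands for a plain string literal.
KNOWN_CASE_INSENSITIVE = {None, "rdf:XMLLiteral"}


def value_option3(lit: Literal) -> Tuple[str, Optional[str]]:
    """Option 3: keep the tag as written in the abstract syntax; fold it only
    in the datatype mapping of datatypes known to be case-insensitive."""
    lang = lit.lang
    if lang is not None and lit.datatype in KNOWN_CASE_INSENSITIVE:
        lang = lang.lower()
    return (lit.lexical, lang)
```

Under option 3 a literal with an unknown datatype keeps its tag as written,
so "foo"-en and "foo"-EN map to different values there, which is the subtle
difference noted above.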

Received on Wednesday, 23 October 2002 06:39:32 UTC