Re: lang-case-sensitivity

Lee Feigenbaum wrote:
> "Seaborne, Andy" <andy.seaborne@hp.com> wrote on 06/18/2007 05:18:24 PM:
> 
>> Lee Feigenbaum wrote (Agenda for 19/June/2007)
>> ...
>>> 3. Test progress
>> ...
>>> I'd also like to look at the language tag case sensitivity tests that we
>>> didn't approve last week:
>>>
>>> data-r2/open-world/manifest-lang-case-sensitivity.ttl
>>>
>>> Given time, we'll find other tests to work through and approve.
>> ARQ fails the first test "lang-case-sensitivity" because lang tags are 
>> compared in filters in a case insensitive way.
> 
> (We're talking about 
> http://www.w3.org/2001/sw/DataAccess/tests/data-r2/open-world/lang-case-sensitivity-eq.rq 
> right?)

Yes.

> 
> I agree; I think the test is incorrect: there should be 4 solutions. 
> 
> We've discussed this before, but for the record (so we have a URI), here's 
> my reading of the specs:
> 
> We're testing plain literals with language tags with =. So we look through 
> the table in 11.3. The = entry for simple literals doesn't apply since the 
> literals have language tags. So we drop down to the = entry for RDF terms, 
> which defers to RDFterm-equal ( 
> http://www.w3.org/TR/rdf-sparql-query/#func-RDFterm-equal ). RDFterm-equal 
> passes the buck to 6.5.1 Literal Equality from RDF Concepts ( 
> http://www.w3.org/TR/rdf-concepts/#section-Literal-Equality ), where we 
> find, among other things, "The language tags, if any, compare equal." 
> Looking directly above 6.5.1 (in the intro to 6.5 RDF Literals), we see 
> that "Plain literals have a lexical form and optionally a language tag as 
> defined by [RFC-3066], normalized to lowercase." And finally, most to the 
> point:
> 
> """
> Note: The case normalization of language tags is part of the description 
> of the abstract syntax, and consequently the abstract behaviour of RDF 
> applications. It does not constrain an RDF implementation to actually 
> normalize the case. Crucially, the result of comparing two language tags 
> should not be sensitive to the case of the original input.
> """
> 
> ...from which I reach my conclusion that the objects of the two triples in 
> http://www.w3.org/2001/sw/DataAccess/tests/data-r2/open-world/lang-case-sensitivity.ttl 
> are RDFterm-equal.
> 
> So the query should give a result for each pair of bindings from the two 
> triples - i.e., 4 results.
> 
> lang-case-insensitive-eq seems to be the correct version of the test.
> 
>> I think the last test "lang-case-insensitive-ne" is not making the point the
>> naming suggests.
>>
>> lang-case-insensitive-ne.srx
>>    and
>> lang-case-sensitive-ne.srx
>>
>> are the same (no rows).  The tests form the cross product of the triples and
>> then filter:
>>
>> SELECT *
>> {
>>      ?x1 :p ?v1 .
>>      ?x2 :p ?v2 .
>>      FILTER ( ?v1 != ?v2 )
>> }
>>
>>
>> I'd expect lang-case-insensitive-ne.srx to record the cases of
>> 'xyz'@en != 'xyz'@EN and 'xyz'@EN != 'xyz'@en
>>
>> -----------------------------------
>> | x1  | v1       | x2  | v2       |
>> ===================================
>> | :x3 | "xyz"@EN | :x2 | "xyz"@en |
>> | :x2 | "xyz"@en | :x3 | "xyz"@EN |
>> -----------------------------------
> 
> Hmm? I'd expect that to be the case of lang-case-*sensitive* - when using 
> case insensitivity, all of the pairs compare equal, so the result should 
> be the empty set.

You're right.  Test 3.

The point is that tests 3 and 4 have the same query, data and results because
   lang-case-insensitive-ne.srx == lang-case-sensitive-ne.srx
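
For the record, the row counts each reading would give can be sketched in a few lines of Python. This is only an illustration, not ARQ code and not part of the test suite: the two-triple data set is taken from the table quoted above, and the names DATA, term_equal and run are made up for the sketch.

```python
from itertools import product

# Two triples, as in the quoted table: (subject, lexical form, language tag).
# Assumption: these are the only triples in the test data.
DATA = [(":x2", "xyz", "en"), (":x3", "xyz", "EN")]


def term_equal(a, b, case_sensitive):
    """Compare two language-tagged plain literals given as (lexical form, tag)."""
    (lex_a, tag_a), (lex_b, tag_b) = a, b
    if not case_sensitive:
        # RDF Concepts 6.5: language tags are normalized to lowercase.
        tag_a, tag_b = tag_a.lower(), tag_b.lower()
    return lex_a == lex_b and tag_a == tag_b


def run(op, case_sensitive):
    """Cross product of the triples, filtered by ?v1 = ?v2 or ?v1 != ?v2."""
    rows = []
    for (x1, lex1, tag1), (x2, lex2, tag2) in product(DATA, repeat=2):
        equal = term_equal((lex1, tag1), (lex2, tag2), case_sensitive)
        # Keep the row when the comparison outcome matches the operator.
        if equal == (op == "="):
            rows.append((x1, x2))
    return rows
```

Under this model run("=", False) has 4 rows and run("!=", False) has none, while run("!=", True) has exactly the two rows from the table above. So a case-sensitive "ne" result would not be empty, which is why identical .srx files for tests 3 and 4 look wrong.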

> 
> In any case, as far as the spec goes and approved tests go, I think that 
> we should be approving both of the *insensitive tests.

i.e. tests 2 and 4.

	Andy

> 
> Lee
> 
>>    Andy
>>
>>
>> -- 
>>   Hewlett-Packard Limited
>>   Registered Office: Cain Road, Bracknell, Berks RG12 1HN
>>   Registered No: 690597 England
> 


Received on Tuesday, 19 June 2007 08:34:16 UTC