- From: Lee Feigenbaum <feigenbl@us.ibm.com>
- Date: Mon, 18 Jun 2007 23:42:38 -0400
- To: "Seaborne, Andy" <andy.seaborne@hp.com>
- Cc: public-rdf-dawg@w3.org
"Seaborne, Andy" <andy.seaborne@hp.com> wrote on 06/18/2007 05:18:24 PM: > Lee Feigenbaum wrote (Agenda for 19/June/2007) > ... > > 3. Test progress > ... > > I'd also like to look at the language tag case sensitivity tests that we > > didn't approve last > > week: > > > > data-r2/open-world/manifest-lang-case-sensitivity.ttl > > > > Given time, we'll find other tests to work through and approve. > > ARQ fails the first test "lang-case-sensitivity" because lang tags are > compared in filters in a case insensitive way. (We're talking about http://www.w3.org/2001/sw/DataAccess/tests/data-r2/open-world/lang-case-sensitivity-eq.rq right?) I agree; I think the test is incorrect: there should be 4 solutions. We've discussed this before, but for the record (so we have a URI), here's my reading of the specs: We're testing plain literals with language tags with =. So we look through the table in 11.3. The = entry for simple literals doesn't apply since the literals have language tags. So we drop down to the = entry for RDF terms, which defers to RDFTerm-equal ( http://www.w3.org/TR/rdf-sparql-query/#func-RDFterm-equal ). RDFterm-equal passes the buck to 6.5.1 Literal Equality from RDF Concepts ( http://www.w3.org/TR/rdf-concepts/#section-Literal-Equality ), where we find, among other things, "The language tags, if any, compare equal." Looking directly above 6.5.1 (in the intro to 6.5 RDF Literals), we see that "Plain literals have a lexical form and optionally a language tag as defined by [RFC-3066], normalized to lowercase." And finally, most to the point: """ Note: The case normalization of language tags is part of the description of the abstract syntax, and consequently the abstract behaviour of RDF applications. It does not constrain an RDF implementation to actually normalize the case. Crucially, the result of comparing two language tags should not be sensitive to the case of the original input. """ ...from which I reach my conclusion that the objects of the two triples in http://www.w3.org/2001/sw/DataAccess/tests/data-r2/open-world/lang-case-sensitivity.ttl are RDFterm-equal. So the query should give a result for each pair of bindings from the two triples - i.e., 4 results. lang-case-insensitive-eq seems to be the correct version of the test. > > I think the last test "lang-case-insensitive-ne" is not making the point the > naming suggests. > > lang-case-insensitive-ne.srx > and > lang-case-sensitive-ne.srx > > are the same (no rows). The tests form the cross product of the triples and > then filter: > > SELECT * > { > ?x1 :p ?v1 . > ?x2 :p ?v2 . > FILTER ( ?v1 != ?v2 ) > } > > > I'd expect lang-case-insensitive-ne.srx to record the cases of > 'xyz'@en != 'xyz'@EN and 'xyz'@EN!= 'xyz'@en > > ----------------------------------- > | x1 | v1 | x2 | v2 | > =================================== > | :x3 | "xyz"@EN | :x2 | "xyz"@en | > | :x2 | "xyz"@en | :x3 | "xyz"@EN | > ----------------------------------- Hmm? I'd expect that to be the case of lang-case-*sensitive* - when using case insensitivity, all of the pairs compare equal, so the result should be the empty set. In any case, as far as the spec goes and approved tests go, I think that we should be approving both of the *insensitive tests. Lee > Andy > > > -- > Hewlett-Packard Limited > Registered Office: Cain Road, Bracknell, Berks RG12 1HN > Registered No: 690597 England
Received on Tuesday, 19 June 2007 03:42:51 UTC