Re: Re: basic-only langMatches test

* Seaborne, Andy <andy.seaborne@hp.com> [2007-08-11 20:29+0100]
>
>
> Eric Prud'hommeaux wrote:
>> * Lee Feigenbaum <lee@thefigtrees.net> [2007-08-10 18:05-0400]
>>> Seaborne, Andy wrote:
>>>> Eric Prud'hommeaux wrote:
>>>>> * Lee Feigenbaum <lee@thefigtrees.net> [2007-08-06 09:01-0400]
>>>>>> ACTION: ericP to write a test showing that langMatches doesn't do 
>>>>>> extended matching
>>>>> I checked with i18n-core and it appears that there are no cases that
>>>>> basic matching accepts but extended matching rejects.
>>>>>   http://www.w3.org/mid/46A62549.7010807@yahoo-inc.com
>>>>>
>>>>> I propose:
>>>>>
>>>>> http://www.w3.org/2001/sw/DataAccess/tests/data-r2/expr-builtin/manifest.ttl
>>>>>   LangMatches-basic
>>>>>
>>>>> Data:
>>>>> @prefix : <http://example.org/#> .
>>>>>
>>>>> :x :p3 "abc"@de .
>>>>> :x :p4 "abc"@de-DE .
>>>>> :x :p5 "abc"@de-Latn-DE .
>>>>>
>>>>> Query:
>>>>> PREFIX : <http://example.org/#>
>>>>>
>>>>> SELECT *
>>>>> { :x ?p ?v . FILTER langMatches(lang(?v), "de-DE") . }
>>>>>
>>>>> Results:
>>>>> ┌────────────────────────┬───────────┐
>>>>> │                       p│          v│
>>>>> ├────────────────────────┼───────────┤
>>>>> │<http://example.org/#p4>│"abc"@de-de│
>>>>> └────────────────────────┴───────────┘
>>>>>
>>>>> and i pass it...
>>>> ARQ does pass this test.
>>> Glitter does also, but...
>>>
>>>> The results do not reflect the syntactic input which has de-DE.  (ARQ 
>>>> actually returns "abc"@de-DE for ?v. -- all matching and equality is 
>>>> done case-insensitively but not by forcing to lower case on data 
>>>> loading.)
>>>> Had it been the JSON results, the results would be different.
>>>> As the point of the test is the FILTER, I suggest changing the results 
>>>> to
>>>> reflect the data and use @de-DE
>>> My test harness thinks I fail it because it does (incorrectly) case 
>>> sensitive comparisons. Could you make this change so that the results 
>>> match the case of the data, Eric?
>> I think it's best if i change the input data. It could be argued from
>> [[
>> Plain literals have a lexical form and optionally a language tag as
>> defined by [RFC-3066], normalized to lowercase.
>> ]]
>> -- http://www.w3.org/TR/rdf-concepts/#section-Graph-Literal
>> that the normalization should go to LC. 
>
> The third note below that text says:
> [[
> Note: The case normalization of language tags is part of the description of 
> the abstract syntax, and consequently the abstract behaviour of RDF 
> applications. It does not constrain an RDF implementation to actually 
> normalize the case. Crucially, the result of comparing two language tags 
> should not be sensitive to the case of the original input.
> ]]
>
> so normalization is not the only implementation choice.

Agreed. I think it's saying that the graph terms
   "da"@de-DE and "da"@de-de

are the same term (eeeek). This leads to apparent conundrums with a
series of questions like:

Data:
  <Horst> <said> "da"@de-DE , "da"@de-de .

Query:
  ASK { <Horst> <said> "da"@de-DE }
Result: TRUE

Query:
  ASK { <Horst> <said> "da"@de-de }
Result: TRUE

Query:
  ASK { <Horst> <said> ?o1 .
        <Horst> <said> ?o2
        FILTER (!sameTerm(?o1, ?o2))
      }
Result: FALSE

Query:
  SELECT ?o { <Horst> <said> ?o }
Result: "da"@de-DE or "da"@de-de depending on whim.


YEAGGHHH!
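
A rough Python sketch of how an implementation might end up here,
assuming it keeps the tag as written and only ignores case when
comparing. PlainLiteral and the set-based graph are hypothetical
stand-ins for illustration, not any real store's API:

```python
class PlainLiteral:
    """Hypothetical plain literal: lexical form plus language tag as written."""

    def __init__(self, lexical, lang):
        self.lexical = lexical
        self.lang = lang  # original case is kept, never rewritten

    def _key(self):
        # Comparison key: the language tag is compared case-insensitively.
        return (self.lexical, self.lang.lower())

    def __eq__(self, other):
        if not isinstance(other, PlainLiteral):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        return hash(self._key())

    def __repr__(self):
        return f'"{self.lexical}"@{self.lang}'


# Load both spellings; the graph is a set of triples, so the second
# one is recognised as the same term and is not stored again.
graph = set()
for obj in (PlainLiteral("da", "de-DE"), PlainLiteral("da", "de-de")):
    graph.add(("Horst", "said", obj))

objects = [o for (_s, _p, o) in graph]
```

Both spellings are found in the graph, only one triple is stored, and
the spelling handed back is simply whichever one was loaded first.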


>> However, I don't want to
>> change, or even help, the world at this point. So let's just go
>> with lower case test data.
>> all "-DE" strings changed to "-de"
>> q-langMatches-de-DE.rq moved to q-langMatches-de-de.rq
>> manifest updated
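
For reference, a minimal Python sketch of the distinction the test is
after, assuming my reading of RFC 4647 basic filtering (section 3.3.1)
and extended filtering (section 3.3.2) is right. The function names
are made up for illustration and are not taken from ARQ or Glitter:

```python
def lang_matches_basic(tag, lang_range):
    """Basic filtering: the range equals the tag, or is a prefix of it
    that ends on a subtag boundary. Comparison ignores case."""
    if not tag:
        return False  # a literal without a language tag never matches
    if lang_range == "*":
        return True
    tag, lang_range = tag.lower(), lang_range.lower()
    return tag == lang_range or tag.startswith(lang_range + "-")


def lang_matches_extended(tag, lang_range):
    """Extended filtering: subtags of the range must appear in order in
    the tag, but the tag may carry extra subtags in between."""
    if not tag:
        return False
    tag_parts = tag.lower().split("-")
    range_parts = lang_range.lower().split("-")
    if range_parts[0] != "*" and range_parts[0] != tag_parts[0]:
        return False
    t = 1
    for r in range_parts[1:]:
        if r == "*":
            continue  # a wildcard subtag matches any run of subtags
        while True:
            if t >= len(tag_parts):
                return False  # ran out of tag subtags
            if tag_parts[t] == r:
                t += 1
                break
            if len(tag_parts[t]) == 1:
                return False  # singletons may not be skipped
            t += 1
    return True


# The three language tags from the test data, checked against "de-de".
tags = ["de", "de-de", "de-Latn-de"]
basic = [t for t in tags if lang_matches_basic(t, "de-de")]
extended = [t for t in tags if lang_matches_extended(t, "de-de")]
```

Under basic filtering only the de-de literal survives, which is what
the expected results say; an extended matcher would also let
de-Latn-de through and so fail the test.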
>>> thanks,
>>> Lee

-- 
-eric

office: +1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
mobile: +1.617.599.3509

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.

Received on Saturday, 11 August 2007 21:10:11 UTC