Re: SPARQL and string literal matching woes - spec inconclusive - try 2

> Second case is where the SPARQL query is:
>
>  SELECT ?x where { ?x ns:p "value"^^xsd:string }
>
> In this case too, we get implementations that return just ns:b and
> implementations that return ns:a and ns:b.

This is exactly the same as the prior case; it's controlled by the  
entailment regime (or bugs in the implementation!).

> The third case is a bit more complex SPARQL query:
>
>  SELECT ?x where { ?x ns:p ?y FILTER (?y = "value") }
>
> This is somewhat trickier. The SPARQL specification defines that
> operator fn:compare is used to match between plain literal pairs and
> also between xsd:string pairs. I am not sure if the specification
> defines how a string literal in a filter clause should be interpreted
> - that is, if "value" is actually a plain literal or just some
> ephemeral string type.

It's a simple literal.

This query on these data involves four comparisons:

ns:a  |  simple literal = simple literal
ns:b  |  xsd:string     = simple literal
ns:c  |  typed literal  = simple literal   # changes if we know that
                                           # dt:datatype is numeric
ns:d  |  literal        = simple literal

which, *in an unextended implementation*, means we call

op:numeric-equal(fn:compare(x, y), 0)
RDFterm-equal(x, y)
RDFterm-equal(x, y)
RDFterm-equal(x, y)

The result of the comparison is defined by those functions.
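
To make that dispatch concrete, here is a rough Python sketch of how an unextended implementation might evaluate the filter. Everything in it is illustrative rather than taken from any real engine: the Literal class, the function names, and the lexical forms of the data are my assumptions (the data itself is not reproduced in this message, and ns:d is left out because its exact form is not stated here). Only literals are modelled, so the "false" branch of RDFterm-equal for non-literal terms does not appear.

```python
from dataclasses import dataclass
from typing import Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


class SparqlTypeError(Exception):
    """Stands in for the type error that a SPARQL operator can produce."""


@dataclass(frozen=True)
class Literal:
    # Hypothetical model of an RDF literal, for illustration only.
    lexical: str
    datatype: Optional[str] = None  # None means no datatype IRI

    def is_simple(self) -> bool:
        return self.datatype is None


def rdfterm_equal(a: Literal, b: Literal) -> bool:
    # Same RDF term: true. Two literals that are not the same term: type error.
    if a == b:
        return True
    raise SparqlTypeError(f"not comparable: {a} and {b}")


def equals(a: Literal, b: Literal) -> bool:
    both_simple = a.is_simple() and b.is_simple()
    both_strings = a.datatype == XSD_STRING and b.datatype == XSD_STRING
    if both_simple or both_strings:
        # op:numeric-equal(fn:compare(A, B), 0); Python compares strings by
        # code point, which matches the default codepoint collation.
        comparison = (a.lexical > b.lexical) - (a.lexical < b.lexical)
        return comparison == 0
    # No more specific row applies, so fall through to RDFterm-equal.
    return rdfterm_equal(a, b)


def filter_keeps(a: Literal, b: Literal) -> bool:
    # A FILTER drops a solution when its expression raises a type error.
    try:
        return equals(a, b)
    except SparqlTypeError:
        return False


# Assumed data: the lexical forms are guesses, dt:datatype is an unknown type.
data = {
    "ns:a": Literal("value"),
    "ns:b": Literal("value", XSD_STRING),
    "ns:c": Literal("value", "dt:datatype"),
}

matches = [s for s, o in data.items() if filter_keeps(o, Literal("value"))]
print(matches)  # ['ns:a']
```

Under these assumptions only ns:a survives: the other comparisons end in a type error, not in "false", and the filter removes those rows.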

Note that:

1. implementations can add rows to the operator mapping table -- e.g.,  
a row for xsd:string to simple-literal. This can result in additional  
answers to your query. Many implementations do this (in fact, the  
SPARQL test suite expects some extensions!).

2. under D-entailment, new triples will actually be considered,  
producing more output rows (which might be redundant).
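
Point 1 can be sketched the same way. The Python below treats the operator mapping as an ordered list of rows and shows what adding one extra row for xsd:string against simple literal does to the answers. The row layout, the names, and the data are all assumptions of mine for illustration, not the internals of any particular implementation.

```python
from typing import Callable, List, NamedTuple, Optional, Tuple

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


class Literal(NamedTuple):
    # Hypothetical model of an RDF literal, for illustration only.
    lexical: str
    datatype: Optional[str] = None  # None means a simple literal


def kind(lit: Literal) -> str:
    if lit.datatype is None:
        return "simple"
    return "xsd:string" if lit.datatype == XSD_STRING else "other"


def same_lexical(a: Literal, b: Literal) -> bool:
    # Equivalent to op:numeric-equal(fn:compare(A, B), 0).
    return a.lexical == b.lexical


Test = Callable[[Literal, Literal], bool]
Row = Tuple[Test, Test]  # (does this row apply?, how to compare)

# Rows an unextended implementation has for string-like operands.
BASE_ROWS: List[Row] = [
    (lambda a, b: kind(a) == kind(b) == "simple", same_lexical),
    (lambda a, b: kind(a) == kind(b) == "xsd:string", same_lexical),
]

# The kind of extension row described in point 1.
MIXED_STRING_ROW: Row = (
    lambda a, b: {kind(a), kind(b)} == {"simple", "xsd:string"},
    same_lexical,
)


def filter_keeps(rows: List[Row], a: Literal, b: Literal) -> bool:
    for applies, compare in rows:
        if applies(a, b):
            return compare(a, b)
    # Fallback is RDFterm-equal: true for the same term, otherwise a type
    # error for two literals, which makes FILTER drop the row.
    return a == b


# Assumed data: the lexical forms are guesses, dt:datatype is an unknown type.
data = {
    "ns:a": Literal("value"),
    "ns:b": Literal("value", XSD_STRING),
    "ns:c": Literal("value", "dt:datatype"),
}
constant = Literal("value")

unextended = [s for s, o in data.items() if filter_keeps(BASE_ROWS, o, constant)]
extended = [
    s for s, o in data.items()
    if filter_keeps(BASE_ROWS + [MIXED_STRING_ROW], o, constant)
]
print(unextended)  # ['ns:a']
print(extended)    # ['ns:a', 'ns:b']
```

Both result sets are legitimate under this reading, which is why implementations can disagree on the query without either being buggy.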


> The fourth case is again similar to the one before:
>
>  SELECT ?x where { ?x ns:p ?y FILTER (?y = "value"^^xsd:string) }

Same thing.

> The fifth case uses yet another new operator:
>
>  SELECT ?x where { ?x ns:p ?y FILTER (sameTerm(?y, "value")) }
>
> This case seems to be a clear cut decision in my opinion.

It's not an operator, which is why it's clear cut. The implementation  
of sameTerm is not up for debate, so this produces well-defined  
results (as would calling RDFterm-equal directly, rather than hoping  
that = would yield such a call).
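
For contrast, here is a minimal Python sketch of why sameTerm leaves no room for variation: it is plain term identity, with no operator table and no type error. The Literal class and the data are assumptions of mine for illustration (language tags are left out to keep it short).

```python
from typing import NamedTuple, Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


class Literal(NamedTuple):
    # Hypothetical model of an RDF literal, for illustration only.
    lexical: str
    datatype: Optional[str] = None  # None means a simple literal


def same_term(a: Literal, b: Literal) -> bool:
    # True only when lexical form and datatype are both identical;
    # never raises, and never consults the operator mapping.
    return a == b


# Assumed data: the lexical forms are guesses.
data = {
    "ns:a": Literal("value"),
    "ns:b": Literal("value", XSD_STRING),
}

with_simple = [s for s, o in data.items() if same_term(o, Literal("value"))]
with_typed = [
    s for s, o in data.items() if same_term(o, Literal("value", XSD_STRING))
]
print(with_simple)  # ['ns:a']
print(with_typed)   # ['ns:b']
```

An implementation that returns both subjects for either constant is not applying term identity.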

> However, even in this case, I found some implementations which return
> both ns:a and ns:b instead of just ns:a. These I would personally
> classify as non-conforming implementations.

I would agree.

> The sixth case is again a variation of the one before:
>
>  SELECT ?x where { ?x ns:p ?y FILTER (sameTerm(?y,  
> "value"^^xsd:string)) }
>
> In this case, there should be no question as to whether
> "value"^^xsd:string is a typed literal or not.

The SPARQL grammar is clear that "foo" is a simple literal and that
"foo"^^xsd:string is a typed literal.

> Even still, some implementations return both ns:a and ns:b instead of
> just ns:b in this case.

Still a bug.

HTH,

-R

Received on Saturday, 5 July 2008 18:29:33 UTC