Equality in FILTER [#ValueTesting]

I offered at last week's telecon to write this up, following on from:
http://lists.w3.org/Archives/Public/public-rdf-dawg/2006AprJun/0032.html

(There are a couple of notes where rq23 is wrong or inconsistent - nothing major).

==== Current Situation

"=" and "!=" are overloaded operators in SPARQL syntax.

"=" can be:
numeric-equals
dateTime-equals
RDFterm-equals
        RDFterm-equals is actually three "-equals"
          IRI-equals
          Literal-equals
          BlankNode-equals

There is neither string-equals nor boolean-equals.  RDFterm-equals gets the
right answer for the former; the latter is a bug in SPARQL ("1"^^xsd:boolean !=
"true"^^xsd:boolean currently).

== Note 1: boolean comparison is not currently defined by value in rq23.
As things stand, there need to be entries for it under "XPath Tests" in 11.3.
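
To make the boolean case concrete, here is a rough Python sketch (mine, not
from rq23). The Literal type and function names are made up for illustration,
and language tags are left out for brevity. It contrasts term equality with a
by-value comparison for xsd:boolean:

```python
from typing import NamedTuple, Optional

XSD = "http://www.w3.org/2001/XMLSchema#"


class Literal(NamedTuple):
    """Simplified RDF literal (illustrative; language tags omitted)."""
    lexical: str
    datatype: Optional[str] = None  # datatype IRI, or None for a plain literal


def rdfterm_equals(a: Literal, b: Literal) -> bool:
    # Term equality: same lexical form and same datatype IRI.
    return a == b


def boolean_value(lit: Literal) -> bool:
    # The xsd:boolean lexical space is {"true", "false", "1", "0"}.
    if lit.datatype != XSD + "boolean":
        raise TypeError("not an xsd:boolean literal")
    if lit.lexical in ("true", "1"):
        return True
    if lit.lexical in ("false", "0"):
        return False
    raise ValueError(f"ill-formed xsd:boolean: {lit.lexical!r}")


one = Literal("1", XSD + "boolean")
true = Literal("true", XSD + "boolean")

# Compared as terms the two literals differ; compared by value they agree.
term_equal = rdfterm_equals(one, true)
value_equal = boolean_value(one) == boolean_value(true)
```

Here term_equal comes out False and value_equal comes out True, which is the
mismatch described above.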

numeric-equals and dateTime-equals are value tests.

There is a requirement "3.3 Extensible Value Testing"
        http://www.w3.org/TR/rdf-dawg-uc/#r3.3

==== Clarification

One way to get a consistent overall design would be to say that:

* "=" means "sameValueAs" to within the knowledge of the processor.
* "!=" is "notTheSameValueAs" to within the knowledge of the processor.

Similarly for "<", ">", "<=" and ">=".


So, if a processor suddenly understands Roman Numerals, then it is possible to 
say:
    2 = "II"^^:RomanNumeral

More usefully, it allows new date formats, or different notations for
scientific numbers, to be added to a processor.

Two RDF terms that are RDFterm-equals are always sameValueAs, so a processor
will always know that a term is sameValueAs itself.
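
As an illustration only, the "to within the knowledge of the processor" idea
could be sketched in Python like this. The parser registry, the RomanNumeral
IRI and all function names are hypothetical. Returning False for an unknown
datatype is just one choice; what should happen there is the question taken up
under the unknown-datatypes heading.

```python
from typing import Callable, Dict, NamedTuple


class Literal(NamedTuple):
    """Simplified typed literal (illustrative)."""
    lexical: str
    datatype: str


XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"
ROMAN = "http://example.org/RomanNumeral"  # hypothetical datatype IRI


def parse_roman(text: str) -> int:
    """Map a well-formed Roman numeral to its integer value."""
    values = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}
    total = 0
    for i, ch in enumerate(text):
        v = values[ch]
        # Subtractive notation: a smaller numeral before a larger one.
        if i + 1 < len(text) and v < values[text[i + 1]]:
            total -= v
        else:
            total += v
    return total


# What this processor "knows": datatype IRI -> lexical-to-value mapping.
parsers: Dict[str, Callable[[str], int]] = {XSD_INTEGER: int}


def same_value_as(a: Literal, b: Literal) -> bool:
    if a == b:
        return True  # identical terms always have the same value
    if a.datatype in parsers and b.datatype in parsers:
        return parsers[a.datatype](a.lexical) == parsers[b.datatype](b.lexical)
    return False  # not known to be the same


two = Literal("2", XSD_INTEGER)
ii = Literal("II", ROMAN)

before = same_value_as(two, ii)   # processor does not know RomanNumeral yet
parsers[ROMAN] = parse_roman      # processor "suddenly understands" it
after = same_value_as(two, ii)
```

The same comparison changes from False to True once the datatype is
registered, while a term compared with itself is sameValueAs in both states.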

The XSD types (xsd:integer, xsd:double, xsd:float, xsd:decimal) are types the
processor is expected to know about. The subtype rules also need to apply:

== Note 2: rq23 actually says:
[[
numeric denotes typed literals with datatypes xsd:integer, xsd:decimal,
xsd:float, and xsd:double.
]]
so by that text you can't say "2"^^xsd:byte = "2"^^xsd:integer in SPARQL
because the operator chosen will be RDFterm-equals and the XSD subtype rules
are not applied. This should be clarified.  It does later go on to say
something that strongly hints at this, so this is just editorial.
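
A small Python sketch of the difference, for illustration. The function names
and the use_subtypes flag are mine, not rq23's. The derivation chain is the
xsd:byte branch from XML Schema Part 2.

```python
from typing import Optional

# Derivation chain from XML Schema Part 2 (only the branch needed here).
BASE_TYPE = {
    "xsd:byte": "xsd:short",
    "xsd:short": "xsd:int",
    "xsd:int": "xsd:long",
    "xsd:long": "xsd:integer",
    "xsd:integer": "xsd:decimal",
}

# The four datatypes the quoted rq23 text lists as "numeric".
NUMERIC = frozenset({"xsd:integer", "xsd:decimal", "xsd:float", "xsd:double"})


def is_numeric(datatype: str, use_subtypes: bool) -> bool:
    if not use_subtypes:
        return datatype in NUMERIC
    # Walk up the derivation chain until a listed numeric type is found.
    current: Optional[str] = datatype
    while current is not None:
        if current in NUMERIC:
            return True
        current = BASE_TYPE.get(current)
    return False


def choose_equals(dt_a: str, dt_b: str, use_subtypes: bool) -> str:
    """Pick the operator that "=" dispatches to for two typed literals."""
    if is_numeric(dt_a, use_subtypes) and is_numeric(dt_b, use_subtypes):
        return "numeric-equals"
    return "RDFterm-equals"


literal_reading = choose_equals("xsd:byte", "xsd:integer", use_subtypes=False)
subtype_reading = choose_equals("xsd:byte", "xsd:integer", use_subtypes=True)
```

Read literally, the quoted text selects RDFterm-equals for the xsd:byte case;
with the subtype rules applied, numeric-equals is selected instead.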

==== When datatypes are unknown to the processor

Geoff raised a question about what happens when a processor does not 
understand one or both of the datatypes involved in an "=" or "!=" operation.

Alt 1: "sameValueAs" is true only when the processor definitely knows the two
values are the same, and likewise for "notTheSameValueAs". Both can then be
false at the same time: either the processor has no clue about the datatype
and the lexical forms are different, or there are two different datatypes, at
least one of which is unknown to the processor.

That is
    not(sameValueAs(?x,?y)) is different from "notTheSameValueAs(?x,?y)"

Alt 2: Alternatively, it can be a "can't compare error", so

    not(sameValueAs(?x,?y)) is the same as "notTheSameValueAs(?x,?y)"

with sameValueAs and notTheSameValueAs returning true or false only when the
processor positively knows that to be the case.
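
The two alternatives can be sketched side by side in Python. This is only an
illustration: the compare helper, the KNOWN table, the CannotCompare exception
and the ex:unknown datatype are all made up for the example.

```python
from typing import NamedTuple, Optional


class Literal(NamedTuple):
    """Simplified typed literal (illustrative)."""
    lexical: str
    datatype: str


# Datatypes this hypothetical processor understands.
KNOWN = {"xsd:integer": int}


class CannotCompare(Exception):
    """Raised under Alt 2 when the processor cannot decide."""


def compare(a: Literal, b: Literal) -> Optional[bool]:
    """True = known same, False = known different, None = not known."""
    if a == b:
        return True
    if a.datatype in KNOWN and b.datatype in KNOWN:
        return KNOWN[a.datatype](a.lexical) == KNOWN[b.datatype](b.lexical)
    return None


# Alt 1: each function is true only on positive knowledge.
def same_alt1(a: Literal, b: Literal) -> bool:
    return compare(a, b) is True


def not_same_alt1(a: Literal, b: Literal) -> bool:
    return compare(a, b) is False


# Alt 2: an undecidable comparison is an error, so the two are exact negations.
def same_alt2(a: Literal, b: Literal) -> bool:
    result = compare(a, b)
    if result is None:
        raise CannotCompare(f"cannot compare {a} and {b}")
    return result


def not_same_alt2(a: Literal, b: Literal) -> bool:
    return not same_alt2(a, b)


x = Literal("a", "ex:unknown")
y = Literal("b", "ex:unknown")

alt1_same = same_alt1(x, y)          # not known to be the same
alt1_not_same = not_same_alt1(x, y)  # not known to be different either

try:
    same_alt2(x, y)
    alt2_raised = False
except CannotCompare:
    alt2_raised = True
```

Under Alt 1 both tests are false for x and y, so not(sameValueAs) and
notTheSameValueAs disagree; under Alt 2 the comparison raises instead.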

 Andy

Received on Tuesday, 18 April 2006 09:16:09 UTC