Re: Question concerning typed literals in SPARQL

On Wed, Nov 30, 2005 at 03:58:17PM +0000, Jeremy Carroll wrote:
> 
> 
> This question concerns your document:
> http://www.w3.org/TR/2005/WD-rdf-sparql-query-20051123/
> 
> In SWBPD WG, we have been discussing the semantics of typed literals.
> 
> In particular, we are trying to decide between the three possibilities 
> outlined in:
> 
> http://www.w3.org/TR/2005/WD-swbp-xsch-datatypes-20050427/#sec-values
> 
> The third of these (True Values) has not received any support.
> 
> The second solution, based around XPath eq, is motivated to try and give 
> a smoother experience to end users who may find data for which the 
> choices between say xsd:double and xsd:decimal have not been consistent.
> 
> Advocates of the first solution (Primitive Equality), which treats 
> xsd:decimal and xsd:double as disjoint, have argued that the same end 
> user functionality can be achieved by combining the first solution with 
> SPARQL.
> 
> The purpose of this e-mail is to confirm that line of argument with you.
> 
> 
> In this first solution (Primitive Equality) equality of typed literals 
> is determined by comparing literals using their primitive base type, and 
> treating all primitive base types as different.
> In this
> "1.3"^^xsd:float
> "1.3"^^xsd:double
> "1.3"^^xsd:decimal
> "1"^^xsd:float
> "1"^^xsd:double
> "1"^^xsd:decimal
> all have different values.
> 
> My understanding is that SPARQL does not specify whether the store being 
> queried is required or not to treat two literals with the same value but 
> different syntactic form as the same or different.
> If we have two stores A and B where A compares literals syntactically, 
> but B compares literal by value, and the value comparisons are done with 
> the Primitive Equality semantics described as above, then my 
> understanding is that the following results would hold.

For graph matching, B compares literals by value; therefore, B's
database effectively contains all of the extra triples stemming from
the known equivalences. SPARQL's graph patterns are matched against
this larger set of triples.

For FILTERs, the semantics are entirely defined in SPARQL, so no
extra-clever database will give you extra-clever FILTERs.

> If the following triples are loaded into both A and B
> 
> <eg:decimal> <eg:p> "1.3"^^xsd:decimal .
> <eg:float> <eg:p> "1.3"^^xsd:float .
> <eg:double> <eg:p> "1.3"^^xsd:double .
> <eg:decimal2> <eg:p> "1.300"^^xsd:decimal .
> 
> Then:
> 
> SELECT  ?s, ?p
> WHERE   { ?s, ?p, 1.3 } .
> 
> would match
> 
> <eg:decimal> <eg:p> "1.3"^^xsd:decimal .
> in A
> 
> and
> 
> 
> <eg:decimal> <eg:p> "1.3"^^xsd:decimal .
> <eg:decimal2> <eg:p> "1.300"^^xsd:decimal .
> in B

Agreed
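
To make the A-versus-B difference concrete, here is a rough sketch in
Python (not SPARQL, and not how any particular store works). The
(lexical form, datatype) tuples and the helper names are made up for
illustration, and the bare 1.3 in the query is assumed to be an
xsd:decimal, as the quoted example implies.

```python
from decimal import Decimal

# Hypothetical in-memory model: each literal is a (lexical form, datatype) pair.
TRIPLES = [
    ("eg:decimal",  "eg:p", ("1.3",   "xsd:decimal")),
    ("eg:float",    "eg:p", ("1.3",   "xsd:float")),
    ("eg:double",   "eg:p", ("1.3",   "xsd:double")),
    ("eg:decimal2", "eg:p", ("1.300", "xsd:decimal")),
]

def syntactic_key(literal):
    # Store A: literals match only if lexical form and datatype are identical.
    return literal

def value_key(literal):
    # Store B: Primitive Equality. The datatype stays part of the key, so
    # values of different primitive types never compare equal.
    lexical, datatype = literal
    if datatype == "xsd:decimal":
        return (datatype, Decimal(lexical))
    # Simplification: floats and doubles are both parsed as Python floats;
    # that is enough here because the datatype already keeps them apart.
    return (datatype, float(lexical))

def match(triples, query_literal, key):
    # Return the subjects whose object literal equals the query literal.
    return [s for s, _p, o in triples if key(o) == key(query_literal)]

QUERY = ("1.3", "xsd:decimal")  # assumed type of the bare 1.3 in the query
store_a = match(TRIPLES, QUERY, syntactic_key)
store_b = match(TRIPLES, QUERY, value_key)
print("A:", store_a)
print("B:", store_b)
```

Under these assumptions A reports only eg:decimal, while B also
reports eg:decimal2, because "1.3" and "1.300" denote the same decimal
value.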

> Whereas:
> 
> SELECT  ?s, ?p
> WHERE   { ?s, ?p, ?o .
>            FILTER (?o = 1.3) . } .
> 
> would match all four triples for both A and B, since = is interpreted as 
> in fn:numeric-equals() and type promotions apply to give equality in all 
> cases.
> 
> However,
> 
> SELECT  ?s, ?p
> WHERE   { ?s, ?p, ?o .
>            FILTER (?o = 1.3e0) . } .
> 
> would match the following triples
> 
> <eg:decimal> <eg:p> "1.3"^^xsd:decimal .
> <eg:double> <eg:p> "1.3"^^xsd:double .
> <eg:decimal2> <eg:p> "1.300"^^xsd:decimal .

@FLOAT@

> 
> because the numeric rules would cast "1.3"^^xsd:float to the nearest 
> double, which is not "1.3"^^xsd:double.

I've been scratching my head about this for a while.
Is this a last-bit-might-be-wrong problem, or something more related
to type promotion voodoo?
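
One way to probe this outside any SPARQL engine is a small Python
sketch. It assumes xsd:float and xsd:double correspond to IEEE 754
single and double precision; to_float32 is a made-up helper, not part
of any SPARQL or XPath library.

```python
import struct

def to_float32(x):
    # Round a Python float (an IEEE 754 double) to the nearest
    # single-precision value, then widen it back to a double, which is
    # what a float-to-double promotion would produce.
    return struct.unpack("f", struct.pack("f", x))[0]

as_double = 1.3              # stands in for "1.3"^^xsd:double
promoted = to_float32(1.3)   # stands in for "1.3"^^xsd:float after promotion

print(repr(as_double))
print(repr(promoted))
print(promoted == as_double)          # False
print(abs(promoted - as_double))      # size of the gap
```

If the gap printed on the last line is on the order of
single-precision rounding error rather than double-precision rounding
error, then the mismatch is the float's own rounding of 1.3 surviving
the promotion, not a last-bit problem in the double. Values that are
exact in single precision, such as 1, would not show the effect.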

> If an application wanted to explicitly do the equality with floating 
> point precision (rather than double precision), I understand the 
> following query could be used:
> 
> SELECT  ?s, ?p
> WHERE   { ?s, ?p, ?o .
>            FILTER (xsd:float(?o) = xsd:float(1.3) ) . } .
> 
> using explicit casts.
> This would return all four triples.
> 
> Please indicate whether these examples are correct.
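
The explicit-cast query can be approximated the same way in Python.
This is only a sketch: cast_to_float32 is a hypothetical stand-in for
the xsd:float() cast, it assumes IEEE 754 single precision, and it
goes through a double on the way, which may not be exactly how a given
implementation casts an xsd:decimal.

```python
import struct
from decimal import Decimal

def cast_to_float32(lexical):
    # Hypothetical stand-in for xsd:float(): parse the lexical form,
    # then round to the nearest IEEE 754 single-precision value.
    as_double = float(Decimal(lexical))
    return struct.unpack("f", struct.pack("f", as_double))[0]

# Subject -> lexical form of its object literal, from the quoted data.
LITERALS = {
    "eg:decimal":  "1.3",
    "eg:float":    "1.3",
    "eg:double":   "1.3",
    "eg:decimal2": "1.300",
}

target = cast_to_float32("1.3")  # xsd:float(1.3)
matches = [s for s, lex in LITERALS.items() if cast_to_float32(lex) == target]
print(matches)
```

With both sides forced down to single precision, all four subjects
come back in this sketch, in line with the quoted expectation.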

I don't have rigorous enforcement of numeric types in my
implementation (I rely on Perl to do all the numeric comparisons).
However, all this seems consistent, except that I don't understand
why "1.3"^^xsd:float is missing at the @FLOAT@ marker (above).

> thanks
> 
> Jeremy Carroll
> 
> PS I am arguing in the SWBPD WG, that since SPARQL adequately addresses 
> the needs to make looser comparisons of the sorts above, where float and 
> decimal and doubles are treated equivalently, then the next version of
> 
> 
> http://www.w3.org/TR/2005/WD-swbp-xsch-datatypes-20050427/
> 
> should be presenting primitive equality as the preferred semantics, and 
> any further equivalences required by an application to be ones for the 
> application to determine, for example, by use of queries such as those 
> given here.
> 
> PPS Note I am pleased to see the greater clarity in your latest WD 
> concerning the type of '1.3' in SPARQL. I found it hard to tell in the 
> earlier draft which datatype was intended. Personally I have no opinion 
> as to which datatype is better, but I support the "in progress" change 
> highlighted at the beginning of section 3 from an editorial point of view.

Andy, take a bow.
-- 
-eric

office: +81.466.49.1170 W3C, Keio Research Institute at SFC,
                        Shonan Fujisawa Campus, Keio University,
                        5322 Endo, Fujisawa, Kanagawa 252-8520
                        JAPAN
        +1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
cell:   +81.90.6533.3882

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.

Received on Monday, 5 December 2005 10:48:39 UTC