Re: !=, NOT IN and type errors

On 27/03/2011 11:57, Eric Prud'hommeaux wrote:
> * Jeen Broekstra<jeen.broekstra@gmail.com>  [2011-03-27 10:59+1300]

[snip]

>> Now, we apply RDFterm-equal to our comparison. Both operands are
>> literals. The definition says that two literals are RDF-term-equal
>> if they are "equivalent literals" according to def 6.5.1 in RDF
>> concepts. Since our operands have different lexical values as well
>> as different datatypes, literal equality fails. The definition of
>> RDFterm-equal then says: "[it] produces a type error when both
>> operands are literal but are not the same RDF term". Since in our
>> case both are literal but are not the same term, it results in a
>> type error. So RDFterm-equal results in a type error, and since
>> applying fn:not on a type error results in a type error,
>> "foo"^^xsd:string != "4"^^xsd:integer evaluates to a type error.
>> This is what I originally thought happened, and what I thought was
>> undesirable.
>
> Ahh, this is what I believe to have been the design goal. The problem
> is exemplified by
>    "iiii"^^my:romanNumeral = "iv"^^my:romanNumeral

Yes, quite, and I agree that this really is the only way to handle 
non-native datatypes. It's the fact that it also works this way for 
native datatypes that annoys me. It leads to different SPARQL 
implementations giving different results for very basic comparisons.
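
For illustration, a minimal Python sketch of the unextended behaviour 
described above. It is an assumption-laden model, not code from any 
implementation: literals are plain (lexical form, datatype IRI) pairs, and 
SparqlTypeError, rdfterm_equal and not_equal are hypothetical names.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"


class SparqlTypeError(Exception):
    """Stands in for a SPARQL type error (hypothetical name)."""


def rdfterm_equal(a, b):
    """Unextended RDFterm-equal for two literals.

    Each literal is modelled as a (lexical form, datatype IRI) pair.
    """
    if a == b:
        return True  # same RDF term
    # Both operands are literals but not the same RDF term.
    raise SparqlTypeError("literals are not the same RDF term")


def not_equal(a, b):
    """A != B as fn:not(RDFterm-equal(A, B)); a type error propagates."""
    return not rdfterm_equal(a, b)
```

With this model, not_equal(("foo", XSD + "string"), ("4", XSD + "integer")) 
raises SparqlTypeError rather than returning True.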

> The answer there is probably a non-controversial "beats me". The
> answer to "foo"^^xsd:string != "4"^^xsd:integer in unextended
> implementations is also "beats me", but that doesn't keep you from
> adding operators for e.g. xsd:string and xsd:integer for which the
> answer to "=" is false. (You probably want a switch to enable you
> to run the unextended tests, which have no way of distinguishing
> between sagely extension and wanton impudence.)

True. In fact I have already implemented this in Sesame this way (except 
for the switch bit). I am just of the opinion that it is unwise to leave 
this decision up to individual implementations, at least for the native 
datatypes.

I would be much happier if the datatypes that SPARQL claims to support 
(that is, the ones mentioned in "Operand Data types") are indeed fully 
supported, i.e. it is fixed in the spec that string and int (and the 
other supported types) are pairwise distinct.

> We could take the bold step of declaring the 19 native types pairwise
> distinct by either adding a bazillion rows to the operator mapping

Surely it doesn't need that many? The numeric types are already grouped 
under the name "numeric" (you could do the same for dateTime, date, 
etc, by the way, e.g. "calendar"). I'd say this would be enough:

Op      type(A)   type(B)   maps to
-----------------------------------------------
A != B  string    numeric   TRUE
A != B  string    boolean   TRUE
A != B  string    calendar  TRUE
A != B  numeric   calendar  TRUE
A != B  numeric   boolean   TRUE
A != B  calendar  boolean   TRUE

Somewhat clunky, perhaps, but still manageable I'd say.
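
As a rough Python sketch of how few entries such rows would need: the 
CATEGORY grouping below is an assumption made for illustration (in 
particular the "calendar" group and its members are only the suggestion 
from this mail, not something in the spec), and not_equal_by_category is a 
hypothetical helper name.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"

# Assumed grouping of datatypes into the categories used in the table.
# The "calendar" category is hypothetical; it is not defined by SPARQL.
CATEGORY = {
    XSD + "string": "string",
    XSD + "boolean": "boolean",
    XSD + "integer": "numeric",
    XSD + "decimal": "numeric",
    XSD + "float": "numeric",
    XSD + "double": "numeric",
    XSD + "dateTime": "calendar",
    XSD + "date": "calendar",
}


def not_equal_by_category(datatype_a, datatype_b):
    """Apply the proposed A != B rows to two datatype IRIs.

    Returns True when the operands fall into two different categories,
    and None when the rows do not apply (same category, or a datatype
    that is not in the table), so the existing rules decide instead.
    """
    cat_a = CATEGORY.get(datatype_a)
    cat_b = CATEGORY.get(datatype_b)
    if cat_a is None or cat_b is None:
        return None  # non-native datatype: not covered by these rows
    if cat_a != cat_b:
        return True  # one of the six rows above matches
    return None  # same category: left to the existing operator mapping
```

Because the lookup is symmetric, the six rows cover both operand orders.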

> or
> expanding the definition of RDFterm-equal like so:
> [[
>    Returns TRUE if term1 and term2 are the same RDF term as defined in
>    Resource Description Framework (RDF): Concepts and Abstract Syntax
>    [CONCEPTS];
> + returns FALSE if the arguments are different types but both are
> + listed in §17.1 Operand Data Types;

This is not quite enough as it does not check that the operands are 
_valid_ typed literals. You'd still want "xyz"^^xsd:integer != 
"foo"^^xsd:string to raise a type error, I think.

You could add that condition too I guess, but it all gets a bit 
convoluted - perhaps adding some rows to the mapping table is easier here?
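
To make the extra condition concrete, here is a minimal Python sketch of 
the extended check with validity included. Everything in it is an 
assumption for illustration: literals are (lexical form, datatype IRI) 
pairs, the lexical checks are deliberately simplified and cover only three 
datatypes, and SparqlTypeError and extended_not_equal are hypothetical 
names.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"

# Simplified lexical checks, for illustration only; these do not cover
# the full XSD lexical spaces or all supported datatypes.
LEXICAL_CHECK = {
    XSD + "string": lambda lex: True,
    XSD + "integer": lambda lex: re.fullmatch(r"[+-]?[0-9]+", lex) is not None,
    XSD + "boolean": lambda lex: lex in ("true", "false", "1", "0"),
}


class SparqlTypeError(Exception):
    """Stands in for a SPARQL type error (hypothetical name)."""


def extended_not_equal(a, b):
    """A != B for two literals, each a (lexical form, datatype IRI) pair."""
    if a == b:
        return False  # same RDF term, so != is false
    for lex, datatype in (a, b):
        check = LEXICAL_CHECK.get(datatype)
        if check is None or not check(lex):
            # Unsupported datatype or ill-formed lexical form.
            raise SparqlTypeError("not a valid supported literal: %r" % lex)
    if a[1] != b[1]:
        return True  # two valid literals of different supported types
    # Same datatype, different lexical forms: value comparison is the
    # job of the type-specific operators, which this sketch omits.
    raise SparqlTypeError("same-datatype comparison is not modelled here")
```

Here "foo"^^xsd:string != "4"^^xsd:integer gives True, while 
"xyz"^^xsd:integer != "foo"^^xsd:string still raises a type error.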

Cheers,

Jeen

Received on Monday, 28 March 2011 21:14:33 UTC