Re: [TF-LIB] IN operator from Steve Harris on 2010-02-07 (public-rdf-dawg@w3.org from January to March 2010)

From: Steve Harris <steve.harris@garlik.com>
Date: Sun, 7 Feb 2010 23:25:00 +0000
To: Andy Seaborne <andy.seaborne@talis.com>
Cc: SPARQL Working Group <public-rdf-dawg@w3.org>
Message-Id: <43128C32-2E34-46DB-88DE-2AED05184097@garlik.com>

I have a preference for it being sameTerm, rather than =.

Otherwise "1.0"^^xsd:decimal IN ("1"^^xsd:integer,  
"1.0e0"^^xsd:double) would be true, which seems counterintuitive to  
me. i.e.

IN ==>
sameTerm(expr, expr1) || sameTerm(expr, expr2) || ...

NOT IN ==>
!(sameTerm(expr, expr1) || sameTerm(expr, expr2) || ... )
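
To make the difference concrete, here is a minimal sketch in Python rather than SPARQL. The Literal class and the same_term, value_equal and in_by helpers are illustrative names, not part of any SPARQL library, and value_equal is a simplification that compares numeric lexical forms as decimals and ignores real xsd:double behaviour.

```python
from dataclasses import dataclass
from decimal import Decimal

@dataclass(frozen=True)
class Literal:
    lexical: str   # lexical form, e.g. "1.0"
    datatype: str  # datatype name, e.g. "xsd:decimal"

def same_term(a, b):
    # Same RDF term: identical lexical form and identical datatype.
    return a == b

def value_equal(a, b):
    # Simplified numeric "=": compare values and ignore the datatype.
    return Decimal(a.lexical) == Decimal(b.lexical)

def in_by(eq, expr, candidates):
    # IN rewritten as eq(expr, c1) || eq(expr, c2) || ...
    return any(eq(expr, c) for c in candidates)

lhs = Literal("1.0", "xsd:decimal")
rhs = [Literal("1", "xsd:integer"), Literal("1.0e0", "xsd:double")]

print(in_by(value_equal, lhs, rhs))  # True
print(in_by(same_term, lhs, rhs))    # False
```

With "=" the decimal matches both the integer and the double; with sameTerm it matches neither, because the lexical forms and datatypes differ.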

It's always possible to cast the LHS if you want a more lax IN
predicate, e.g. xsd:integer(?x) IN (1, 2, 3).
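
A rough Python sketch of why the cast helps, assuming terms are modelled as (lexical form, datatype) pairs. The xsd_integer function here is a hypothetical stand-in for the real cast: it simply truncates a numeric lexical form to a canonical integer, which is an assumption and not the full casting rules.

```python
from decimal import Decimal

def xsd_integer(lexical):
    # Hypothetical stand-in for the xsd:integer() cast: produce a
    # (lexical form, datatype) pair in canonical integer form.
    return (str(int(Decimal(lexical))), "xsd:integer")

def in_same_term(expr, candidates):
    # IN as sameTerm(expr, c1) || sameTerm(expr, c2) || ...
    return any(expr == c for c in candidates)

# The plain integers 1, 2, 3 as typed terms.
candidates = [(str(n), "xsd:integer") for n in (1, 2, 3)]

print(in_same_term(("1.0", "xsd:decimal"), candidates))  # False
print(in_same_term(xsd_integer("1.0"), candidates))      # True
```

Without the cast the decimal term matches nothing in the list; after the cast it is the same term as the integer 1.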

In terms of semantics I prefer the first option, I think.

- Steve

On 7 Feb 2010, at 20:42, Andy Seaborne wrote:

> Proposal for the IN operator.
>
>
> IN is an operator with the same precedence as EQ etc.
>
> Syntax:
>    expr IN ( expr1, expr2, ....)
>    expr NOT IN ( expr1, expr2, ....)
>
> e.g.
>
>   FILTER ( ?x IN ('a', 'b', 'c') )
>   FILTER ( ?x NOT IN ('a', 'b', 'c') )
>
> (SQL has NOT IN, and !(expr IN ( expr1, expr2, ....)) is clunky.)
>
> Semantics:
>
> Evaluation is equivalent to writing out in long form:
>
> IN ==>
> expr =  expr1 || expr = expr2 || ...
>
> NOT IN ==>
> expr != expr1 && expr != expr2 && ...
>
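> That makes IN a special form, as || and && already are.  The
> arguments are not all evaluated first and then passed to the
> operator.  If the result can be definitely determined, it is returned
> even when another argument is an error; otherwise an error in any
> argument makes the whole expression an error: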
>
>  8 IN (1, 2, 3) is false
>  9 IN (1, 2, 1/0) is error
>
>  1 IN (1, 1/0, 3) is true
>  1 IN (3, 1/0, 1) is true
>
> because in SPARQL 1.0:
> 1 = 3 || 1 = 1/0 || 1 = 1
> is true
>
>  8 NOT IN (1, 2, 3) is true
>  9 NOT IN (1, 2, 1/0) is error
>
>  1 NOT IN (1, 1/0, 3) is false
>  1 NOT IN (3, 1/0, 1) is false
>
> because in SPARQL 1.0:
> 1 != 3 && 1 != 1/0 && 1 != 1
> is false
>
> The outcome of evaluation is independent of argument order.
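
The rewrite semantics above can be sketched in Python. This is only a model, under the assumption that each argument is a zero-argument callable and that a Python arithmetic error stands in for a SPARQL evaluation error; ExprError, evaluate, in_op and not_in_op are illustrative names.

```python
class ExprError(Exception):
    """Stands in for a SPARQL expression evaluation error."""

def evaluate(thunk):
    # Evaluate one argument, mapping Python arithmetic errors (such as
    # division by zero) to ExprError.
    try:
        return thunk()
    except ArithmeticError as exc:
        raise ExprError(str(exc))

def in_op(lhs, args):
    # expr = expr1 || expr = expr2 || ... : one true comparison makes the
    # whole disjunction true, even if another operand is an error.
    saw_error = False
    for arg in args:
        try:
            if lhs == evaluate(arg):
                return True
        except ExprError:
            saw_error = True
    if saw_error:
        raise ExprError("an operand failed and none matched")
    return False

def not_in_op(lhs, args):
    # expr != expr1 && expr != expr2 && ... : under these rules this is
    # the negation of IN, and an ExprError from in_op propagates.
    return not in_op(lhs, args)

div_by_zero = lambda: 1 / 0
print(in_op(1, [lambda: 3, div_by_zero, lambda: 1]))      # True
print(in_op(1, [lambda: 1, div_by_zero, lambda: 3]))      # True
print(not_in_op(1, [lambda: 3, div_by_zero, lambda: 1]))  # False
```

Calling in_op(9, [lambda: 1, lambda: 2, div_by_zero]) raises ExprError, matching the "9 IN (1, 2, 1/0) is error" case, and reordering the arguments never changes the outcome.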
>
> Alternatives:
>
> Alt 1: Strict function: the arguments are all evaluated first so any  
> error means the expression is an error.
>
>  1 IN (3, 1/0, 1) is error
>  1 IN (1, 1/0, 3) is error
>
> but it does mean all arguments must be evaluated even if not needed.
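
A minimal Python sketch of Alt 1, assuming each argument is a zero-argument callable and that Python's ZeroDivisionError stands in for a SPARQL error; in_strict is an illustrative name.

```python
def in_strict(lhs, args):
    # Evaluate every argument up front; the first failure (for example a
    # ZeroDivisionError) propagates before any comparison is made.
    values = [arg() for arg in args]
    return any(lhs == value for value in values)

print(in_strict(8, [lambda: 1, lambda: 2, lambda: 3]))  # False
```

Both in_strict(1, [lambda: 3, lambda: 1 / 0, lambda: 1]) and in_strict(1, [lambda: 1, lambda: 1 / 0, lambda: 3]) raise ZeroDivisionError here, even though a matching value is present.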
>
> Alt 2: Left-right evaluation:
> Arguments are evaluated left to right.  If an error is encountered,
> the expression is an error, but evaluation stops as soon as the
> result is known: true for IN, or false for NOT IN.  The order of the
> arguments now matters:
>
>  1 IN (3, 1/0, 1) is error
>  1 IN (1, 1/0, 3) is true
>
> and it isn't a rewrite to || and = anymore.
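
A minimal Python sketch of Alt 2, again assuming zero-argument callables as arguments and ZeroDivisionError as the stand-in for a SPARQL error; the function names are illustrative.

```python
def in_left_to_right(lhs, args):
    for arg in args:
        if lhs == arg():   # an error raised by arg() propagates at once
            return True    # stop at the first match
    return False

def not_in_left_to_right(lhs, args):
    for arg in args:
        if lhs == arg():   # an error raised by arg() propagates at once
            return False   # stop at the first match
    return True

print(in_left_to_right(1, [lambda: 1, lambda: 1 / 0, lambda: 3]))  # True
```

With the arguments reordered as (3, 1/0, 1) the same call raises ZeroDivisionError, because the failing argument is reached before the match.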
>
> I prefer the rewrite to "=" and "||" version.
>
> 	Andy
>

-- 
Steve Harris, Garlik Limited
2 Sheen Road, Richmond, TW9 1AE, UK
+44 20 8973 2465  http://www.garlik.com/
Registered in England and Wales 535 7233 VAT # 849 0517 11
Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10  
9AD

Received on Sunday, 7 February 2010 23:25:31 UTC