ARQ implementation of LET

In ARQ "LET (?v := expr)" should be read informally as "in any solution, 
?v must be the value of the expression".  The rules for execution are 
based on this and allow for ?v to already have a value, or to be given 
one later in the query.  Read this way, it's not an assignment in the 
usual sense: it does not replace any already-set value with another.


The execution rules are for a simple entailment graph:

1/ If the variable is unbound, and the expression evaluates, the 
variable is bound to the value.
2/ If the variable is bound to the same value as the expression 
evaluates to, nothing happens and the query continues.
3/ If the variable is bound to a different value from the one the 
expression evaluates to, an error occurs and the current solution is 
excluded from the results.
4/ If the expression does not evaluate (e.g. unbound variable in the 
expression), no assignment occurs and the query continues.

Rule 1 is the case of simply adding a column to the results - the direct 
way to meet the criteria of having the required value.

Rules 2 & 3 deal with the case of the binding already being defined; 
rule 3 stops the replacing of one value by another.

Rule 4 is the error case.
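
As a rough illustration only (a Python sketch, not ARQ's code), the four 
rules could be modelled as below.  The names apply_let, Binding and Expr 
are made up for the example, Python's == stands in for "same value", and 
any exception raised by the expression is treated as "does not evaluate".

```python
from typing import Callable, Dict, Optional

# Hypothetical types for the sketch: one solution row and an expression.
Binding = Dict[str, object]          # variable name -> value
Expr = Callable[[Binding], object]   # raises if it cannot be evaluated


def apply_let(row: Binding, var: str, expr: Expr) -> Optional[Binding]:
    """Apply LET (?var := expr) to one row; None means the row is excluded."""
    try:
        value = expr(row)
    except Exception:
        # Rule 4: the expression does not evaluate; no assignment, row kept.
        return row
    if var not in row:
        # Rule 1: unbound variable is bound to the value.
        extended = dict(row)
        extended[var] = value
        return extended
    if row[var] == value:
        # Rule 2: already bound to the same value; nothing happens.
        return row
    # Rule 3: bound to a different value; the solution is excluded.
    return None
```

With add_one = lambda r: r["o"] + 1, a row {"o": 1} gains "o1": 2 
(rule 1), a row {"o": 1, "o1": 5} is excluded (rule 3), and a row with 
no "o" passes through unchanged (rule 4).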

Note that "same value" here means the same as applies to graph pattern 
matching, for whatever the capabilities of the engine are, not to FILTER 
expressions, so for a simple entailment graph, that means "same term". 
For an engine providing understanding of numbers, it means "=".
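
To make the difference concrete, here is a small Python sketch of the 
two notions of "same value".  The Literal class, same_term and 
value_equal are illustrative names of my own, and only xsd:integer and 
xsd:decimal are treated as numeric, which is a simplification.

```python
from dataclasses import dataclass
from decimal import Decimal

XSD = "http://www.w3.org/2001/XMLSchema#"
# Simplifying assumption: only these two datatypes count as numeric here.
NUMERIC = {XSD + "integer", XSD + "decimal"}


@dataclass(frozen=True)
class Literal:
    lexical: str    # lexical form exactly as written
    datatype: str   # datatype IRI


def same_term(a: Literal, b: Literal) -> bool:
    """Simple entailment: identical lexical form and identical datatype."""
    return a.lexical == b.lexical and a.datatype == b.datatype


def value_equal(a: Literal, b: Literal) -> bool:
    """Number-aware engine: compare numeric values, else fall back to terms."""
    if a.datatype in NUMERIC and b.datatype in NUMERIC:
        return Decimal(a.lexical) == Decimal(b.lexical)
    return same_term(a, b)


one = Literal("1", XSD + "integer")
padded_one = Literal("01", XSD + "integer")
decimal_one = Literal("1.0", XSD + "decimal")
```

Here one and padded_one are different terms but the same number, so 
under rules 2 and 3 a simple entailment engine would exclude the 
solution while a number-aware engine would keep it.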

The rules mean that there is some order independence, and hence it's 
like (as I understand it) what Glitter does by considering all LETs to 
happen logically at the end of a BGP, after pattern matching and before 
FILTERs.  This maximises the variables in scope.

Assuming :p and :q always have a numeric value and execution is naively 
in the order the pattern is written:

{ ?s :p ?o . LET (?o1 := ?o+1) . ?s :q ?o1 }

{ ?s :p ?o . ?s :q ?o1 . LET (?o1 := ?o+1) }

have the same effect.  Because of rules 2&3 and because ?o1 is used in a 
pattern as well, it is also the same effect as:

{ ?s :p ?o . ?s :q ?o1 . FILTER (sameTerm(?o1, ?o+1)) }

where setting ?o1 is driven from the BGP matching.  This happens to give 
optimizers some opportunities.  Related: several systems optimize 
{ ?s :p ?x FILTER(?x = <uri>) } by executing { ?s :p <uri> } and adding 
?x = <uri> to the results.
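
The equivalence of the two LET orderings and the FILTER form can be 
checked on toy data.  This is a Python sketch under heavy assumptions: 
the triples are invented, objects are plain ints, == stands in for 
sameTerm, and match, let and keep_if are hypothetical helpers, not 
anything from ARQ.

```python
# Invented toy data: (subject, predicate, object) with int objects.
TRIPLES = [
    ("a", "p", 1), ("a", "q", 2),   # q value is p value + 1
    ("b", "p", 5), ("b", "q", 9),   # q value is not p value + 1
]


def match(rows, s_var, pred, o_var):
    """Extend each row with every `pred` triple compatible with its bindings."""
    out = []
    for row in rows:
        for s, p, o in TRIPLES:
            if p != pred:
                continue
            # An unbound variable is compatible with anything.
            if row.get(s_var, s) != s or row.get(o_var, o) != o:
                continue
            out.append({**row, s_var: s, o_var: o})
    return out


def let(rows, var, expr):
    """Apply the four LET rules to every row."""
    out = []
    for row in rows:
        try:
            value = expr(row)
        except KeyError:
            out.append(row)                  # rule 4: keep row unchanged
            continue
        if var not in row:
            out.append({**row, var: value})  # rule 1: bind
        elif row[var] == value:
            out.append(row)                  # rule 2: same value, keep
        # rule 3: different value, row dropped
    return out


def keep_if(rows, test):
    """FILTER: keep only rows for which the test holds."""
    return [row for row in rows if test(row)]


plus_one = lambda r: r["o"] + 1
start = [{}]

# { ?s :p ?o . LET (?o1 := ?o+1) . ?s :q ?o1 }
let_first = match(let(match(start, "s", "p", "o"), "o1", plus_one), "s", "q", "o1")
# { ?s :p ?o . ?s :q ?o1 . LET (?o1 := ?o+1) }
let_last = let(match(match(start, "s", "p", "o"), "s", "q", "o1"), "o1", plus_one)
# { ?s :p ?o . ?s :q ?o1 . FILTER (sameTerm(?o1, ?o+1)) }
as_filter = keep_if(match(match(start, "s", "p", "o"), "s", "q", "o1"),
                    lambda r: r["o1"] == r["o"] + 1)
```

All three evaluation orders leave only the solution for subject "a"; 
the one for "b" is removed by rule 3, by the later pattern failing to 
match, or by the FILTER.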


We need to decide the best choice for the feature in SPARQL 1.1 - I'm 
not suggesting that the rules above are necessarily the best or only choice.

I'm not sure (4) is the best choice - it may be more consistent to be 
like a FILTER and eliminate the row, as FILTER(error) does.  I chose 
it this way so that a mistake only causes an empty cell, not the loss of 
a row, which makes debugging easier.
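
The two options for (4) differ only in what happens to a row whose 
expression fails.  A Python sketch, where let_with_policy and its 
drop_on_error flag are hypothetical names for the illustration:

```python
def let_with_policy(rows, var, expr, drop_on_error=False):
    """LET over rows; drop_on_error=True mimics FILTER(error) behaviour."""
    out = []
    for row in rows:
        try:
            value = expr(row)
        except KeyError:
            if not drop_on_error:
                out.append(row)   # current rule 4: row kept, cell empty
            continue              # FILTER-like alternative: row lost
        if var not in row:
            out.append({**row, var: value})
        elif row[var] == value:
            out.append(row)
    return out


rows = [{"s": "a", "o": 1}, {"s": "b"}]   # second row has no ?o
plus_one = lambda r: r["o"] + 1

kept = let_with_policy(rows, "o1", plus_one)
dropped = let_with_policy(rows, "o1", plus_one, drop_on_error=True)
```

With the current rule the row for "b" is still in the results, just 
without a value for o1; with the FILTER-like alternative it is gone.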


The most common use case that I've seen is to introduce a new variable 
and assign a value to it, as part of the overall results.  In that way, 
it's like SELECT expressions but less cumbersome.  Users seem to find it 
natural to say "and also put in a column for ?x where the value ...". 
In this case, the variable introduced isn't used again, and a syntactic 
rule that the variable must not have been mentioned yet makes it exactly 
like SELECT expressions (which do have that rule as a static syntax 
condition, not on a per-row basis).

Another use case is calculating an intermediate value that is used 
several times elsewhere in the query, for example in two FILTERs 
(a simple matter of it being easier to write the expression once).


 Andy

PS As far as I can see, SET is the word more generally used in SQL, not 
LET.  It's used for several things:

Server settings:
http://www.postgresql.org/docs/8.1/interactive/sql-set.html
http://dev.mysql.com/doc/refman/5.1/en/set-option.html

Stored procedure/user variables:
http://dev.mysql.com/doc/refman/5.1/en/set-statement.html
http://msdn.microsoft.com/en-us/library/aa259193%28SQL.80%29.aspx

Received on Thursday, 8 July 2010 14:11:23 UTC