- From: Lee Feigenbaum <lee@thefigtrees.net>
- Date: Thu, 08 Jul 2010 11:00:09 -0400
- To: Andy Seaborne <andy.seaborne@talis.com>
- CC: SPARQL Working Group <public-rdf-dawg@w3.org>
Thanks, Andy. This makes it very easy for me to summarize what Glitter
does. Please see below.

On 7/8/2010 10:10 AM, Andy Seaborne wrote:
> In ARQ "LET (?v := expr)" should be read informally as "in any solution,
> ?v must be the value of the expression". The rules for execution are
> based on this and allow for ?v to already be some value, or some value
> later in the query. Read this way, it's not a fixed assignment. It does
> not replace any already-set value with another.
>
>
> The execution rules are for a simple entailment graph:
>
> 1/ If the variable is unbound, and the expression evaluates, the
> variable is bound to the value.
> 2/ If the variable is bound to the same value as the expression
> evaluates to, nothing happens and the query continues.
> 3/ If the variable is bound to a different value than the expression
> evaluates to, an error occurs and the current solution will be excluded
> from the results.
> 4/ If the expression does not evaluate (e.g. unbound variable in the
> expression), no assignment occurs and the query continues.

Glitter shares rules 1-3. Regarding rule 4, an error in the LET
expression currently causes an error in the whole query, but this is not
by design. I would prefer a design that either shares ARQ's rule 4 or
that discards the solution.

The way I think about this myself (but I believe it is the same) is:

* Solve a group - this gives me a solution set
* For each solution S and assignment (V := E), evaluate E with S as the
  environment, and then join S with (V -> eval(E, S)).

> Rule 1 is the case of simply adding a column to the results - the direct
> way to meet the criteria of having the required value.
>
> Rules 2 & 3 deal with the case of the binding already being defined;
> rule 3 stops the replacing of one value by another.
>
> Rule 4 is the error case.
>
> Note that "same value" here means the same as applies to graph pattern
> matching, for whatever the capabilities of the engine are, not to FILTER
> expressions, so for a simple entailment graph, that means "same term".
> For an engine providing understanding of numbers, it means "=".
>
> The rules mean that there is some order independence, and hence it's
> like (as I understand it) what Glitter does by considering all LETs to
> happen logically at the end of a BGP, after pattern matching and before
> FILTERs. This maximises the variables in scope.
>
> Assuming :p and :q always have a numeric value and execution is naively
> in the order the pattern is written:
>
> { ?s :p ?o . LET (?o1 := ?o+1) . ?s :q ?o1 }
>
> { ?s :p ?o . ?s :q ?o1 . LET (?o1 := ?o+1) }
>
> have the same effect. Because of rules 2&3 and because ?o1 is used in a
> pattern as well, it is also the same effect as:
>
> { ?s :p ?o . ?s :q ?o1 . FILTER (sameTerm(?o1,?o+1)) }

This is all true for me also.

> where setting ?o1 is driven from the BGP matching. This happens to give
> optimizers some opportunities. Related: several systems optimize {?s :p
> ?x FILTER(?x=<uri>) } by executing {?s :p <uri> } and adding ?x = <uri>
> to the results.
>
>
> We need to decide the best choice for the feature in SPARQL 1.1 - I'm
> not suggesting that the rules above are necessarily the best or only
> choice.
>
> I'm not sure (4) is the best choice - it may be more consistent to be
> like a FILTER and eliminate the row like FILTER(error) does. I chose it
> this way so that a mistake only caused an empty cell, not the loss of a
> row, which makes debugging easier.
>
>
> The most common use case that I've seen is to introduce a new variable
> and assign a value to it, as part of the overall results. In that way,
> it's like SELECT expressions but less cumbersome. Users seem to find it
> natural to say "and also put in a column for ?x where the value ...".
> In this case, the variable introduced isn't used again, and a syntactic
> rule that the variable must not have been mentioned yet makes it exactly
> like SELECT expressions (which do have that rule as a static syntax
> condition, not on a per-row basis).
>
> Another use case is calculating an intermediate value that is
> used several times elsewhere in the query, maybe in a FILTER twice
> (a simple matter of it being easier to write the expression once).

For me, a common use case is discriminated unions:

{
  ...
  { ... ?x ... LET (?branch := "foo") }
  UNION
  { ... ?x ... LET (?branch := "bar") }
  UNION
  { ... ?x ... LET (?branch := "baz") }
  ...
}

Lee

>
> Andy
>
> PS As far as I can see, SET is the word more generally used in SQL, not
> LET. It's used for several things:
>
> Server settings:
> http://www.postgresql.org/docs/8.1/interactive/sql-set.html
> http://dev.mysql.com/doc/refman/5.1/en/set-option.html
>
> Stored procedure/user variables:
> http://dev.mysql.com/doc/refman/5.1/en/set-statement.html
> http://msdn.microsoft.com/en-us/library/aa259193%28SQL.80%29.aspx
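
For readers who want to experiment with the four execution rules quoted
above, here is a minimal Python sketch. It is not ARQ or Glitter code.
Everything in it is an assumption made for illustration: solutions are
modelled as plain dicts, an expression is any callable that raises when it
cannot be evaluated, "same value" is approximated by Python's ==, and the
names let, EVAL_ERRORS and drop_on_error are hypothetical. The
drop_on_error flag models the alternative discussed for rule 4, where an
expression error discards the solution instead of leaving it unchanged.

```python
from typing import Callable, Dict, Iterable, List

Solution = Dict[str, object]          # variable name -> bound value
Expr = Callable[[Solution], object]   # raises if it cannot be evaluated

# Exceptions treated as "the expression does not evaluate" (an assumption).
EVAL_ERRORS = (KeyError, TypeError, ArithmeticError)


def let(solutions: Iterable[Solution], var: str, expr: Expr,
        drop_on_error: bool = False) -> List[Solution]:
    """Apply LET (?var := expr) to every solution, following rules 1-4."""
    out: List[Solution] = []
    for s in solutions:
        try:
            value = expr(s)
        except EVAL_ERRORS:
            # Rule 4: no assignment occurs and the solution is kept as is,
            # unless the caller asked for FILTER-like elimination.
            if not drop_on_error:
                out.append(dict(s))
            continue
        if var not in s:
            # Rule 1: unbound variable, so bind it to the value.
            extended = dict(s)
            extended[var] = value
            out.append(extended)
        elif s[var] == value:
            # Rule 2: already bound to the same value, nothing changes.
            out.append(dict(s))
        # Rule 3: bound to a different value, so the solution is excluded.
    return out


rows = [
    {"o": 1},             # rule 1 applies
    {"o": 1, "o1": 2},    # rule 2 applies
    {"o": 1, "o1": 5},    # rule 3 applies
    {"s": "x"},           # rule 4 applies: ?o is unbound
]
result = let(rows, "o1", lambda s: s["o"] + 1)
strict = let(rows, "o1", lambda s: s["o"] + 1, drop_on_error=True)
```

With the default flag, the fourth row survives without a value for ?o1;
with drop_on_error=True it is removed. Because a bound ?o1 is only ever
checked and never overwritten, applying let before or after the step that
binds ?o1 yields the same solutions, which is the order independence
described in the quoted text.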
Received on Thursday, 8 July 2010 15:00:46 UTC