Design of SELECT expressions

A while back, Eric and I had a long chat on IRC about the design of SELECT expressions and how it affects the algebra.

We came down to two possibilities:

The first is a do-everything form of "project" (let's call it "select") that takes both variable-expressions and plain variables. The second keeps "project" as just choosing variables from solutions and has a separate operation, called "extend" in the working draft, that deals with adding variables to solutions.  You might think of this "project" as a vertical slice of a (ragged) table and "extend" as adding columns to a table.
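
To make the table picture concrete, here is a rough sketch in Python. It is only an illustration, not text from the working draft: modelling a solution as a dict, the function names, and the choice to leave the variable unbound when the expression fails are all assumptions of the sketch.

```python
# Assumption for this sketch: a solution is a dict from variable name to
# value, and a (ragged) table is a list of such dicts.

def project(variables, solutions):
    """Vertical slice: keep only the named variables in each solution."""
    return [{v: s[v] for v in variables if v in s} for s in solutions]

def extend(var, expr, solutions):
    """Add a column: bind var to expr(solution) in each solution."""
    out = []
    for s in solutions:
        t = dict(s)
        try:
            t[var] = expr(s)
        except (KeyError, TypeError):
            # Assumption: if the expression cannot be evaluated,
            # the new variable is simply left unbound.
            pass
        out.append(t)
    return out

# A ragged table: the second solution has no binding for "b".
solutions = [{"a": 1, "b": 2}, {"a": 3}]
extended = extend("c", lambda s: s["a"] + s["b"], solutions)
result = project(["c"], extended)
```

Here "extended" still carries ?a and ?b, and only the final "project" drops them.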

We couldn’t find an advantage for the do-everything form, and it has the disadvantage of being harder to work with for optimization: it is messy and sub-cases need to be considered. The working draft has the project-extend form outlined.

The translation to the algebra is to find each AS, create an "extend" for it, and put the named variable into the project.

* Example 1

SELECT (?a+?b AS ?c) WHERE { ... }

becomes in the project-extend form:

(project ?c 
   (extend (?c <- ?a + ?b)
      ( ... )
    ))

* Example 2

SELECT ?a , ?b , (?a+?b AS ?c) , (?a*?b AS ?d)  WHERE { ... }

becomes 

(project ?a ?b ?c ?d
   (extend (?d <- ?a * ?b)
      (extend (?c <- ?a + ?b)
        ( ... )
    )))
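
The translation used in Examples 1 and 2 can be sketched as a small Python function. This is only an illustration under assumptions: the function name and the representation of SELECT items (a plain string for a variable, an (expression, variable) pair for an AS) are made up for the sketch, and the algebra is built as a one-line string rather than a real data structure.

```python
def translate_select(items, pattern):
    """Translate SELECT items to the project-extend form.

    items: list of either a variable such as "?a" or a pair
           (expression, variable) standing for (expression AS variable).
    pattern: the algebra for the WHERE clause, as a string.
    (Both representations are assumptions of this sketch.)
    """
    algebra = pattern
    projected = []
    for item in items:  # left to right, so later ASs wrap earlier ones
        if isinstance(item, tuple):
            expr, var = item
            algebra = f"(extend ({var} <- {expr}) {algebra})"
            projected.append(var)
        else:
            projected.append(item)
    return f"(project {' '.join(projected)} {algebra})"

example1 = translate_select([("?a + ?b", "?c")], "( ... )")
example2 = translate_select(
    ["?a", "?b", ("?a + ?b", "?c"), ("?a * ?b", "?d")], "( ... )")
```

Printed on one line, example2 has the same nesting as Example 2: the extend for ?d is outermost and the extend for ?c sits directly on the pattern.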



This also means that, if we want to, we can define the effect of a second use of an AS variable in a SELECT clause, using a left-to-right reading:

SELECT (2+?z AS ?a) , (?y + ?x AS ?z)

This can be done with nesting as:

SELECT ?z , (2+?z AS ?a)  
{
   SELECT (?y + ?x AS ?z) {
     ...pattern...
  }
}

But we can just use the algebra translation to give it meaning.

Note that the outer SELECT needs to project ?z as well as add the AS, because SELECT always projects even when it is extending (which is one reason the do-everything "project" is messy to work with).  This is an easy mistake for application writers to make.
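
A small Python sketch of that mistake, using the nested query above. The dict-based solutions, the helper names and the sample values for ?x and ?y are assumptions made for illustration only.

```python
# Assumption for this sketch: solutions are dicts, tables are lists of dicts.

def extend(var, expr, solutions):
    """Bind var to expr(solution) in every solution."""
    return [{**s, var: expr(s)} for s in solutions]

def project(variables, solutions):
    """Keep only the named variables."""
    return [{v: s[v] for v in variables if v in s} for s in solutions]

pattern = [{"x": 1, "y": 2}]  # made-up result of ...pattern...

# Inner SELECT (?y + ?x AS ?z): only ?z survives the projection.
inner = project(["z"], extend("z", lambda s: s["y"] + s["x"], pattern))

# Outer SELECT ?z , (2+?z AS ?a): ?z is listed, so it is kept.
with_z = project(["z", "a"], extend("a", lambda s: 2 + s["z"], inner))

# Outer SELECT (2+?z AS ?a): ?z is used by the expression but then dropped.
without_z = project(["a"], extend("a", lambda s: 2 + s["z"], inner))
```

In both cases ?a is computed correctly; the only difference is whether ?z is still in the results.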

Aside: how about allowing * in SELECT together with AS?
   SELECT * , (2+?z AS ?a) , (?y + ?x AS ?z)


Which brings me to assignment and the recent comments list discussion.

  ..pattern1..
  LET(?c := ?a+?b)

would be 

 (extend (?c <- ?a + ?b)
   (..pattern1..))

which makes it shorthand for SELECT * , (expr AS ?x), that is, a project-all SELECT.
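
As a sketch of how little the translation involves, LET can be handled with a single "extend" and no "project" at all. The function name and the string representation of the algebra are assumptions made for this illustration, not anything from the working draft or an existing implementation.

```python
def translate_let(var, expr, pattern):
    """Wrap the algebra of the preceding pattern in one extend.

    No project is added, so every variable of the pattern stays visible.
    (Hypothetical helper; the algebra is represented as a plain string.)
    """
    return f"(extend ({var} <- {expr}) {pattern})"

let_algebra = translate_let("?c", "?a + ?b", "(..pattern1..)")
```

The absence of a "project" wrapper is the whole difference from the SELECT expression translation.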

There are now at least four separate implementations that do assignment (ARQ, Glitter, Mulgara and Redland).  The amount of work to add assignment to the language is small, as I hope the discussion above shows: it just happens in the syntax-to-algebra translation and uses operators that exist anyway. I can do the necessary editing work together with SELECT expressions.  Testing is also a small burden, given that SELECT expressions need test cases anyway.

 Andy

Received on Friday, 30 October 2009 12:21:54 UTC