Re: Operational definition of SPARQL algebra

Since I missed the call today, Kendall asked me to respond to
Andy's thread begun in
http://lists.w3.org/Archives/Public/public-rdf-dawg/2006OctDec/0080.html

Unfortunately I cannot respond in detail at this time, but I am
highly supportive of this direction.  I don't particularly care what names
the operators are given, in particular either LeftJoin or Optional or Opt
looks acceptable. 

I particularly like Andy's proposal that the operators manipulate
collections of solutions.  I think that from the spefication standpoint,
it is better to focus on the collection of solutions, rather than the
individual solution.  I think the language "S is a solution of P if ..."
is useful in terms of motivating the definition of the operations,
but I have found it hard to make this style as robust as I would like.
So I think the formal definitions should be couched in terms of
a collection whose members are called solutions.  That is, the
collection is the primary entity of the algebra, not the solution.

I further think that the collection should be a multiset rather than a set,
to cater for duplicates from UNION.  This means that the entire
algebra needs to talk about how many duplicates come out of each
operation, depending on how many duplicates go in as operands.
Or sequence would do as the collection type, which would tie in
with the final stage where the sequence is ordered, etc.

Finally, I hope that we can increase the scope of blank node identifiers
to be larger than just a basic graph pattern or a group graph pattern.
The issue is that with incremental query development, the user may
first construct and validate a query with blank node identifiers, and
then evolve that query using OPTIONAL or other constructs that
insert curly braces. 

My research in my paper "Constructive mapping semantics" posted in
http://lists.w3.org/Archives/Public/public-rdf-dawg/2006JulSep/0144.htm
suggested that this is feasible using mapping semantics.  However,
using entailment, I do not believe that the scope of a blank node identifier
can extend beyond a GRAPH node of a tree representation of a query. 
In terms of block structured query
languages, this means that a GRAPH pattern creates a new lexical context
for blank node identifiers.  See especially section 3 of that paper.

I was in the course of fleshing out that section in more detail, when
I noticed a defect in my work to date, that I have not considered
the possibility of UNION or GRAPH nodes nested inside the
second operand of OPTIONAL.  I believe that this defect can be
overcome, but I have not finished working on that.

Fred

Received on Tuesday, 24 October 2006 17:56:57 UTC