Re: Semantics of SPARQL (Perez, Arenas, and Gutierrez) from Lee Feigenbaum on 2006-10-30 (public-rdf-dawg@w3.org from October to December 2006)

From: Lee Feigenbaum <feigenbl@us.ibm.com>
Date: Mon, 30 Oct 2006 01:26:08 -0500
To: RDF Data Access Working Group <public-rdf-dawg@w3.org>
Message-ID: <OF2DB80CB8.AD4C8CF6-ON85257217.00223963-85257217.00235923@us.ibm.com>
Andy Seaborne wrote on 10/27/2006 11:15:19 AM:

> Lee Feigenbaum wrote:
> > 
> > In the hopes that we might be able to base or at least derive many of 
> > our definitions from the work done in the latest paper from Perez, 
> > Arenas, and Gutierrez, I've gone through the paper and scribbled down 
my 
> > reactions here.
> 
> Thanks for doing that.
> 
> I'm leaning towards the multiset + blank node variation of their 
> formalization.

I would be pleased with that.

...

> > Section 3 - Syntax and Semantics of General Graph Patterns
> > 
> > This section presents the algebraic operations that create new sets of 

> > mappings (solutions) from operand sets of mappings.
> > 
> > In Sec. 3.1, filters are modeled as a binary operator that apply to an 

> > arbitrary graph pattern and a constraint. Our recent discussions apply 
a 
> > FILTER to a group graph pattern. However, a group graph pattern has no 

> > existence within this paper's way of viewing the world. (In this 
paper, 
> > a group graph pattern is simply a set of patterns joined together. We 
> > can view a group graph pattern as an operational tree, (AND(x, AND(y, 
> > z))), then we're suggesting that FILTERs be attached to the node that 
> > represents this tree. I'm not sure how to formally state this 
constraint 
> > without mapping the curly braces in SPARQL syntax { ... } to a unary 
> > operator. Such an operator would have no semantics of its own. That 
is, 
> > if P is a graph pattern than GROUP(P) is a graph pattern and 
> > [[GROUP(P)]](D, G) = [[P]](D, G) (i.e., no semantics of its own). That 

> > would allow #4 in definition 3.3 to say something like if P is a graph 

> > pattern and R a value constraint, then ((GROUP P) FILTER R) is a graph 

> > pattern.
> 
> We could define group as a pair : an operator tree (join and left-join 
> expression in the order given) and a set of filters.  Define GROUP(P, F) 
and 
> define [[GROUP(P,F)]] as P FILTER G.

Yes, this should work well and appears to me to be a clean approach. 

> As this is quite central to SPARQL, it might be better to write all the 
> expressions in functional form: JOIN(P1, P2) etc, not infix, becausethe 
left 
> associativity is explicit.

I'd also be happy with this as it should help elucidate the algebra more 
clearly and will also emphasize the translation from the surface syntax to 
the algebra.
 
...
> > Section 5 - Extensions to the Subgraph Matching Approach
> > 
> > 5.1 extends BGP matching for graph patterns that contain blank nodes 
by 
> > adding an explicit mapping function from blank nodes to RDF terms. 
This 
> > is straightforward, but does not treat blank nodes as pure 
existentials 
> > (if I understand correctly, which I may not).
> 
> If the mu mapping is V and theta is B (variable and blank node), it 
> depends on 
> how you define card(V(B)).
> 
> The "implementation hint" says there is one mapping X covering both 
variables 
> and blank nodes, then the solution is that mapping "restricted to just 
the 
> variables".
> 
> If V is the projection of X on a restricted domain, project(X, 
> dom(P)) then it 
> seems natural that card(V) = card(X) : it's more like B parametrizes V 
and
>    card(X) = card(V)*card(B)
> It's as if the query reveals some of the reason behind the choices of B.
> The projection can be moved out to the projection in SELECT.
> 
> If it is V is a function (set of pairs) of the restriction, then it's
> card(V) = 1 (blank nodes do not contribute to counting).  It's 
> distinct(project(X), dom(P)).
> 
> Which style do you prefer?

I see the distinction; thanks for the explanation. Am I correct that the 
latter interpretation is more in-line with treating blank nodes as pure 
existentials? And that the former is the interpretation given in Def. 5.6 
in the latest Semantics of SPARQL paper?

As for my preference, I think I prefer the latter, but as I've said, I 
don't have strong feelings on the cardinality issue and would gladly hear 
what preferences other people have.

> > 5.2 is a cardinality discussion. I must admit that I have been 
negligent 
> > in following the current cardinality discussion initiated by Fred Z. 
and 
> > continued in this past week's telecon and elsewhere -- in any case, 
> > section 5.2 seems very straightforward to me. In particular, the 
formal 
> > definition of cardinality based on the operations that combine BGPs 
are 
> > straightforward, and I imagine would apply regardless of the 
entailment 
> > regime (as long as the entailment regime defines the proper 
cardinality 
> > of solutions to BGPs).  (The authors acknowledge this later in Remark 
6.3.)
> 
> GRAPH does not appear to introduce duplicates.  The proof of 5.9 does 
not 
> cover this but I think that (informally) the substitution ?X->v labels 
each 
> [[P]] in the union in 3.9/3 keeping them distinct.

That's my take on it as well.

Lee
Received on Monday, 30 October 2006 06:26:20 UTC