- From: Seaborne, Andy <andy.seaborne@hp.com>
- Date: Fri, 27 Oct 2006 16:15:19 +0100
- To: Lee Feigenbaum <feigenbl@us.ibm.com>
- CC: RDF Data Access Working Group <public-rdf-dawg@w3.org>
Lee Feigenbaum wrote: > > In the hopes that we might be able to base or at least derive many of > our definitions from the work done in the latest paper from Perez, > Arenas, and Gutierrez, I've gone through the paper and scribbled down my > reactions here. Thanks for doing that. I'm leaning towards the multiset + blank node variation of their formalization. > > In summary: I think the formal definitions within this paper are clear > and concise (and as best as I can tell from my examination, accurate). > The paper deals with solutions to BGPs based on subgraph matching, and > hence those definitions are likely not appropriate for the SPARQL QL > specification. That said, those definitions are orthogonal to the > definitions of the algebraic operations that combine sets of solutions, > and I would be in favor of modeling the SPARQL QL specification's > definitions of the operators on this work. BGP matching is a nice formalization of the implementation : their formalization uses two mappings, one for blank nodes, then one for named variables. ... > Section 3 - Syntax and Semantics of General Graph Patterns > > This section presents the algebraic operations that create new sets of > mappings (solutions) from operand sets of mappings. > > In Sec. 3.1, filters are modeled as a binary operator that apply to an > arbitrary graph pattern and a constraint. Our recent discussions apply a > FILTER to a group graph pattern. However, a group graph pattern has no > existence within this paper's way of viewing the world. (In this paper, > a group graph pattern is simply a set of patterns joined together. We > can view a group graph pattern as an operational tree, (AND(x, AND(y, > z))), then we're suggesting that FILTERs be attached to the node that > represents this tree. I'm not sure how to formally state this constraint > without mapping the curly braces in SPARQL syntax { ... } to a unary > operator. Such an operator would have no semantics of its own. That is, > if P is a graph pattern than GROUP(P) is a graph pattern and > [[GROUP(P)]](D, G) = [[P]](D, G) (i.e., no semantics of its own). That > would allow #4 in definition 3.3 to say something like if P is a graph > pattern and R a value constraint, then ((GROUP P) FILTER R) is a graph > pattern. We could define group as a pair : an operator tree (join and left-join expression in the order given) and a set of filters. Define GROUP(P, F) and define [[GROUP(P,F)]] as P FILTER G. As this is quite central to SPARQL, it might be better to write all the expressions in functional form: JOIN(P1, P2) etc, not infix, because the left associativity is explicit. > > Alternatively, these definitions could be used (defined in terms of a > FILTER applied to any graph pattern), and the scoping to a group could > be done entirely via prose, but after spending a few minutes thinking > about it, I'm having trouble figuring out a precise way to state that in > a manner that directly relates back to the algebra. > > ... > Section 5 - Extensions to the Subgraph Matching Approach > > 5.1 extends BGP matching for graph patterns that contain blank nodes by > adding an explicit mapping function from blank nodes to RDF terms. This > is straightforward, but does not treat blank nodes as pure existentials > (if I understand correctly, which I may not). If the mu mapping is V and theta is B (variable and blank node), it depends on how you define card(V(B)). The "implementation hint" says there is one mapping X covering both variables and blank nodes, then the solution is that mapping "restricted to just the variables". If V is the projection of X on a restricted domain, project(X, dom(P)) then it seems natural that card(V) = card(X) : it's more like B parametrizes V and card(X) = card(V)*card(B) It's as if the query reveals some of the reason behind the choices of B. The projection can be moved out to the projection in SELECT. If it is V is a function (set of pairs) of the restriction, then it's card(V) = 1 (blank nodes do not contribute to counting). It's distinct(project(X), dom(P)). Which style do you prefer? > > 5.2 is a cardinality discussion. I must admit that I have been negligent > in following the current cardinality discussion initiated by Fred Z. and > continued in this past week's telecon and elsewhere -- in any case, > section 5.2 seems very straightforward to me. In particular, the formal > definition of cardinality based on the operations that combine BGPs are > straightforward, and I imagine would apply regardless of the entailment > regime (as long as the entailment regime defines the proper cardinality > of solutions to BGPs). (The authors acknowledge this later in Remark 6.3.) GRAPH does not appear to introduce duplicates. The proof of 5.9 does not cover this but I think that (informally) the substitution ?X->v labels each [[P]] in the union in 3.9/3 keeping them distinct. Andy > > Lee
Received on Friday, 27 October 2006 15:15:39 UTC