- From: Seaborne, Andy <andy.seaborne@hp.com>
- Date: Fri, 27 Oct 2006 16:15:19 +0100
- To: Lee Feigenbaum <feigenbl@us.ibm.com>
- CC: RDF Data Access Working Group <public-rdf-dawg@w3.org>
Lee Feigenbaum wrote:
>
> In the hopes that we might be able to base or at least derive many of
> our definitions from the work done in the latest paper from Perez,
> Arenas, and Gutierrez, I've gone through the paper and scribbled down my
> reactions here.
Thanks for doing that.
I'm leaning towards the multiset + blank node variation of their formalization.
>
> In summary: I think the formal definitions within this paper are clear
> and concise (and as best as I can tell from my examination, accurate).
> The paper deals with solutions to BGPs based on subgraph matching, and
> hence those definitions are likely not appropriate for the SPARQL QL
> specification. That said, those definitions are orthogonal to the
> definitions of the algebraic operations that combine sets of solutions,
> and I would be in favor of modeling the SPARQL QL specification's
> definitions of the operators on this work.
BGP matching is a nice formalization of the implementation : their
formalization uses two mappings, one for blank nodes, then one for named
variables.
...
> Section 3 - Syntax and Semantics of General Graph Patterns
>
> This section presents the algebraic operations that create new sets of
> mappings (solutions) from operand sets of mappings.
>
> In Sec. 3.1, filters are modeled as a binary operator that apply to an
> arbitrary graph pattern and a constraint. Our recent discussions apply a
> FILTER to a group graph pattern. However, a group graph pattern has no
> existence within this paper's way of viewing the world. (In this paper,
> a group graph pattern is simply a set of patterns joined together. We
> can view a group graph pattern as an operational tree, (AND(x, AND(y,
> z))), then we're suggesting that FILTERs be attached to the node that
> represents this tree. I'm not sure how to formally state this constraint
> without mapping the curly braces in SPARQL syntax { ... } to a unary
> operator. Such an operator would have no semantics of its own. That is,
> if P is a graph pattern than GROUP(P) is a graph pattern and
> [[GROUP(P)]](D, G) = [[P]](D, G) (i.e., no semantics of its own). That
> would allow #4 in definition 3.3 to say something like if P is a graph
> pattern and R a value constraint, then ((GROUP P) FILTER R) is a graph
> pattern.
We could define group as a pair : an operator tree (join and left-join
expression in the order given) and a set of filters. Define GROUP(P, F) and
define [[GROUP(P,F)]] as P FILTER G.
As this is quite central to SPARQL, it might be better to write all the
expressions in functional form: JOIN(P1, P2) etc, not infix, because the left
associativity is explicit.
>
> Alternatively, these definitions could be used (defined in terms of a
> FILTER applied to any graph pattern), and the scoping to a group could
> be done entirely via prose, but after spending a few minutes thinking
> about it, I'm having trouble figuring out a precise way to state that in
> a manner that directly relates back to the algebra.
>
>
...
> Section 5 - Extensions to the Subgraph Matching Approach
>
> 5.1 extends BGP matching for graph patterns that contain blank nodes by
> adding an explicit mapping function from blank nodes to RDF terms. This
> is straightforward, but does not treat blank nodes as pure existentials
> (if I understand correctly, which I may not).
If the mu mapping is V and theta is B (variable and blank node), it depends on
how you define card(V(B)).
The "implementation hint" says there is one mapping X covering both variables
and blank nodes, then the solution is that mapping "restricted to just the
variables".
If V is the projection of X on a restricted domain, project(X, dom(P)) then it
seems natural that card(V) = card(X) : it's more like B parametrizes V and
card(X) = card(V)*card(B)
It's as if the query reveals some of the reason behind the choices of B.
The projection can be moved out to the projection in SELECT.
If it is V is a function (set of pairs) of the restriction, then it's
card(V) = 1 (blank nodes do not contribute to counting). It's
distinct(project(X), dom(P)).
Which style do you prefer?
>
> 5.2 is a cardinality discussion. I must admit that I have been negligent
> in following the current cardinality discussion initiated by Fred Z. and
> continued in this past week's telecon and elsewhere -- in any case,
> section 5.2 seems very straightforward to me. In particular, the formal
> definition of cardinality based on the operations that combine BGPs are
> straightforward, and I imagine would apply regardless of the entailment
> regime (as long as the entailment regime defines the proper cardinality
> of solutions to BGPs). (The authors acknowledge this later in Remark 6.3.)
GRAPH does not appear to introduce duplicates. The proof of 5.9 does not
cover this but I think that (informally) the substitution ?X->v labels each
[[P]] in the union in 3.9/3 keeping them distinct.
Andy
>
> Lee
Received on Friday, 27 October 2006 15:15:39 UTC