- From: Lee Feigenbaum <feigenbl@us.ibm.com>
- Date: Thu, 26 Oct 2006 15:42:40 -0400
- To: RDF Data Access Working Group <public-rdf-dawg@w3.org>
- Message-ID: <OF59BB689C.89ECA77D-ON85257213.0063C75E-85257213.006C4653@us.ibm.com>
In the hopes that we might be able to base or at least derive many of our definitions from the work done in the latest paper from Perez, Arenas, and Gutierrez, I've gone through the paper and scribbled down my reactions here. In summary: I think the formal definitions within this paper are clear and concise (and as best as I can tell from my examination, accurate). The paper deals with solutions to BGPs based on subgraph matching, and hence those definitions are likely not appropriate for the SPARQL QL specification. That said, those definitions are orthogonal to the definitions of the algebraic operations that combine sets of solutions, and I would be in favor of modeling the SPARQL QL specification's definitions of the operators on this work. More specific comments and notices follow: Section1 - Preliminary Notions The definitions in this section are straightforward and easy to understand. The main difference from the rq24 draft is that the paper defines a mapping (a.k.a pattern solution in rq24) as a partial function, whereas rq24 specifies that the domain of a solution is all variables. Section 2 - Syntax and Semantics of Basic Graph Patterns This section defines basic graph pattern matching as subgraph matching. It does not take into account blank nodes in triple patterns. It is clear to point out that the solution of an empty BGP is a single solution with no bindings. These definitions give a clear and concise definition of BGP-matching based on subgraph matching in the absence of blank nodes within triple patterns. Section 3 - Syntax and Semantics of General Graph Patterns This section presents the algebraic operations that create new sets of mappings (solutions) from operand sets of mappings. In Sec. 3.1, filters are modeled as a binary operator that apply to an arbitrary graph pattern and a constraint. Our recent discussions apply a FILTER to a group graph pattern. However, a group graph pattern has no existence within this paper's way of viewing the world. (In this paper, a group graph pattern is simply a set of patterns joined together. We can view a group graph pattern as an operational tree, (AND(x, AND(y, z))), then we're suggesting that FILTERs be attached to the node that represents this tree. I'm not sure how to formally state this constraint without mapping the curly braces in SPARQL syntax { ... } to a unary operator. Such an operator would have no semantics of its own. That is, if P is a graph pattern than GROUP(P) is a graph pattern and [[GROUP(P)]](D, G) = [[P]](D, G) (i.e., no semantics of its own). That would allow #4 in definition 3.3 to say something like if P is a graph pattern and R a value constraint, then ((GROUP P) FILTER R) is a graph pattern. Alternatively, these definitions could be used (defined in terms of a FILTER applied to any graph pattern), and the scoping to a group could be done entirely via prose, but after spending a few minutes thinking about it, I'm having trouble figuring out a precise way to state that in a manner that directly relates back to the algebra. Def. 3.4 defines how solutions satisfy constraints. If we adopt this definition (which is again a clear and consise definition), then we'd need to reference the full semantics of Section 11 to define truth for atomic value constraints. I am a very big fan of definitions 3.5, 3.7, and 3.9. They pretty closely match how my implementation composes solutions, and provide useful terminology (e.g. "compatible mappings") to explain and discuss the algebra. (As an aside which I'll send another note about, I agree with the 4th part of definition 3.9, and it's why I think the filter-scope-3 and filter-scope-4 tests are good tests, but I'll send another note about that.) Section 4 - Semantics of Query Result Forms These definitions are fine and seem accurate, but: a) I haven't seen nearly as many calls for more formal definitions of result forms, so I'd be less inclined to adopt these (though I don't see anything wrong with them on the surface) b) They don't include ASK (which would be trivial to add) or DESCRIBE (which can't be added since the semantics are purposefully vague) Section 5 - Extensions to the Subgraph Matching Approach 5.1 extends BGP matching for graph patterns that contain blank nodes by adding an explicit mapping function from blank nodes to RDF terms. This is straightforward, but does not treat blank nodes as pure existentials (if I understand correctly, which I may not). 5.2 is a cardinality discussion. I must admit that I have been negligent in following the current cardinality discussion initiated by Fred Z. and continued in this past week's telecon and elsewhere -- in any case, section 5.2 seems very straightforward to me. In particular, the formal definition of cardinality based on the operations that combine BGPs are straightforward, and I imagine would apply regardless of the entailment regime (as long as the entailment regime defines the proper cardinality of solutions to BGPs). (The authors acknowledge this later in Remark 6.3.) Lee
Received on Thursday, 26 October 2006 19:42:55 UTC