- From: Sergio Tessaris <tessaris@inf.unibz.it>
- Date: Mon, 2 Jan 2006 16:34:11 +0100
- To: RDF Data Access Working Group <public-rdf-dawg@w3.org>
I'd like to add some comments to Enrico's proposal to explain the differences and similarity with Pat proposal in <http://lists.w3.org/ Archives/Public/public-rdf-dawg/2005OctDec/0340>. For the sake of clarity I wont discuss the "told bnodes" issue; anyway, as Pat pointed out it can be added "for free" (it may impact syntax and protocol, though). Pattern Solutions ----------------- Pat's and our definitions disagree on the restriction to be imposed to the domain of pattern solutions. In Pat's definition the range is restricted to terms appearing in the dataset (active domain), in our definition this restriction is in place for bnode names only. For simple entailment the two definitions coincide; the difference shows up with RDF(S) entailment. It boils down to whether axiomatic and "implicit" triples involving RDF(S) vocabulary show up in variable bindings (see <http://lists.w3.org/Archives/Public/public- rdf-dawg/2005OctDec/0388> thread for more info). I'd like to add that, if the more restrictive approach is going to be chosen, systems that automatically perform an RDF(S) completion of the graphs should make extra checks on the results before returning answers to be sure that they're not returning bindings with RDF(S) vocabulary terms not present in the original graph. Basic Graph Matching -------------------- Pat proposal boils down to two candidate approaches: 2.5 Basic Graph Matching. A basic graph pattern P matches on graph G with solution S when S is a pattern solution on P to G such that a) G (simply) entails S(P) b) G (simply) entails (G union S(M(P))) where M is a 1:1 mapping on blank nodes which replaces each bnode in P with a new blank node which does not occur in P or G In the definition I removed any reference to told (or persistent) bnodes. As already pointed out in the mailing list, the option a) has some counterintuitive implications that, in my opinion, make it unsuitable for a proper formalisation of SPARQL. Consider the following example: Data: _:a :p _:b _:b :url <http://example.org> Query: SELECT * { ?x ?p ?y } Answer set with option a): ?x/_:a ?p/:p ?y/_:b ?x/_:b ?p/:p ?y/_:a ?x/_:b ?p/:url ?y/<http://example.org> ?x/_:a ?p/:url ?y/<http://example.org> ?x/_:b ?p/:url ?y/_:a ?x/_:a ?p/:url ?y/_:b The problem is that, even if the bnode names are taken from the graph, their names in the left and right hand sides of the entailment don't matter. Option b) doesn't have this problem because G is considered together with the BGP on the right hand side. Roughly speaking, in option b) reusing a bnode in the "wrong" place would create an additional co- reference not present in the original graph. I don't think we have really two options, since I don't see why anybody would consider option a) behaviour the "right" one. Note that the fact that this has nothing to do with the told bnodes issue, which is somewhat orthogonal. Aside, with the b) approach told bnodes come for free. Now, our definition is slightly different (see Enrico's <http:// lists.w3.org/Archives/Public/public-rdf-dawg/2006JanMar/0000>), being (in our opinion) the simpler: G (simply) entails S(G RDFmerge P) Apparently, it's different from Pat's G (simply) entails (G union S(M(P))) but in fact they're the same. To see it, consider that (G union S(M (P))) is (syntactically) equal to S(G union M(P)) because G doesn't contain any variable. Moreover, (G union M(P)) is equivalent (up to bnodes renaming) to (G RDFmerge P) because of the definition of RDF merge. If we're not interested in bnode persistency, then the RDF merge as defined in [RDF-MT] is enough (there's a minor detail in the fact that P contains variable nodes). Our definition of RDF merge takes into account that bnodes in the dataset shouldn't be changed. I hope that this email clarifies our position on the BGP matching issue. Cheers, --sergio
Received on Monday, 2 January 2006 15:34:34 UTC