From: Pat Hayes <phayes@ihmc.us>

Date: Wed, 9 Aug 2006 23:25:33 -0700

Message-Id: <p06230926c10052b8b624@[192.168.1.6]>

To: Enrico Franconi <franconi@inf.unibz.it>, RDF Data Access Working Group <public-rdf-dawg@w3.org>

Date: Wed, 9 Aug 2006 23:25:33 -0700

Message-Id: <p06230926c10052b8b624@[192.168.1.6]>

To: Enrico Franconi <franconi@inf.unibz.it>, RDF Data Access Working Group <public-rdf-dawg@w3.org>

>Basically the definition of "Basic Graph Pattern >E-matching" ><http://www.w3.org/2001/sw/DataAccess/rq23/rq24.html#BGPgeneral> >is not equivalent to the expected implementation >as described in ><http://www.w3.org/2001/sw/DataAccess/rq23/rq24.html#BGPsparql>. Actually I think it is, but the issue still needs to be discussed. See end of this message for details. >This means that the definition of "Basic Graph >Pattern E-matching" has to be changed. Or, maybe, play a different, and weaker, role? Please, allow me to make the case for this proposal before dismissing it out of hand. First, it is clear that the implementations use an essentially syntactic criterion for matching, rather than any kind of entailment relationship. This syntactic process is easy to describe, relatively easy to implement, and easy to understand. It can be defined quite precisely in terms of an algorithm. All of these (and other) considerations motivate using this as the *normative defining* criterion in the spec for what constitutes an answer set. However, we need to relate it to the semantics. For simple entailment, this algorithmic definition bears a simple relationship to simple entailment which is easy to establish and is closely related to the core of the current semantic definition: the conjunction of the scoping graph and all answer instances of the query pattern is simply entailed by the KB. This semantic condition then becomes a *necessary but possibly not sufficient* condition on an answer set; that is, it does not *define* the answer set, but merely imposes a semantic constraint upon it. The answer set may be *defined* in some other way - for current purposes, by a matching algorithm - and the spec should then show that this definition does indeed conform to the semantic constraint. The general form of a query semantics in the SPARQL spec then would be that any extended notion of a query answer must (1) be well-defined, *somehow* - we need not say how - and (2) satisfy the general semantic constraint, which is like the above but with a different notion of 'entailment' inserted. This leaves the door open for a wide range of options, including Bijan's suggested restriction of disallowing bnode bindings in the DL cases (although I will continue to argue against this on different grounds, and on a different thread) but also many others, since by demoting the semantic machinery from a parameterized *definition* to a parameterized *constraint*, it allows a wider variety of ways to define an answer set for a particular regime. A possible objection is that this means that the choice of an entailment does not, in and of itself, specify a version of SPARQL. True; but we are already in that situation, since the current definition allows the same entailment to be used with different notions of scoping set: and we recently saw that this could produce misunderstandings about exactly what a particular entailment regime would, er, entail for the definition of 'answer'. And in this suggested framework, at least the actual semantic definitions would have only the single parameter of an entailment type; other differences would be viewed (IMO, correctly) as not changing the semantics but rather being differences in how best to computer answer sets which are semantically appropriate to the entailment regime in use. And finally, this would allow us to move forward with minimal changes, preserving the essential semantic foundations, without needing to wait on getting what seems to be quite tricky, and new, science done. >Jorge Pérez wrote: > >>In the following I will write blank nodes in uppercase and >>uris in lowercase (just to simplify notation). >> >>Consider the graph >> >>G = { (a, p, a), (X, p, Y) } >> >>and the basic graph pattern query >> >>Q = { (?q, p, ?q) }. >> >>I'm considering simple entailment and Q does not have any blank node, so >>we can choose G itself to be the scoping graph and the terms of G to be >>the scoping set, and then any replacement of the single variable in Q >>results in a well formed RDF graph for simple entailment. So applying the >>definitions in the current specification of SPARQL, one obtains that a >>mapping S is a solution if the graph >> >>{ (a, p, a), (X, p, Y), (S(?q), p, S(?q)) } >> >>is entailed (simple entailment) by G. This leads to three distinct >>solutions, S(?q) = a, S(?q) = X, and S(?q) = Y, because this three graphs >> >>{ (a, p, a), (X, p, Y), (X, p, X) } >>{ (a, p, a), (X, p, Y), (Y, p, Y) } >>{ (a, p, a), (X, p, Y) } >> >>are entailed by G. The problem (from my point of view) is that this example >>shows is that the claim currently in section 5.2 in the specification >>(rq24) >> >>"A pattern solution can then be defined as follows: to match a basic graph >>pattern under simple entailment, it is possible to proceed by finding a >>mapping from blank nodes and variables in the basic graph pattern to terms >>in the graph being matched; a pattern solution is then a mapping >>restricted to just the variables, possibly with blank nodes renamed. [...]" >> >>is not true. In my example, because Q does not have any blank node, the >>only mapping that one finds from ?q to the terms of G is ?q -> a. No, bnodes count as terms, so the terms in this example are a, X and Y , so the mappings one finds here are ?q->A, ?q->X and ?q->Y. Which is exactly what the definition gives one; so this does not seem to be a strict counterexample to the wording in the spec, at least on these grounds. However, Enrico's point is well taken, as the usual algorithms in use do not simply 'find a mapping' as the above quote says (and whatever that means exactly :-), but instead perform *pattern matching* against the graph; and if one does that, then the only binding one finds is indeed the (?q-> a) case, since the pattern (?q P ?q) does not match (X p Y). And I think indeed in this case that is exactly what should be the only answer, ie the pattern-matching wins over the current definition. [Later: And the other example in http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2006Aug/0005 makes the basic point more effectively.] However, consider the following example: same query, but against the graph {(a p a) (X p X) (X p b) } Here, I suggest, the answers ?q->a, ?q->X would be correct, in spite of the apparent redundancy, because the third triple clearly distinguishes (what is known about) X from (what is known about) a; so the redundancy is indeed only apparent. Pat -- --------------------------------------------------------------------- IHMC (850)434 8903 or (650)494 3973 home 40 South Alcaniz St. (850)202 4416 office Pensacola (850)202 4440 fax FL 32502 (850)291 0667 cell phayesAT-SIGNihmc.us http://www.ihmc.us/users/phayesReceived on Thursday, 10 August 2006 06:25:49 UTC

*
This archive was generated by hypermail 2.3.1
: Wednesday, 7 January 2015 15:00:51 UTC
*