- From: Pat Hayes <phayes@ihmc.us>
- Date: Wed, 9 Aug 2006 23:25:33 -0700
- To: Enrico Franconi <franconi@inf.unibz.it>, RDF Data Access Working Group <public-rdf-dawg@w3.org>
>Basically the definition of "Basic Graph Pattern
>E-matching"
><http://www.w3.org/2001/sw/DataAccess/rq23/rq24.html#BGPgeneral>
>is not equivalent to the expected implementation
>as described in
><http://www.w3.org/2001/sw/DataAccess/rq23/rq24.html#BGPsparql>.
Actually I think it is, but the issue still needs
to be discussed. See end of this message for
details.
>This means that the definition of "Basic Graph
>Pattern E-matching" has to be changed.
Or, maybe, play a different, and weaker, role?
Please, allow me to make the case for this
proposal before dismissing it out of hand.
First, it is clear that the implementations use
an essentially syntactic criterion for matching,
rather than any kind of entailment relationship.
This syntactic process is easy to describe,
relatively easy to implement, and easy to
understand. It can be defined quite precisely in
terms of an algorithm. All of these (and other)
considerations motivate using this as the
*normative defining* criterion in the spec for
what constitutes an answer set.
However, we need to relate it to the semantics.
For simple entailment, this algorithmic
definition bears a simple relationship to simple
entailment which is easy to establish and is
closely related to the core of the current
semantic definition: the conjunction of the
scoping graph and all answer instances of the
query pattern is simply entailed by the KB. This
semantic condition then becomes a *necessary but
possibly not sufficient* condition on an answer
set; that is, it does not *define* the answer
set, but merely imposes a semantic constraint
upon it. The answer set may be *defined* in some
other way - for current purposes, by a matching
algorithm - and the spec should then show that
this definition does indeed conform to the
semantic constraint.
The general form of a query semantics in the
SPARQL spec then would be that any extended
notion of a query answer must (1) be
well-defined, *somehow* - we need not say how -
and (2) satisfy the general semantic constraint,
which is like the above but with a different
notion of 'entailment' inserted. This leaves the
door open for a wide range of options, including
Bijan's suggested restriction of disallowing
bnode bindings in the DL cases (although I will
continue to argue against this on different
grounds, and on a different thread) but also many
others, since by demoting the semantic machinery
from a parameterized *definition* to a
parameterized *constraint*, it allows a wider
variety of ways to define an answer set for a
particular regime.
A possible objection is that this means that the
choice of an entailment does not, in and of
itself, specify a version of SPARQL. True; but we
are already in that situation, since the current
definition allows the same entailment to be used
with different notions of scoping set: and we
recently saw that this could produce
misunderstandings about exactly what a particular
entailment regime would, er, entail for the
definition of 'answer'. And in this suggested
framework, at least the actual semantic
definitions would have only the single parameter
of an entailment type; other differences would be
viewed (IMO, correctly) as not changing the
semantics but rather being differences in how
best to computer answer sets which are
semantically appropriate to the entailment regime
in use.
And finally, this would allow us to move forward
with minimal changes, preserving the essential
semantic foundations, without needing to wait on
getting what seems to be quite tricky, and new,
science done.
>Jorge Pérez wrote:
>
>>In the following I will write blank nodes in uppercase and
>>uris in lowercase (just to simplify notation).
>>
>>Consider the graph
>>
>>G = { (a, p, a), (X, p, Y) }
>>
>>and the basic graph pattern query
>>
>>Q = { (?q, p, ?q) }.
>>
>>I'm considering simple entailment and Q does not have any blank node, so
>>we can choose G itself to be the scoping graph and the terms of G to be
>>the scoping set, and then any replacement of the single variable in Q
>>results in a well formed RDF graph for simple entailment. So applying the
>>definitions in the current specification of SPARQL, one obtains that a
>>mapping S is a solution if the graph
>>
>>{ (a, p, a), (X, p, Y), (S(?q), p, S(?q)) }
>>
>>is entailed (simple entailment) by G. This leads to three distinct
>>solutions, S(?q) = a, S(?q) = X, and S(?q) = Y, because this three graphs
>>
>>{ (a, p, a), (X, p, Y), (X, p, X) }
>>{ (a, p, a), (X, p, Y), (Y, p, Y) }
>>{ (a, p, a), (X, p, Y) }
>>
>>are entailed by G. The problem (from my point of view) is that this example
>>shows is that the claim currently in section 5.2 in the specification
>>(rq24)
>>
>>"A pattern solution can then be defined as follows: to match a basic graph
>>pattern under simple entailment, it is possible to proceed by finding a
>>mapping from blank nodes and variables in the basic graph pattern to terms
>>in the graph being matched; a pattern solution is then a mapping
>>restricted to just the variables, possibly with blank nodes renamed. [...]"
>>
>>is not true. In my example, because Q does not have any blank node, the
>>only mapping that one finds from ?q to the terms of G is ?q -> a.
No, bnodes count as terms, so the terms in this
example are a, X and Y , so the mappings one
finds here are ?q->A, ?q->X and ?q->Y. Which is
exactly what the definition gives one; so this
does not seem to be a strict counterexample to
the wording in the spec, at least on these
grounds.
However, Enrico's point is well taken, as the
usual algorithms in use do not simply 'find a
mapping' as the above quote says (and whatever
that means exactly :-), but instead perform
*pattern matching* against the graph; and if one
does that, then the only binding one finds is
indeed the (?q-> a) case, since the pattern (?q P
?q) does not match (X p Y). And I think indeed in
this case that is exactly what should be the only
answer, ie the pattern-matching wins over the
current definition.
[Later: And the other example in
http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2006Aug/0005
makes the basic point more effectively.]
However, consider the following example: same query, but against the graph
{(a p a) (X p X) (X p b) }
Here, I suggest, the answers ?q->a, ?q->X would
be correct, in spite of the apparent redundancy,
because the third triple clearly distinguishes
(what is known about) X from (what is known
about) a; so the redundancy is indeed only
apparent.
Pat
--
---------------------------------------------------------------------
IHMC (850)434 8903 or (650)494 3973 home
40 South Alcaniz St. (850)202 4416 office
Pensacola (850)202 4440 fax
FL 32502 (850)291 0667 cell
phayesAT-SIGNihmc.us http://www.ihmc.us/users/phayes
Received on Thursday, 10 August 2006 06:25:49 UTC