The Entailment bit (was Re: thoughts from Tuesday telecon) from Bijan Parsia on 2005-09-23 (public-rdf-dawg@w3.org from July to September 2005)

From: Bijan Parsia <bparsia@isr.umd.edu>
Date: Fri, 23 Sep 2005 19:38:41 -0400
To: Pat Hayes <phayes@ihmc.us>
Cc: Dan Connolly <connolly@w3.org>, RDF Data Access Working Group <public-rdf-dawg@w3.org>, Ian Horrocks <horrocks@cs.man.ac.uk>, Enrico Franconi <franconi@inf.unibz.it>, "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>
Message-Id: <e2a745c78be080007b3f8d081013b5d9@isr.umd.edu>
On Sep 22, 2005, at 6:03 PM, Pat Hayes wrote:

> Here is my attempt to summarize the main points that were agreed on, 
> more or less, in the telecon discussion on Tuesday. I may not have got 
> them all, so if others' memories differ, please correct me. Also, this 
> does not claim to be a transcript, and it does not cover all the 
> issues that got raised, and some of it reflects thinking Ive done 
> since the telecon, including a suggested rewording of the 
> 'skolemization' description of bnode handling.
>
> Enrico, there is a direct question for you at @@@ below.
>
> -------
>
> 1. We can distinguish between the basic matching operation between a 
> simple query pattern and the KB, and the various filtering and 
> selections that may be done to the results of that basic operation, 
> which amount to treating the results as a table (or tables) and 
> manipulating it (them) in various ways, essentially algebraic. Most of 
> the discussion concerns that basic operation and how it should be 
> described. It is agreed that all the subsequent filtering or 
> value-checking operations on the result tables are NOT best regarded 
> as entailment.

Yes. Although I would add that I would like a clear presentation of the 
algebra in, perhaps, an appendix or even a separate document. E.g.,

	A list of primitive operators (obviously, there will be several 
choices, but still)
	Derivation of the rest
	The algebreic laws governing these operators
	Perhaps, a mapping into SQL or the relational algebra per se of those 
operators.

> 2. Right now the basic operation is described as follows. A 
> substitution B is a mapping from variables of a pattern Q to URIrefs, 
> literals or bnodes, and an answer is a binding with B(Q) a subgraph of 
> KB. There seems to be a strongly and widely held opinion that it would 
> be less confusing, or at least helpful to many folk, to describe this 
> in terms of simple entailment

Almost, I would like the wording to be "entails" where where the 
entailment relation may vary. I think the document to explicitly list 
simple with told-bnode reduncency, simple, and RDF entailment. I think 
it would be nice if it listed RDFS and the various owl entailments. 
Providing URIs for these would be excellent!

> rather than as matching to a subgraph: an answer is a binding B such 
> that B(Q) is simply entailed by KB. The two are nearly equivalent: see 
> below for more discussion.
> Moreover, many feel that this use of entailment language at the heart 
> of the specification would provide a better (possibly, an essential) 
> segway between SPARQL and future extensions of SPARQL to other 
> languages, by replacing simple or RDF entailment with more complex 
> notions of entailment, particularly OWL entailment.

Yes. I would like this to be in the spec as how you do that. This is 
independent of how you indicate what's in force for a particular query 
incident.

> 2a. The question arose: if SPARQL can be used (perhaps in the future, 
> but let us look ahead) with various notions of entailment, should a 
> query be able, or be required, to specify the kind of entailment 
> intended? Although this was discussed only briefly, there was a 
> consensus that it would be acceptable for the entailment in use to be 
> specified as part of the 'service', identified by the URI currently 
> used to name the target graph.

While I think that is a reasonable point in the design space, I talk 
with Jim and he flat out rejected it. So there's still play here. (I 
myself don't think it's the best design but do think it's workable.)

>  This means that the query language and protocol would not need to be 
> concerned directly with specifying which entailment is supposed to be 
> used.

Well, given that you identify endpoints with certain properties, I 
think the protocol is involved, but in a minimal way.

> Note however that this does go beyond the idea of this URI identifying 
> a virtual RDF graph, since for example OWL entailment cannot be 
> naturally reduced to RDF entailment from an OWL closure, even an 
> infinite virtual closure.

Well, it sorta can by taking the owl deductive closure if all the 
formulas 1) have a triple syntax (which they do) and 2) those syntax 
triples are part of the deductive closure. However, this makes it 
difficult to query OWL DL closures as such. I.e., you can't shove the 
triples in and have it just work out (you'll get more answers). So, 
pragmatically, it does go beyond.

> The spec should therefore be written so that querying against a URI 
> which identifies an OWL-querying service can provide bindings B such 
> that B(Q) is OWL-entailed by the KB, not (as currently) bindings B 
> such that B(Q) is simply entailed by a virtual RDF graph.

Yes.

> The important theoretical point here is that reducing X-entailment to 
> simple entailment by an X-closure has proven to be a useful technique 
> for RDF and RDFS, but is much less naturally extendable to languages 
> like OWL which have a nontrivial disjunction or union operator. (We 
> might argue about whether it is always *possible* to gloss 
> X-entailment as simple entailment from an X-closure, but it seems 
> clear that even if it can be done, it will be far clearer and more 
> natural not to bother, but instead simply refer to bindings which 
> achieve OWL entailment.)

Ah, strike the above. we agree.

> We are indebted to Enrico for making this point vividly clear with the 
> 'little-house' (aka 'worker') example.
>
> (I would suggest, and this is purely my own personal view, that we can 
> adopt a compromise here, in which SPARQL in its current release will 
> refer to simple entailment; the issue is pointed out; the actual spec. 
> refers to virtual graphs identified by URIs, and refers to the RDF and 
> RDFS closure lemmas; and the possibility of using URIs to identify 
> services which offer other kinds of entailment is pointed out as a 
> future extension path.

Hmm. Doesn't this bias things against Jim's desire for one and the same 
URI identified graph to be queried under different semantics? In other 
words, does this close discussion on that protocol design decision 
before alternatives have been considered?

> This allows conforming implementations which support only simple 
> matching and perform no inferences, and it allows more advanced 
> implementations certainly up to RDFS and perhaps OWL lite, and has a 
> very simple and straightforward extension path to the use of SPARQL 
> with more sophisticated OWL engines simply by rewriting part of the 
> text of the spec, without requiring changes to the actual SPARQL 
> language or protocol.)
[snip]
Isn't this the same as what I want above? I'm fine with the current 
draft only covering RDF (3 sorts of entailment), but would prefer that 
at least RDFS gets in. However, I see no technical reason not to grab 
the whole bag.

There is a charter prohibition, but I would propose altering that as it 
all comes out so nice.

One question worth answering is whether there will be implementor 
support at this time. I believe I can pledge that Pellet will support 
SPARQL over OWL DL. Indeed, if Jena's SPARQL implementation separates 
the graph matching and the rest of the algebra, I believe it's a tiny 
hookup for us. Ian can speak about Manchester intentions. I believe 
Boris would support at least the core query bits (if not all the XQuery 
bits). The racer folks have a rather different implementation, I 
believe, so I don't know what they'd do. I could take an action to 
investigate this, if the group so desires.

I'll note that this does meet, afaict, the needs of the OWL community 
for an ABox query language. TBox ("schema") query (a la the DIG 
protocol) is left for future work (as is appropriate). In RDF and RDFS, 
both sorts of query are answered by the virtual graph like approach.

Cheers,
Bijan.
Received on Friday, 23 September 2005 23:38:54 UTC