Re: subgraph/entailment from Bijan Parsia on 2005-09-07 (public-rdf-dawg@w3.org from July to September 2005)

From: Bijan Parsia <bparsia@isr.umd.edu>
Date: Tue, 6 Sep 2005 20:13:16 -0400
To: Enrico Franconi <franconi@inf.unibz.it>
Cc: andy.seaborne@hp.com, RDF Data Access Working Group <public-rdf-dawg@w3.org>
Message-Id: <d461aa279d02ae1d69655a95fb18ec74@isr.umd.edu>

First, let me put myself *firmly* in the "entailment" camp. I got Andy  
(thanks Andy) to change it to that, then Pat got it changed back. I  
gave up arguing with Pat because, though I think it's an *awful* and  
confusing way to specify it, I don't think it's *technically* wrong.  
It's *SO* confusing that it's very easy to see why I get confused on  
it.

So, if we want new information, the confusion it engenders in experts  
due to its non-standard nature seems new. Either serious exposition or  
a change seems warranted.

On Sep 6, 2005, at 8:53 AM, Enrico Franconi wrote:

> On 6 Sep 2005, at 13:13, Seaborne, Andy wrote:
>>> I don't see why we need to maintain in the document the wording   
>>> "subgraph of"; we could write "entailed by" and say that the type of  
>>>  entailment is decided by the service.
>>> Why not?
>>
>> I am advised, by PatH, that subgraph is sufficient (he did actually  
>> draft this definition in rq23).  In previous discussions in the  
>> working group, using "entails" just begged the question of what sort  
>> of entailment so it left a gap in the definition of SPARQL that was  
>> subject to implementation interpretation.
>
> Well, PatH is plain wrong.
>
> Let me re-consider the example I gave in  
> <http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JulSep/0069>.
> Allow me to be sloppy in the syntax.
>
> OWL-Lite ontology, expressed in some RDF-based formalism:
> the class WORKER is declared equivalent to the union of the classes  
> EMPLOYEE and MANAGER:
> WORKER = EMPLOYEE \or MANAGER
>
> RDF data:
> Paul rdf:type WORKER
> Andrea rdf:type WORKER
> Simon rdf:type EMPLOYEE
> Caroline rdf:type MANAGER
> Paul ns:has-friend Andrea
> Paul ns:has-friend Simon
> Simon ns:has-friend Andrea
> Andrea ns:has-friend Caroline
>
> The query:
> Tell me the workers having a friend which is an employee, which in  
> turn should have a friend which is a manager.
>  q(X) :- worker(X), has-friend(X,Y), employee(Y), has-friend(Y,Z),  
> manager(Z).
>
> The answer is the set {Paul}.
>
> There is *no* way to complete the ontology+data in some graph so that  
> the answer to the query would come out from that completion (as a  
> subgraph), since there is reasoning by cases going on.

Let me try to proxy for Pat for a moment. (Actually, I don't believe any of  
this is in the spec, so maybe it's a useful exercise).

So, what counts as a "completion" of this graph wrt the OWL Lite  
consequence relation? If we think that the completion should be,  
roughly, something corresponding to an interpretation (actually a  
model), then while in every model Paul *will* be q, the *way* he will  
be q will be different. In some models, Andrea is an employee, and  
since she's friends with Caroline the manager, Paul's in. When Andrea  
is *not* an employee (she could be either, but must be one), she is  
friends with Simon the employee friend of Paul. So Paul is still in.  
So, if we think the completion of the RDF graph has to correspond to an  
interpretation like that (which is VERY PLAUSIBLE!!!!), then the  
subgraph of a completion of the graph (or *the virtual graph*) is  
bankrupt. I so argued this at the tech plenary.
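
A minimal sketch of the case analysis above, in Python. It assumes a closed domain of just the four named individuals and only the asserted has-friend edges (both are simplifications for illustration, not part of the example as stated). It enumerates every EMPLOYEE/MANAGER assignment consistent with the data and with WORKER = EMPLOYEE \or MANAGER, then compares the answers that hold in every model with the (X, Y, Z) witnesses that hold in every model.

```python
from itertools import product

INDIVIDUALS = ["Paul", "Andrea", "Simon", "Caroline"]
FRIEND = {("Paul", "Andrea"), ("Paul", "Simon"),
          ("Simon", "Andrea"), ("Andrea", "Caroline")}

def models():
    """Yield (employees, managers) pairs consistent with the data and the ontology."""
    n = len(INDIVIDUALS)
    for bits in product([False, True], repeat=2 * n):
        emp = {x for x, b in zip(INDIVIDUALS, bits[:n]) if b}
        mgr = {x for x, b in zip(INDIVIDUALS, bits[n:]) if b}
        # Asserted types: Simon is an EMPLOYEE, Caroline is a MANAGER.
        if "Simon" not in emp or "Caroline" not in mgr:
            continue
        # Paul and Andrea are WORKERs, so each must be an EMPLOYEE or a MANAGER.
        if not {"Paul", "Andrea"} <= (emp | mgr):
            continue
        yield emp, mgr

def witnesses(emp, mgr):
    """All (X, Y, Z) satisfying the body of q(X) in one model."""
    workers = emp | mgr
    return {(x, y, z)
            for (x, y) in FRIEND for (y2, z) in FRIEND
            if y == y2 and x in workers and y in emp and z in mgr}

per_model = [witnesses(emp, mgr) for emp, mgr in models()]
certain_answers = set.intersection(*[{x for x, _, _ in w} for w in per_model])
common_witnesses = set.intersection(*per_model)

print(certain_answers)   # {'Paul'}
print(common_witnesses)  # set()
```

Paul is an answer in every model, but no single binding of Y and Z works in all of them, which is the sense in which the *way* he is q differs from model to model.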

The other way is to ask whether *the formula*
	some(Y)some(Z)(worker(Paul), has-friend(Paul, Y), employee(Y),
	has-friend(Y, Z), manager(Z))
is in the deductive closure of the above ontology? Clearly yes. So, if  
I want to SPARQL query this ontology what must I do (conceptually, not  
practically)?

	1) Add *all* the (infinite) consequences to the set
		Note that these *won't* be concrete interpretations but will have
		lots of existential and disjunctive formulae,
	2) Take those formulae and serialize them in RDF a la the reverse of
		the transformation to the triples table in the OWL Semantics and
		Abstract Syntax
		NOTE! This means you can't query FOL or have an FOL strong query
		language, if the transformation has to be a same syntax semantic
		extension (see PFPS's IJCAI paper). While not harmful for OWL, it
		does seem to be a deal
		NOTE! This puts a HUGE burden on SPARQLOWL folks. I mean, for
		example, do I interpret the syntax triples according to OWL FULL?
		Ok, the answer for me is no, but it seems to be a minefield
	3) Match the subgraph on this extension
		NOTE! I've not worked out that this will *actually work* given the
		definitions. It might not if the syntax doesn't *match exactly*.
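
To see why step 1 cannot stop at ground triples, here is a rough Python sketch of the three steps under a deliberately naive assumption: the "closure" keeps only ground rdf:type triples derived from the two subclass halves of the equivalence (EMPLOYEE and MANAGER are each subclasses of WORKER). The helper names (`ground_closure`, `unify`, `match`) and the `?x` variable convention are made up for this illustration and are not taken from any spec.

```python
TYPE = "rdf:type"
FRIEND = "ns:has-friend"

DATA = {
    ("Paul", TYPE, "WORKER"), ("Andrea", TYPE, "WORKER"),
    ("Simon", TYPE, "EMPLOYEE"), ("Caroline", TYPE, "MANAGER"),
    ("Paul", FRIEND, "Andrea"), ("Paul", FRIEND, "Simon"),
    ("Simon", FRIEND, "Andrea"), ("Andrea", FRIEND, "Caroline"),
}
# Only the definite (non-disjunctive) part of WORKER = EMPLOYEE or MANAGER.
SUBCLASS = {("EMPLOYEE", "WORKER"), ("MANAGER", "WORKER")}

def ground_closure(triples):
    """Steps 1-2, naively: add ground type triples until nothing new appears."""
    closed = set(triples)
    while True:
        new = {(s, TYPE, sup) for (s, p, o) in closed if p == TYPE
               for (sub, sup) in SUBCLASS if o == sub} - closed
        if not new:
            return closed
        closed |= new

def unify(pattern, triple, binding):
    """Return an extended copy of binding if pattern matches triple, else None."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new

def match(patterns, graph, binding=None):
    """Step 3: yield every binding that maps all patterns onto triples of graph."""
    binding = {} if binding is None else binding
    if not patterns:
        yield binding
        return
    for triple in graph:
        extended = unify(patterns[0], triple, binding)
        if extended is not None:
            yield from match(patterns[1:], graph, extended)

QUERY = [("?x", TYPE, "WORKER"), ("?x", FRIEND, "?y"), ("?y", TYPE, "EMPLOYEE"),
         ("?y", FRIEND, "?z"), ("?z", TYPE, "MANAGER")]

closure = ground_closure(DATA)
answers = {b["?x"] for b in match(QUERY, closure)}
print(answers)  # set()
```

The ground closure has no triple saying which of EMPLOYEE or MANAGER Andrea is, so the subgraph match misses Paul. For Paul to be matched at all, the extension in step 1 would have to carry the existential and disjunctive consequences in some serialized form.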

> See my original email for a technical explanation. Quoting the  
> conclusion of that email:
> "The main thing that this simple example shows is that it is  
> impossible to find a unique completion (a unique deductive closure)  
> over which the query can be evaluated. There are *three* incompatible  
> completions, i.e., none of them is minimal (i.e., none of them is  
> included in all the others)."

I'm confused? The different cases are *not* deductive closures as they  
don't contain *everything* entailed by the original theory. And, in  
fact, they contain things *not* entailed by the theory (i.e., one  
contains that Manager(andrea), which is not entailed by the KB).
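
A small Python sketch of that point. I am assuming here that the three completions differ only in how Andrea is classified (employee only, manager only, or both); the original email may slice the cases differently. Atoms are written as hypothetical (class, individual) pairs.

```python
# Ground class-membership atoms shared by every case.
BASE = {("WORKER", "Paul"), ("WORKER", "Andrea"),
        ("WORKER", "Simon"), ("WORKER", "Caroline"),
        ("EMPLOYEE", "Simon"), ("MANAGER", "Caroline")}

CASES = [
    BASE | {("EMPLOYEE", "Andrea")},
    BASE | {("MANAGER", "Andrea")},
    BASE | {("EMPLOYEE", "Andrea"), ("MANAGER", "Andrea")},
]

# Ground atoms true in every case: the only ones the KB can entail here.
entailed_atoms = set.intersection(*CASES)

# What each case asserts beyond the entailed atoms.
extras = [case - entailed_atoms for case in CASES]

# The disjunction is true in every case, yet neither disjunct is entailed.
disjunction_holds = all(
    ("EMPLOYEE", "Andrea") in case or ("MANAGER", "Andrea") in case
    for case in CASES
)

print(("MANAGER", "Andrea") in entailed_atoms)  # False
print(all(extras))                              # True
print(disjunction_holds)                        # True
```

Each case adds at least one atom the KB does not entail, and none of them records the disjunction as such, so none of them is a deductive closure of the KB.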

> So, if you want to allow SPARQL to handle in the future such (simple)  
> cases like this, where you have (simple) RDF-based ontologies together  
> with plain RDF data, you have to drop the notion of subgraph and  
> resort to entailment. Please note that these cases are already well  
> understood, implemented, and tools are available. By limiting SPARQL  
> now, we will tell the ontology community to use another query language  
> for anything than RDF (and possibly RDFS), while SPARQL would be a  
> natural candidate for any RDF-based ontology language (like, e.g.,  
> OWL-DL).

Now this I buy :)

So, all argument here is future looking. We all agree we can handle RDF  
and RDFS with this wording. In a sense, that may satisfy the charter,  
but I think squeaking by on the charter is a real mistake. I don't  
believe the working group AS A WHOLE made an informed decision. Indeed,  
the history is *understandably* deferring to the last expert who didn't  
give up :)

However, that's not a good way forward (which is unfortunate, since I'm  
on the side of the current expert :)). However, I've talked to a number  
of experts and rising experts (Ian Horrocks, PFPS, Boris Motik, Birte  
Glimm, Jordan Katz, Enrico, Bernardo Grau, etc.). Plus, this isn't the  
"DL" way of doing things, but the standard way to specify a query  
language. Why add to the conceptual load? Esp. if the argument is that  
it *simplifies* the specification!!!! Even if it does in some abstract  
way, it does so with a HUGE, easy to misunderstand deviance from  
standard practice (afaict).

Cheers,
Bijan.
Received on Wednesday, 7 September 2005 00:13:21 UTC