Re: Semantics necessary not sufficient (was: Re: What is "the serious bug in entailment semantics" found by J. Perez"?) from Bijan Parsia on 2006-08-11 (public-rdf-dawg@w3.org from July to September 2006)

From: Bijan Parsia <bparsia@cs.man.ac.uk>
Date: Fri, 11 Aug 2006 22:28:25 +0100
To: Pat Hayes <phayes@ihmc.us>
Cc: Enrico Franconi <franconi@inf.unibz.it>, RDF Data Access Working Group <public-rdf-dawg@w3.org>
Message-Id: <9887AE6B-FEBB-45D4-993A-85B2E8E30637@cs.man.ac.uk>
On Aug 11, 2006, at 9:30 PM, Pat Hayes wrote:
[snip]
> Indeed, it would not. However, there would be a semantic constraint  
> based on entailment, which we could state first, and to which the  
> definition of answer set would be required to conform; and I  
> believe that this would be a better way to craft the design of the  
> spec. Basically, my motivation is that it seems to me that getting  
> an entailment definition exact, i.e. defining the answer set  
> *exactly*, and *purely* in terms of entailment, is not a fruitful  
> goal for us to be pursuing at this point in the state of this art.  
> It does not seem to be necessary for SPARQL to do this. It will not  
> be more precise than a procedural definition, only different in  
> style. We have a good chance of getting it subtly wrong: after all,  
> we already have done that several times. Even if we get it right,  
> the result will likely be so opaque that almost all implementers of  
> SPARQL engines will be obliged to use a simpler, more procedural,  
> description as their actual guide. You have indicated that you, in  
> fact, are already doing this: the goal of this exercise is to craft  
> a semantic/entailment definition which will agree with a procedural  
> description abstracted from current implementations. This  
> amounts to reverse-engineering a mathematical description from a  
> procedural one, and I really don't see the point of doing this for  
> the spec documents unless it is likely to produce some new insight  
> or simplification or a better exposition; none of which seem likely  
> at this point.

For what it's worth, once I hacked through the horrible exposition in  
the current document (with help from Enrico and Sergio) for my  
tutorial, there were some interesting insights (as you mention below,  
scoping sets are cool). The definition of an E-Entailment regime is  
similarly useful I think.

Of course, their major use is in definitions of alternative  
semantics. I think this point should not be put aside. Even within  
the bounds of our charter, we have simple, rdf, rdfs, and let's call  
it "assertional"/matching (possible) entailment relations to  
consider. I think the "virtual graph"/deductive closure approach  
suffers from a number of problems, and prefer an approach which  
refers to the semantics of the relations, which opens the door for  
extensions to OWL. Or I prefer an approach where each is defined  
distinctly (though I prefer that less than the general approach).

> If anything, the definitions we have been crafting simply make the  
> basic idea of pattern-matching more and more opaque. I do not mean  
> to argue against having a semantic analysis, in support of semantic  
> interoperability: but interoperability is served by the answer set  
> being (1) unique, and (2) semantically coherent. For the latter, it  
> is not necessary that the entire answer set be *defined*  
> semantically, only that the answers in it be required to *satisfy*  
> appropriate semantic conditions.

This is true. But we want some guarantees that the procedural  
definition yields the semantically correct answers.

[snip]
> I have another motive for this suggestion. As you may have been  
> able to figure out from my recent emails with Bijan, I am strongly  
> in favor of allowing SPARQL to deliver redundant answers when it is  
> dealing with a redundant KB, hence not requiring absolutely  
> irredundant answers.

Though these should, IMHO, be available from the language. Otherwise,  
it's very hard to say that we are truly doing RDF query. Of course,  
there are cases where you want access to the non-redundant graph *per  
se* (e.g., editors), but I'm not as sanguine about that as I used to  
be. Browsers (in the sense of portals) and editors are very  
different. Thus, I believe the problem of ensuring bnode stability  
throughout a "session" and getting exactly the redundancy in a graph  
should be separated. In fact, we should only use the latter to allow  
for specifying a standard for "acceptable" redundancy that is  
consistent across implementations (as Pat discusses below).

> On the other hand, it is clearly desirable to limit the unbounded  
> amount of redundancy that a simple entailment condition would  
> allow. Stating an appropriate compromise position between these  
> extremes is difficult when we are limited to using only semantic  
> notions and terminology, IMO largely because the very idea of  
> redundancy here is 'semantically invisible', i.e. a redundant graph  
> and its lean subgraphs are semantically indistinguishable. Hence  
> the need to protect the entailment-based definitions with notions  
> which really are not semantic at all, such as the scoping set. In  
> contrast, these scoping ideas are trivial to express in an  
> algorithmic framework, and the results are intuitively very clear  
> and easy to understand. So I think that this way of dividing up the  
> definitional work between a semantic necessity and a syntactic/ 
> algorithmic sufficiency might allow us to quickly and easily find a  
> way to deal with redundancy in answers which will be more or less  
> right for practical deployment.
[snip]

I do think that this should be *available* and the *default*, but I  
think too that if we don't make the irredundant answer sets available  
(if only by special user demand) then we aren't cohering with the  
semantics of RDF. I'll raise an issue and explain the point.
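
A minimal sketch of the redundancy at issue, assuming a toy representation
that appears in no SPARQL draft: triples are Python tuples of strings, blank
nodes start with "_:", and variables start with "?". The helper name `match`
and both example graphs are hypothetical, and the lean graph is written out by
hand rather than computed. Plain pattern matching over a redundant graph
returns an extra blank-node answer that its lean subgraph does not, even
though the two graphs simply entail each other.

```python
def is_var(term):
    """Variables are written with a leading '?' (an assumption of this sketch)."""
    return term.startswith("?")


def match(pattern, graph):
    """Return one binding per triple in `graph` that the pattern maps onto."""
    answers = []
    for triple in sorted(graph):  # sorted only to make the output order stable
        binding = {}
        ok = True
        for p, t in zip(pattern, triple):
            if is_var(p):
                # A repeated variable must bind to the same term each time.
                if binding.setdefault(p, t) != t:
                    ok = False
                    break
            elif p != t:
                ok = False
                break
        if ok:
            answers.append(binding)
    return answers


# Redundant graph: the blank node _:x says nothing that :b does not already say.
redundant = {(":a", ":knows", ":b"), (":a", ":knows", "_:x")}
# Its lean subgraph, written by hand for this example.
lean = {(":a", ":knows", ":b")}

pattern = (":a", ":knows", "?who")
redundant_answers = match(pattern, redundant)
lean_answers = match(pattern, lean)

print(redundant_answers)
print(lean_answers)
```

Every answer from the redundant graph is entailed by the graph, so the
semantic constraint is met either way; the two results differ only in how
much redundancy is delivered.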

Cheers,
Bijan.
Received on Friday, 11 August 2006 21:28:32 UTC