
Re: First order logic and SPARQL

From: Bijan Parsia <bparsia@cs.man.ac.uk>
Date: Thu, 9 Sep 2010 12:44:01 +0100
Message-Id: <F91C398A-BE20-4177-B5CA-BEA807913852@cs.man.ac.uk>
To: Pat Hayes <phayes@ihmc.us>, www-archive@w3.org
On 8 Sep 2010, at 15:58, Pat Hayes wrote:

> OK, not wanting to start another flame war about nonmonotonicity,  
> but just for clarification:

Let's see if adding my message to this thread retracts that desire...

I think you've tied yourself up in knots for no good reason. The  
semantics of SPARQL (by which I mean the set of triples <Dataset,  
Query, Answers> sanctioned by the specification wrt RDF datasets) is  
rather well specified (with some exceptions, e.g., DESCRIBE). In the  
spec, this set is described primarily in terms of an algebra which is  
pretty much the relational algebra. Equivalently, Axel has shown that
it can be described in terms of entailment in (nonrecursive, I would
think, for simple entailment and SPARQL 1.0) Datalog with stratified
negation (well, I guess it's automatically stratified :)). If you
conceptualize the triple as an axiom schema such that Dataset |-  
Grounding(Query, Answer) (where Answer is drawn from the set of  
answers), then we have a clear consequence relation (or something  
reasonably modeled by a consequence relation). This consequence  
relation is clearly non-monotonic.

I would go further and say that this is a bog-standard and very  
useful way to conceptualize and reason about the situation (e.g.,  
complexity results fall out very naturally as do implementation  
techniques). There are other ways of conceptualizing it, but they
need to capture the same feature, to wit: there are cases where, when
you extend a Dataset with an additional triple, you lose "an answer",
i.e. a consequence.
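A minimal sketch in plain Python (no SPARQL engine; the data and the
"email" vocabulary are invented for illustration) of the point above:
extending a dataset with one triple removes an answer when the query
uses the OPTIONAL + !bound idiom, i.e., negation-as-failure.

```python
# Sketch: dataset extension losing an answer under negation-as-failure.
# Triples are plain (subject, predicate, object) tuples; the query asks
# for persons with no email, SPARQL's OPTIONAL/!bound pattern.

def answers_without_email(dataset):
    """Return subjects s such that (s, 'type', 'Person') holds and no
    triple (s, 'email', _) exists -- i.e., ?email is not bound."""
    persons = {s for (s, p, o) in dataset if p == "type" and o == "Person"}
    with_email = {s for (s, p, o) in dataset if p == "email"}
    return persons - with_email

d1 = {("alice", "type", "Person"), ("bob", "type", "Person"),
      ("bob", "email", "bob@example.org")}
print(sorted(answers_without_email(d1)))   # ['alice']

# Extend the dataset: a strict superset, no mutation, just a new set.
d2 = d1 | {("alice", "email", "alice@example.org")}
print(sorted(answers_without_email(d2)))   # [] -- the answer is lost
```

Nothing here appeals to state: d1 and d2 are distinct sets, d1 a
strict subset of d2, yet an answer (a consequence) disappears.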

It's useful, even if you don't find it congenial, to talk this way,
and it would be even if we didn't have a plethora of model-theoretic
techniques to cope with it. Default logic was understandable without
a model-theoretic account (and, indeed, for many variants, esp. as we
get more expressive, it's an ongoing challenge to find any
model-theoretic account, much less a useful one).

...
>> Indeed, in this case, you've obscured it. It's pretty clear that  
>> adding a triple to an RDF graph that does not contain it does not  
>> retract anything under the standard reading.
>
> Indeed, and Im sorry I didn't follow your point here, because I  
> agree with it. But I was thinking of a different point. If (as you  
> say, we can do this) we take the SPARQL conclusion to be written in  
> a new language, which we might call SPARQL-RDF (it doesn't make  
> sense to say SPARQL, since it has to be an assertional language)

I'm not sure why you feel this is important. But I don't think we  
have to go "assertional", at least as  I understand it. Yes, of  
course, SPARQL in some sense "asserts" stuff about various RDF KBs.  
But SPARQL doesn't allow you to assert things (as we currently use  
it) about other SPARQL queries. That is, we only see SPARQL in  
consequents (just as in the relational calculus we only see certain  
constructs in the consequents). It's pretty common to say that this  
makes SPARQL a query language :)

(Note that we don't have a deduction theorem. We have no object level
mechanism for turning an implication (I use this to emphasize the
inferential rather than semantic guise we're using) into a
conditional in the object language.)

We can, of course, do this. But then we're into a more full blown
rule language (as Axel has done time and again, and well).

> then the truth conditions for this language have to face up to the  
> fact that it refers to RDF graphs (by name, in some cases, but  
> always to some RDF graph) and asserts things about them, like that  
> they have or do not have triples in them.

The data set language (RDF) does not do this. SPARQL does do this  
(e.g., with !bound, and also with the somewhat odd case of GRAPH  
variables, but let's put them aside).

> An RDF graph is a set of triples. Adding a triple to a set creates  
> a *different* set.

But doesn't change the query.

> So when we start talking about monotonicity, we need to be very  
> careful what it means to add a triple to a graph. In RDF this is  
> simply conjunction, but if the language itself can talk about these  
> sets of triples, then adding a triple to the antecedent graph is a  
> lot more than a simple logical conjunction: it involves a change to  
> the antecedent.

Pat, I think you got a bit lost here. If I have the propositional
formula A and A |- B and then I conjoin something to A, e.g., to form
A&C, then, well, I have a change to the antecedent. Trivially, right?
The question is what sort of change and what happens as a result of
that change. In a monotonic logic, if A |- B, then also A&C |- B and
{A,C} |- B.

> This is what I meant by deleting something: again, I apologise for  
> the misleading carelessness in my expression.

Even if I conceptualize adding a conjunct as "deleting A and then
adding A&C", or "deleting {A} and adding {A,C}", it is still the case
that these formulae have a specific relation which I shall imprecisely
characterise as "A" is a subset of "A&C" (from the correspondence in
our logics between "A&C" and {A,C}). What we don't have is any
assertion of the form "C is not asserted (or implied)". That is, "A"
is not shorthand for {A, C is not asserted or implied}. It is *true*
that "C is not asserted (or implied)", which is why the query ~KC
(where K is an epistemic operator) evaluates to true against that KB.
That query evaluates to *false* against A&C or {A,C}. That is, ~KC
is not entailed/implied/a consequence of the KB. The entailment
relation |-, characterized as a set of pairs <KB, Q> where Q is
ground and evaluates to "True" against the KB, contains <{A}, ~KC>
but not <{A,C}, ~KC> nor <{A&C}, ~KC>. Thus, the consequence relation
is not monotonic: for some KB, there is a KB' such that KB is a strict
subset of KB' and where <KB, Q> is in |- whereas <KB', Q> is not in |-.

If I changed my logic so that I could only write what I currently  
write as {A} as {A, C is not asserted or implied}, then, assuming a  
fairly standard entailment relation (e.g., non-paraconsistent), I  
couldn't get to *not* entailing ~KC by merely adding things. I'd need  
to delete "C is not asserted or implied". But that's really not the  
logic we have. The fact that "C is not asserted or implied" is not  
syntactically manifest is precisely what puts the non-monotonicity  
into the entailment relation rather than into the expansion operation.
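The relation-of-pairs reading above can be written down directly.
Here is a hedged Python sketch: ~K is modelled as negation-as-failure
over a set of atoms, and the KBs {A} and {A,C} are the ones from the
text (everything else is invented scaffolding).

```python
# Sketch: |- as a set of pairs <KB, Q>, with ~K as negation-as-failure.
# KBs are frozensets of propositional atoms; a query is either an atom
# check or an epistemic non-membership ("not known") check.

def holds(kb, query):
    """Evaluate a ground query against kb. ('not_known', a) is ~Ka:
    true iff the atom a is not in (here: not derivable from) kb."""
    kind, atom = query
    if kind == "atom":
        return atom in kb
    if kind == "not_known":
        return atom not in kb
    raise ValueError(f"unknown query kind: {kind}")

kb1 = frozenset({"A"})
kb2 = frozenset({"A", "C"})   # a strict superset of kb1
q = ("not_known", "C")        # ~KC

print(holds(kb1, q))  # True:  <{A}, ~KC> is in |-
print(holds(kb2, q))  # False: <{A,C}, ~KC> is not in |-
```

kb1 is a strict subset of kb2, yet a consequence is lost; the
non-monotonicity lives in the evaluation, not in any mutation of kb1.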

(I strongly recommend Makinson's "Bridges from classical to
nonmonotonic logic". Very illuminating.)

> I wasn't thinking about 'nulls', which I agree is just another kind  
> of nonmonotonicity. (I think I am talking about what you call  
> epistemic reflection, by the way, though that term is too slippery  
> to be used with confidence.)

I don't see how it's slippery in this case or what one wouldn't be
confident about.
[snip]
> So was I. That was written to cover the possibility of extending  
> SPARQL to things like OWL entailment. To use this for SPARQL/RDF  
> entailment requires someone to first give truth conditions for  
> SPARQL/RDF.

Perhaps you would be happy to replace "entailment" with
"implication"? In any case, Axel has done this and pointed to his
paper many times: reduce to Datalog with stratified negation and then
use any of the standard preferred model semantics (since they
coincide there).
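For flavour, here is a toy stratified evaluator along the lines just
described: compute the lower stratum to a fixpoint, then apply
negation-as-failure against it. The little program
(edge/reachable/isolated) is an invented example, not Axel's actual
encoding of SPARQL.

```python
# Sketch: propositional Datalog with stratified negation, evaluated
# stratum by stratum.

def fixpoint(facts, rules):
    """Naive bottom-up evaluation of positive rules (head, body)."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if all(b in derived for b in body) and head not in derived:
                derived.add(head)
                changed = True
    return derived

def stratum1(lower, rules):
    """Apply rules (head, pos, neg) whose negative literals are checked
    only against the already-computed lower stratum."""
    derived = set(lower)
    for head, pos, neg in rules:
        if all(p in lower for p in pos) and all(n not in lower for n in neg):
            derived.add(head)
    return derived

s0_rules = [("reachable", ["edge"])]          # stratum 0: positive only
s1_rules = [("isolated", [], ["reachable"])]  # isolated :- not reachable.

s0 = fixpoint({"edge"}, s0_rules)
model = stratum1(s0, s1_rules)
print("isolated" in model)   # False: 'reachable' was derived below

# With no 'edge' fact, 'isolated' *is* derived; adding 'edge' to the
# facts removes it -- the non-monotonic behaviour again.
empty = stratum1(fixpoint(set(), s0_rules), s1_rules)
print("isolated" in empty)   # True
```

Since the program is stratified, the standard preferred-model
semantics (stable, well-founded, perfect) coincide on it, as noted
above.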

> Maybe someone has done this: I would be very interested to see it.

See Axel's paper.

>> In any case, the entailment analysis is more or less standard.
>
> I think not. At the very least it would require a model theory for  
> epistemic reflection.

I suspect the sticking point is that you want to reserve "entailment"  
for the semantic consequence relation. Fine. We do have that (see  
Axel's paper).

But we don't need it to describe a non-monotonic consequence  
relation. See the characterization of default logic in terms of fixed  
point equations (just for extensions! then we have whether we go for  
brave or cautious reasoning).

>> If you want to police the language so we say "Non-monotonic  
>> answer sets" or some such, well, ok. I don't think it's useful or  
>> particularly correct or particularly standard (and it's at odds  
>> with wide literatures, e.g., database), but whatever works for  
>> you :) The fact of the matter is that the semantics of not bound  
>> requires a non-monotonic consequence relation or "implicit  
>> retraction" (er...which seems just non-mon :)). And, in point of  
>> fact, you can lose answers by adding information to the dataset.
>
> You can lose answers; well, you lose some and you gain some. But I  
> don't think you lose any entailments, if you state the truth  
> conditions fully.

Would you be happy if we said we lose consequences? We also lose  
entailments wrt Datalog semantics.

> Adding information *changes* the dataset, which the answer *refers  
> to*.  It's not a simple conjunction of new information any more.

See above. It seems very odd to think that because the *question* is  
more sophisticated we've changed the *data*.

Another way to see it is that the query language is asking "Does C  
hold in the e.g., minimal model of the KB"? I don't have to change  
the KB. But the consequence relation which characterizes this query  
language is definitely nonmonotonic.

>>>>> Now, of course, I am being pedantic, since we all know that  
>>>>> this RDF graph is complete, so that if someone isn't listed  
>>>>> there, then they aren't serving on the subcommittee. But *that*  
>>>>> inference is not part of the RDF graph, is not represented by  
>>>>> the RDF graph, is not justified by the semantics of the RDF  
>>>>> graph, and is not used by the SPARQL machinery or justified by  
>>>>> the SPARQL semantics.
>>>>
>>>> I don't see that. (Even before, I thought !bound introduced  
>>>> non-monotonicity.)
>>>
>>> Like I say, I don't think it does, if we cleave to the strict  
>>> meaning of this term.
>>
>> I don't think it's a matter of being strict. That is, I don't see  
>> a reasonable reading which makes "!bound" monotonic.
>
> It's the one where you agree that a dataset really is a *set*, not a  
> thing with state which can retain its identity while having  
> information added to it.

As should be clear above, no one is appealing to stateful things. The  
nonmonotonicity is expressed in terms of relations between *sets* not  
in terms of mutation.

>>> And I agree I am being pedantic, but then semantics is a pedantic  
>>> business, as I'm sure you agree :-)
>>
>> Well, I don't object to pedantry. So let me be clear:
>>
>> 	By "!bound is a non-monotonic operator and thus SPARQL over RDF  
>> is non-monotonic" I mean that
>> 		if we define a consequence relation between RDF graphs and  
>> (e.g., boolean) SPARQL queries
>> 			that consequence relation is non-monotonic under a standard  
>> reading of RDF graphs and their expansion
>> If you change RDF graphs so that every triple not entailed by the  
>> graph is explicitly not in the graph such that adding a triple  
>> requires retracting its non-existence, I agree that the  
>> consequence relation can be monotonic, but then what seems like  
>> set theoretic addition is, in fact, revision.
>
> I don't want to make that change, but I still disagree that the  
> consequence relation is nonmonotonic, if you state the truth  
> conditions for the SPARQL conclusion precisely, and in a form which  
> acknowledges the spec's requirement that an RDF graph (note the  
> singular) is a *set* of triples. And of course that adding  
> something to a set gives you a new set, not the original set  
> 'enlarged'.

I really don't know where you're getting that from. By "expansion" I
mean the belief revision theory operation
(http://en.wikipedia.org/wiki/Belief_revision), which does not
involve mutation.

[snip]
>> Well, many wanted it to be more and it is more:
>> 	http://www.w3.org/TR/sparql11-entailment/
>>
>> I don't mean to rehash old battles, but I think this general line  
>> of analysis holds up pretty well and is proving fruitful.
>
> But it does have its problems, such as hallucinations of  
> nonmonotonicity... :-)

I really hope at this point you see that there's no hallucination.

OTOH, I don't know why the question wasn't immediately settled when
Axel pointed to his paper, which shows an answer-preserving reduction
from SPARQL 1.0 to Datalog with stratified negation. That just seems
to answer every point you might make.

>>>> It doesn't make the consequence relation between RDF graphs non- 
>>>> mon (obviously), but one can lose "NOT EXISTS" answers  
>>>> (entailments)
>>>
>>> Being an answer is not identical with being an entailment.
>>
>> True. An answer is a substitution into a query formula such that  
>> the grounded query is entailed (perhaps with some side conditions)  
>> from the KB.
>
> My point was less pedantic :-). There can be entailments which are  
> not (associated with) answers.

Sure, but then that's a different consequence relation. I can define  
consequence relations using the RDF model theory in a number of ways:

	A) I can restrict the syntactic form of the consequences
	B) I can change how the consequence relation is sensitive to the  
semantics of the graphs
		E.g., finite vs. infinite models
		Minimal vs. all models
	C) I can extend the syntactic form of the consequences
		I can make the extended syntax sensitive to various aspects of the  
RDF models (i.e., combine with B)
			E.g., I can count successor nodes in a relational structure.
			I can pay attention to certain classes of model (as above).
Given a "standard" consequence relation (say, simple entailment), I
can use these techniques to define a supra-standard (has more
consequences) or sub-standard (has fewer consequences) relation, or a
supra-sub one (missing some and adding some). Furthermore, if I have
a supra relation, I can change the monotonicity of the relation wrt
the added consequences while leaving the "classical" or "standard"
consequences alone. See (propositional) default logic.

It's reasonable to regard SPARQL as characterizing a supra-sub
relation. We restrict some consequences (e.g., so that any particular
query characterizes at most a finite number of answers) and we extend
others (by having a construct which is sensitive to non-entailment,
e.g., !bound). Those extended consequences can increase *or decrease*
when the KB queried against merely increases (that is, we replace it
with a superset).

Thus, again, that consequence relation is non-monotonic.

> Being entailed (in the above sense) is a necessary condition on an  
> answer, but not a sufficient one.  SPARQL itself omits many  
> entailments from its answer sets. So when the set of answers  
> becomes smaller, it does not automatically mean that the set of  
> entailments has shrunk.

I agree that the set of simple entailments of a dataset increases as  
the dataset increases. But, just as obviously, simple entailment  
alone doesn't characterize SPARQL. Similarly in propositional default  
logic, the set of *classical* consequences never shrinks as you  
extend the classical portion of the KB. But the defaultly inferred  
consequences *can*.
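The default-logic contrast above can be sketched concretely. In this
hedged sketch the single default (birds fly unless known to be
penguins) and all the predicate names are invented for illustration:

```python
# Sketch: classical consequences never shrink as the KB grows, but
# defaultly inferred consequences can.

def classical(kb):
    """The classical part: here, just the explicit facts."""
    return set(kb)

def with_defaults(kb):
    """Apply the default 'flies(x) :- bird(x), not penguin(x)'."""
    out = set(kb)
    for fact in kb:
        if fact.startswith("bird("):
            x = fact[len("bird("):-1]
            if f"penguin({x})" not in kb:
                out.add(f"flies({x})")
    return out

kb1 = {"bird(tweety)"}
kb2 = kb1 | {"penguin(tweety)"}   # strict superset of kb1

print(classical(kb1) <= classical(kb2))        # True: classical part grows
print("flies(tweety)" in with_defaults(kb1))   # True
print("flies(tweety)" in with_defaults(kb2))   # False: a default is lost
```

The classical consequences of kb2 include everything classical in
kb1, yet the default consequence flies(tweety) disappears, exactly
the pattern !bound and OPTIONAL exhibit over growing datasets.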

These are pretty central facts about OPTIONAL and !bound.

Cheers,
Bijan.
Received on Thursday, 9 September 2010 11:43:10 GMT
