Re: Final text for Basic Graph Patterns from Pat Hayes on 2006-01-18 (public-rdf-dawg@w3.org from January to March 2006)

From: Pat Hayes <phayes@ihmc.us>
Date: Tue, 17 Jan 2006 21:58:34 -0600
To: Enrico Franconi <franconi@inf.unibz.it>
Cc: Dan Connolly <connolly@w3.org>, Bijan Parsia <bparsia@isr.umd.edu>, RDF Data Access Working Group <public-rdf-dawg@w3.org>
Message-Id: <p06230904bff3636d336f@[10.100.0.23]>
>On 17 Jan 2006, at 21:42, Pat Hayes wrote:
>>>>At this point it is becoming too late. We ask that either our 
>>>>text is used (with possible editorial changes to discuss),
>>>
>>>In fact, I thought we agreed to do this. I don't know of any 
>>>technical errors or issues with the proposed text, and many of 
>>>the tweaks and variations being proposed for "clarity" have fallen 
>>>down. Hard. There are a *lot* of complex and subtle issues. We 
>>>should go with what *works*.
>>
>>I agree. But I am not at all convinced, and have seen no argument, 
>>that it will work as stated, without any adjustment, for other 
>>entailments, in particular with OWL/RDF syntax and OWL entailment. 
>>I am pretty sure it will not, in fact. (The syntactic restriction 
>>to bnodeIDs in the dataset graph, that is hard-built-in to these 
>>definitions, seems to me likely not to be appropriate for 
>>OWL/RDF, which uses RDF bnodes in lists to encode OWL syntax. At 
>>the very least, it is possible to take a rational position which 
>>would make it inappropriate. If RIF finishes up with a rule 
>>language that allows terms to be composed by matching, then this 
>>restriction will be disastrously inappropriate.)
>
>In fact, you are somewhat right.
>But, you probably did not pay attention to our text:
>"In fact, with the proviso that bnodes are not allowed to appear in 
>pattern solutions, SPARQL may be extended to provide a way to 
>override the default "simple entailment" with "RDF entailment", 
>"RDFS entailment" (as defined in [RDF-MT]), and "OWL entailment" (as 
>defined in [OWL-Semantics], where the syntactic restrictions in 
>OWL-DL or OWL-Lite should be reflected in suitable syntactic 
>restrictions on the form of basic graph patterns)."
>
>So, we suggest an upward compatible extension *without* bnodes in 
>the answer, because it is the only one which we know is going to 
>work for sure (it is easy to see this). As our latest interactions 
>proved, it is an open research problem how to have a query language 
>with bnodes in the answer based on entailment which can also 
>smoothly work with simple entailment.

Hmm, but then answers will definitely be wrong in many cases, surely. 
At the least we should not call this rdf-, rdfs-, etc. querying, 
since it applies only to a very restricted form of such querying.

>>Given that the entire point of writing these definitions is to 
>>provide a smooth segue to OWL and Beyond; and given that the parts 
>>of the current definition which are potentially problematic are not 
>>semantically motivated but rather are concerned chiefly with 
>>reducing redundancy - which itself is an issue whose meaning 
>>changes when we change the notion of entailment, so will almost 
>>certainly need to be revisited when OWL-SPARQL is finally cast in 
>>stone - given all this, it seems to be worth spending some effort 
>>to try to find a way to phrase them which is less likely to get 
>>caught on some detailed snags later.
>
>We believe that our text provides such a solution.

I am still bothered by the possibilities of OWL-specific notions of 
redundancy in answer sets. Consider an OWL/RDF KB that asserts

:a rdf:type :A
:a rdf:type :B
:a rdf:type :C
:A rdf:type owl:Class
:B rdf:type owl:Class
:C rdf:type owl:Class

and an OWL query

SELECT ?x WHERE {:a rdf:type ?x .}

Now, this dataset OWL-entails the *existence* of a triple 
intersection and three double intersections, all with :a in them. So 
are these reasonable answer bindings for such a query? I see no good 
reason why they should not be: in fact, one could reasonably take the 
position that the *only* non-redundant answer here would be to bind 
?x to a term representing the most restrictive intersection. But to 
construct such a term would require the use of the RDF collection 
vocabulary when presenting the answer.
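
To make the count and the shape of such a term concrete, here is a 
minimal Python sketch. It is an illustration only: the tuple encoding 
of triples, the new_bnode and intersection_term helpers, and the _:bN 
labels are assumptions made for this example, not anything defined by 
SPARQL or OWL. It enumerates the intersections of :A, :B and :C that 
contain :a, then spells out the most restrictive one as an 
owl:intersectionOf over an RDF collection.

```python
from itertools import combinations, count

# Classes that :a is asserted to belong to in the example KB.
classes = [":A", ":B", ":C"]

# Every intersection of two or more of these classes has :a as a member.
intersections = [combo
                 for size in range(2, len(classes) + 1)
                 for combo in combinations(classes, size)]

_fresh = count(1)  # hypothetical generator of bnode labels


def new_bnode():
    """Return a fresh bnode label such as _:b1 (labels are illustrative)."""
    return f"_:b{next(_fresh)}"


def intersection_term(members):
    """Encode the intersection of `members` as triples.

    Returns (class_node, triples), where the members are listed with the
    RDF collection vocabulary (rdf:first / rdf:rest / rdf:nil).
    """
    cls = new_bnode()
    cells = [new_bnode() for _ in members]
    triples = [(cls, "rdf:type", "owl:Class"),
               (cls, "owl:intersectionOf", cells[0])]
    for i, (cell, member) in enumerate(zip(cells, members)):
        rest = cells[i + 1] if i + 1 < len(cells) else "rdf:nil"
        triples.append((cell, "rdf:first", member))
        triples.append((cell, "rdf:rest", rest))
    return cls, triples


most_restrictive = max(intersections, key=len)
node, triples = intersection_term(most_restrictive)
for triple in triples:
    print(*triple, ".")
```

The point of the sketch is that the answer term is a bnode standing 
at the head of several list triples, so presenting it means handing 
back bnodes and rdf:first/rdf:rest structure, not a single name.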

If you would argue that it is outside the scope of a query engine to 
construct such a term, consider

:a rdf:type :A
:b rdf:type :B
:c rdf:type :C
:A rdf:type owl:Class
:B rdf:type owl:Class
:C rdf:type owl:Class

SELECT ?x WHERE {:a rdf:type ?x .
:b rdf:type ?x .
:c rdf:type ?x .}

Surely it would be OWL-incorrect to give no answer here? After all, 
such a class certainly does exist, so the corresponding existential 
query

ASK {:a rdf:type _:x .
:b rdf:type _:x .
:c rdf:type _:x .}

should surely be TRUE, and under those circumstances it hardly seems 
sensible to refuse to provide at least a bnode answer to the SELECT 
query.
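
For contrast, here is a rough Python sketch of what plain subgraph 
matching (the simple entailment case) does with this dataset. The 
matcher, its is_var and match names, and the tuple encoding of 
triples are assumptions made for illustration; it performs no OWL 
reasoning at all, and it treats a bnode in the pattern exactly like a 
variable.

```python
def is_var(term):
    # Query variables (?x) and bnodes in the pattern (_:x) both act as variables.
    return term.startswith("?") or term.startswith("_:")


def match(pattern, graph, binding=None):
    """Yield each binding under which every pattern triple occurs in graph."""
    binding = binding or {}
    if not pattern:
        yield binding
        return
    first, rest = pattern[0], pattern[1:]
    for triple in graph:
        candidate = dict(binding)
        ok = True
        for p_term, g_term in zip(first, triple):
            if is_var(p_term):
                # Bind the variable, or check it against its earlier binding.
                if candidate.setdefault(p_term, g_term) != g_term:
                    ok = False
                    break
            elif p_term != g_term:
                ok = False
                break
        if ok:
            yield from match(rest, graph, candidate)


graph = {
    (":a", "rdf:type", ":A"),
    (":b", "rdf:type", ":B"),
    (":c", "rdf:type", ":C"),
    (":A", "rdf:type", "owl:Class"),
    (":B", "rdf:type", "owl:Class"),
    (":C", "rdf:type", "owl:Class"),
}

select_pattern = [(":a", "rdf:type", "?x"),
                  (":b", "rdf:type", "?x"),
                  (":c", "rdf:type", "?x")]
ask_pattern = [(s, p, "_:x") for (s, p, _) in select_pattern]

select_answers = list(match(select_pattern, graph))
ask_answer = any(True for _ in match(ask_pattern, graph))

print("SELECT solutions:", select_answers)
print("ASK result:", ask_answer)
```

On this dataset the sketch finds no SELECT solutions and answers the 
ASK with False, because no single term in the graph is the type of 
all of :a, :b and :c. That is the gap in question: the class whose 
existence OWL entailment guarantees is not a node of the dataset 
graph, so matching alone can never bind ?x or _:x to it.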

>In a sense, I agree with you that we would prefer to have time to do 
>*research* and solve this interesting open research problem. But a 
>working group aiming at a standardisation can not do that.

Quite: which is why SPARQL should stick to its charter :-)

>  We just use what we know for sure at best.
>
>>We've already seen that premature optimizations can come back to 
>>bite us (no literals in subject position, no bnodes in property 
>>position) and I really don't want something we do almost as a 
>>side-effect in SPARQL, and that is not required by our charter, to 
>>be used as a lever to limit what RIF is going to be able to do. 
>>("We can't have rules like that or SPARQL wouldn't work right.") 
>>I've been trying to find a more robust way to state the conditions 
>>which would clearly separate the necessary semantic conditions on 
>>answers, from aspects of the definition that are there only to 
>>reduce redundancy. These are tightly woven together in these 
>>definitions at present.
>
>Again, your motivations are good. Our characterisation covers 
>*exactly* subgraph matching, and can be smoothly extended (if bnodes 
>are not in the answer) to other entailments (including OWL-DL). This 
>makes everybody happy.
>
>[Technical conjecture:] Apparently, when bnodes are involved in the 
>answer, there is a substantial incompatibility between the semantics 
>that is needed for simple entailment (namely the semantics needed to 
>have a syntactic reading of the graph), and a semantics that would 
>be needed when standard forms of entailment are involved (e.g., RDF 
>entailment, OWL-DL entailment, etc). The latter could be 
>characterised with entailment+possibly some form of minimisation. 
>However, the discussions of these months showed that a semantics 
>based on proper entailment can hardly capture the syntactic nature 
>of simple entailment.

Interesting, I really do have different intuitions from you, I think. 
All entailments (below full 2nd-order logic) have SOME kind of 
syntactic account, which is why we can have completeness theorems. 
And I don't see simple entailment as being in a special separate 
category from 'proper' entailments: it is merely a very simple 
entailment form, complete with respect to a very simple semantics. So 
I see more of a continuum, with bnodes playing the same kind of role 
in all of them. In fact, one can characterize RDF simple entailment 
as the entailment corresponding to just the basic semantics of RDF 
graphs as such, the pure existential/conjunctive fragment, with no 
extra vocabulary conditions imposed on it: it really is almost 
entirely about bnodes. It is this very simplicity that made it 
feasible to construct the whole RDF layer-cake without having any 
explicit scope markers, a design decision that is now of course 
coming back to haunt us, when SPARQL has at least three distinct 
scopes in answer sets to keep track of.

>This is all due to the presence of bnodes in the answer set.
>
>>Let us agree that ANY wording we finish up with MUST support the 
>>same results for basic SPARQL patterns, i.e. for the simple 
>>entailment case.  In fact, I thought we had already agreed this 
>>some time ago. So far, they all do. This will be the only normative 
>>case in the current document. This should have been enabling 
>>progress to go forward on the algebra, independently from these 
>>discussions.
>
>Our current text can be proved to be equivalent to subgraph 
>matching, and it is normative only for the simple entailment case. 
>So we agree on everything, do we?

Well, I still prefer treating bnodes as 'blank variables', as it is 
clearer and simpler: and since you are proposing (you are correct, I 
had either not read this correctly or had forgotten, or maybe both) 
that bnodes in queries be ruled out for 'higher' entailments, and the 
definitions are equivalent up to RDFS, it seems that this choice is 
not too important either, right? But apart from this, I have no real 
quarrel with the definitions as stated.

I'd be interested in your reaction to the suggestion for how to 
define 'extensions'. I tried to keep the basic structure of the 
definition, but remove just the parts that might need to be 
reconsidered more carefully. But in the light of the above, perhaps I 
was trying to be too ambitious.

Pat


>
>cheers
>--e.


-- 
---------------------------------------------------------------------
IHMC		(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32502			(850)291 0667    cell
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Wednesday, 18 January 2006 03:58:44 UTC