Re: subgraph/entailment from Pat Hayes on 2005-09-20 (public-rdf-dawg@w3.org from July to September 2005)

From: Pat Hayes <phayes@ihmc.us>
Date: Tue, 20 Sep 2005 15:54:32 -0500
To: Bijan Parsia <bparsia@isr.umd.edu>
Cc: RDF Data Access Working Group <public-rdf-dawg@w3.org>
Message-Id: <p06200705bf5617910c98@[10.100.0.9]>
>On Sep 19, 2005, at 5:32 PM, Pat Hayes wrote:
>
>>>On Sep 19, 2005, at 2:55 PM, Enrico Franconi wrote:
>>>
>>>>>From: Pat Hayes <phayes@ihmc.us>
>>>>>
>>>>>If one wishes to offer a query-answering service that does check 
>>>>>entailments, then the appropriate thing to do within the SPARQL 
>>>>>framework is to declare that one is matching queries against a 
>>>>>'virtual graph' which is some kind of logical closure of the 
>>>>>graph, rather than the graph itself. But these are different 
>>>>>graphs, note.
>>>>
>>>>Sure, we clearly understand that.
>>>
>>>Though it took some work to get there. I've been testing this 
>>>account with various people who are actively concerned with 
>>>querying, albeit typically against more expressive languages such 
>>>as OWL. *Serious* confusion ensues.
>>
>>Hmm, I wonder why. This has seemed kind of obvious since early in 
>>the entire RDF process, surely?
>
>Perhaps not.
>
>>  BTW, have you tried talking to people who are familiar with SQL querying?
>
>Yes.
>[snip]
>>>And so, I hope, give some evidence that it is entailment-neutral 
>>>in fact and in practice. I have users who would like to use SPARQL 
>>>to query documents understood with "told", simple, RDF, RDFS, and 
>>>OWL semantics.
>>
>>The ones who want it to have OWL semantics have a lot more work to 
>>do. BTW, I take it that when you say "use with X semantics" what 
>>you mean is that you want it to be the case that a query Q succeeds 
>>against a graph G when G X-entails Q, right?
>
>Well, some variant of that, yes. Typically, a query Q succeeds 
>against a graph G when G X-entails a substitution of Q (that 
>substitution is the binding).

Well, sure: but there's no need to mention the binding in this case 
as all these various entailments extend simple entailment. So if we 
go with the entailment way of stating it, then we needn't bother with 
talking about bindings and substitutions explicitly.

Entailment-talk is indeed simpler, once you are so familiar with what 
it means that it doesn't need to be explained further. If you are not 
so familiar with it, however, you have to look it up in order to 
discover that it means that an instance of the pattern is a sub-graph 
of the KB. Then you have at least a handle on how to implement it, 
whereas being told that it is entailment tells you essentially 
nothing.
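
For anyone who wants the "instance of the pattern is a sub-graph" reading spelled out, here is a minimal Python sketch. It is not taken from any SPARQL draft: the tuple representation of triples, the convention that a leading '?' marks a variable, and the name match_pattern are all assumptions made purely for illustration, and blank nodes are ignored entirely.

```python
def is_var(term):
    """Treat any string starting with '?' as a query variable (assumed convention)."""
    return isinstance(term, str) and term.startswith("?")


def match_pattern(pattern, graph, binding=None):
    """Yield each binding under which the instantiated pattern is a subset of graph.

    pattern: list of (s, p, o) triples that may contain variables.
    graph:   set of ground (s, p, o) triples.
    """
    binding = dict(binding or {})
    if not pattern:
        # Every triple pattern has been matched against some triple in the graph.
        yield binding
        return
    first, rest = pattern[0], pattern[1:]
    for triple in graph:
        candidate = dict(binding)
        ok = True
        for pat_term, term in zip(first, triple):
            if is_var(pat_term):
                # Bind the variable, or check it agrees with an earlier binding.
                if candidate.setdefault(pat_term, term) != term:
                    ok = False
                    break
            elif pat_term != term:
                ok = False
                break
        if ok:
            yield from match_pattern(rest, graph, candidate)


# Hypothetical data, for illustration only.
graph = {
    ("ex:pat", "ex:knows", "ex:bijan"),
    ("ex:bijan", "ex:knows", "ex:enrico"),
}
pattern = [("?x", "ex:knows", "?y"), ("?y", "ex:knows", "?z")]
solutions = list(match_pattern(pattern, graph))
```

Nothing here mentions entailment: a solution is just a substitution that turns the pattern into a subset of the triples in the graph.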

(I am astonished to hear that so many expert and competent logicians 
found this definition confusing, by the way. It might have been 
slightly unfamiliar, but confusing? Really?? Given Enrico's masterly 
summary, he doesn't seem to have been confused for long.)

>I think that for SPARQL a kind of two level approach is in the 
>offing, where graph patterns and solutions are specified in terms of 
>entailment and then the rest of SPARQL is treated like an algebra on 
>the result sets. Since result sets are more like tables and the 
>things SPARQL can do are pretty expressive (and a tad hairy) I'm 
>feeling pretty comfortable with this approach.

It seems to me like a dog's dinner. If the 'rest' of SPARQL (which is 
virtually all of SPARQL) is an algebra, why not treat the entire 
thing as an algebra? Hardly any actual uses of SPARQL are going to 
fit into the tiny core that is legitimately describable as entailment.

But this is essentially an aesthetic judgment, and my taste seems to 
be in the minority, and aesthetics are not really important anyway in 
a spec document, OK.

>>  Because if all you mean is that G might contain some X content, 
>>and SPARQL will reproduce this content at least to the extent of 
>>not distorting it, then I would claim that SPARQL works correctly 
>>right now for all these languages. (Well, let me modify this. We 
>>could do a better job on RDF collections, maybe. But that isn't the 
>>point being argued in these threads.) Of course, if you want to use 
>>SPARQL to query an OWL-RDF graph, then you had better know OWL-RDF 
>>syntax and conventions, and be prepared to do some OWL-savvy work 
>>on the answers you get back in order to have them make OWL sense.
>
>I am somewhat concerned with people who are manipulating OWL 
>documents in various ways.
>
>>  But then, surely you should not expect otherwise: SPARQL is an RDF 
>>query language, not an OWL query language.
>
>I don't find that satisfactory. I'm prepared for there to be limits, 
>but I would like the extensions to be straightforward where 
>reasonable

They are straightforward up to RDFS. They are not completely 
straightforward for any language, like OWL, that has a nontrivial 
disjunction operator. But this, I would claim, has to do with the 
nature of OWL inference, not the task of OWL querying. It would be 
possible to make an OWL-SPARQL extension which would treat OWL 
legitimately, be OWL correct (and would deal with things like the RDF 
uses of rdf:collections to encode owl:unionOf and so on, and all the 
DL restrictions and other OWL complications), and be in the same 
spirit as SPARQL: and yet it would be incomplete wrt OWL entailment; 
and it would, IMO, actually be of more utility as a consequence. Once 
you deal with expressive languages, inference becomes significantly 
costly, and once it does, there is a pragmatic need to distinguish 
between querying what is actually *in* a KB and asking what 
*follows from* a KB. The difference in cost is too great for it to be 
acceptable to simply subsume the former as a special case of the 
latter. If you like, think of it as providing the ability to make a 
'quick check' of the KB. In the 'little house' example, the quick 
OWL-SPARQL response should be that there are no matches. And I would 
defend this as correct and appropriate for an OWL query (*query*) 
language, and will predict that once an OWL query language is 
designed, users will insist that this option - of, essentially, 
switching off entailment other than simple entailment - be made 
available for some applications. If you know already that your 
information is formatted in some way, or closed in some way, and if 
there is a lot of it, and you are in a hurry, the last thing you want 
is to be *obliged* to wait for some inference engine to spend 
o(e|n|2) cycles checking to see if it can find another answer by some 
devious analysis by cases.
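
A rough Python sketch of the difference between asking what is *in* a KB and asking what *follows from* it is given here. It is illustrative only: it uses just two RDFS rules (rdfs9 and rdfs11), not the full RDFS closure and nothing like OWL's analysis by cases, and the data and the name partial_rdfs_closure are hypothetical.

```python
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def partial_rdfs_closure(graph):
    """Return graph plus triples derived by two RDFS rules (rdfs9 and rdfs11 only)."""
    closed = set(graph)
    changed = True
    while changed:
        changed = False
        derived = set()
        for (s, p, o) in closed:
            if p != SUBCLASS:
                continue
            for (s2, p2, o2) in closed:
                # rdfs11: subClassOf is transitive.
                if p2 == SUBCLASS and s2 == o:
                    derived.add((s, SUBCLASS, o2))
                # rdfs9: an instance of a subclass is an instance of the superclass.
                if p2 == TYPE and o2 == s:
                    derived.add((s2, TYPE, o))
        if not derived <= closed:
            closed |= derived
            changed = True
    return closed


# Hypothetical "told" graph, for illustration only.
told = {
    ("ex:rex", TYPE, "ex:Dog"),
    ("ex:Dog", SUBCLASS, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS, "ex:Animal"),
}
query = ("ex:rex", TYPE, "ex:Animal")

in_told = query in told                           # quick check: no match
in_virtual = query in partial_rdfs_closure(told)  # match, after computing the closure
```

The quick check is a set lookup; the second answer needs a fixpoint computation first, and that cost only gets worse as the language gets more expressive.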

BTW, this is a very old point, and a very basic one in AI/KR. It's 
really an instance of the idea of 'logic+control' that Kowalski and I 
both argued for when logic programming was being invented. Logic is 
great, semantics are vitally important: but they aren't enough by 
themselves. You have to give users a way to control the logic, when 
they know more about the information than the inference machinery 
does. Querying shouldn't just be a matter of tossing a goal sentence 
to a theorem-prover.

>and not artificial when impossible. Just as RDF (sorta almost) plays 
>the role of a "data", "assertional", "abox" language for OWL-DL, so 
>too SPARQL should be able to query that aspect of OWL-DL documents 
>and be correct wrt the OWL-DL semantics.

Well, I claim that is correct. It may not be - is not - complete, in 
some sense: there will be valid OWL inferences it does not perform 
for you, because it doesn't do inference at all, but it won't distort 
or break anything.

>Even if the working group isn't taking the complete specification of 
>this in this round, every discussion I've had with members (except 
>this email exchange) has indicated that it was on the list for the 
>next round and that SPARQL is intended to be forward compatible.

It is forward compatible. But it is also an RDF query language. This 
exchange isn't getting anywhere, probably because terms like 'forward 
compatible' don't have any useful clear meaning. Can you be more 
precise?

>Point for point is perhaps not the happiest at this point as it gets 
>tiresome for onlookers. I've read your message and am prepared to 
>discuss it tomorrow.
>
>There is one additional bit I wish to respond to in this reply though.
>[snip]
>>>An entailment based account could help (and has the advantage of 
>>>being familiar to the implementors I've chatted with for those 
>>>more expressive languages
>>
>>Look, we can all swap anecdotes about people who we have chatted 
>>with. Such talk is pointless.
>
>And yet you go on to talk such talk.
>
>>For myself, apart from the small group of logic police,
>
>I hope that you will be more judicious with your words while you are chairing.

You guys are soooo sensitive. Would "logic mafia" (a more widely used 
term) be better?

>>  whose objections I can script in advance, *nobody* I have chatted 
>>with has evinced the slightest interest in any formal semantic 
>>issues at all. They tend to regard such matters as arcane academic 
>>baloney.
>
>Pat, I have users and I have colleagues and I have collaborators. I 
>represent an organization that pays fees to be a member in the W3C 
>and supports semantic web work with an enormous amount of time and 
>effort at many levels. We have priorities and mandates and desires. 
>My job is to represent them in this group.

OK, fair enough. Just out of interest, what organization is that?

>When I talk to implementors like Boris Motik (Kaon2) whose system I 
>am expected to evaluate for my users, and he, a very capable query 
>person, is baffled and repelled by the specification, my job is 
>harder. I don't *care* if that opens us up for sneering. Let 
>whomever wants to sneer, sneer and jeer and leer away.
>
>>(See for example 
>>http://www.shirky.com/writings/semantic_syllogism.html for a 
>>typical attitude,
>
>Too bad that that article is a complete and total joke. I do not 
>mean the attitude, I mean the content.

But look, this is the audience we have to communicate with. I'm not 
endorsing the content, only using it as an illustration of the 
attitudes that we will find out there in the heads of the people who 
will be implementing some of this stuff. (I've had several 'right on!' 
emails in response to that posting, by the way.) If you use terms 
like 'entailment' and then tell them that there are six (so far, and 
counting) distinct kinds of entailment defined, that they have to get 
right and/or choose between, how many converts will you get to the 
great SWeb cause? We are in danger of creating a logician's 
mini-paradise here that will simply get ignored by the rest of the 
planet.

>>or http://www.disobey.com/detergent/2002/sw123/ for a more 
>>enthusiastic one. Both among the first 20 hits on 'semantic web' .)
>
>Er...yes? This is "I like RDF and dislike RDF/XML and had trouble 
>getting into RDF because of RDF/XML". That's fine. I don't see it as 
>anti-more expressiveness. Oh I see. He doesn't mention inference et 
>al. So?

So, do you want (folk like) him to be able to read the spec and 
implement a simple SPARQL processor? Which do you think is going to 
make most sense to him: to read about substitutions for variables and 
subsets of triples, or to read about entailment, with references to 
two documents describing six model theories, three of which are 
described in a style so dense and opaquely written that at least two 
professional academic logicians I know have given it up as unreadable?

>In either case, neither of those people pay me or fund my 
>organization. The ones that do are interested in inference, often to 
>Jim's dismay ;) (He got over it...you can too!)
>
>>>). A virtual graph approach, suitably described, might work as 
>>>well for these audiences. But the current document is *not* 
>>>adequate on this front.
>>
>>I wish to know why, and in what regard, it is inadequate.
>
>For a set of people who are reasonably to be considered both 
>interested and capable (in a general way) and who are likely to 
>implement interesting query languages, the current spec produces a 
>lot of confusion. Enrico is a good example. I'm another (I had to 
>have a conversation or two with you!) Evren Sirin is another (I just 
>explained the virtual graph bit to him).
>
>>All the objections I have read so far have been to the effect that 
>>there MUST be a semantic story which makes querying into a species 
>>of entailment-checking:
>
>That's not my point. My point is that if you are going to *not* do that, it 
>would be helpful to be *clearer* about how the current approach 
>works and how it is, or is not, going to apply to more expressive 
>languages.

What does it mean to apply to a more expressive language? It will 
apply as it stands to any - read my lips, *any* - RDF graph. If your 
language is encoded in RDF, and you know how the encoding works, then 
you can use that knowledge plus SPARQL to get you somewhere; but 
SPARQL isn't going to do that work for you, or magically extend 
itself to be a general-purpose logical inference protocol. What else 
did you expect, for goodness' sake?

Pat
-- 
---------------------------------------------------------------------
IHMC		(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32502			(850)291 0667    cell
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Tuesday, 20 September 2005 20:55:39 UTC