Re: subgraph/entailment from Pat Hayes on 2005-09-20 (public-rdf-dawg@w3.org from July to September 2005)

From: Pat Hayes <phayes@ihmc.us>
Date: Tue, 20 Sep 2005 14:39:02 -0500
To: Enrico Franconi <franconi@inf.unibz.it>
Cc: RDF Data Access Working Group <public-rdf-dawg@w3.org>
Message-Id: <p06200704bf55f047d775@[10.100.0.9]>
>On 19 Sep 2005, at 23:32, Pat Hayes wrote:
>
>>Most of the people working on these applications will not be using 
>>more expressive languages in any depth, and are not interested in 
>>inference or entailment.
>
>I find this statement very dangerous to say to a semantic web 
>community. Of course, we want efficient systems, but ideally we also 
>want a semantics behind them. Having no interest in semantics or 
>entailment really means that there should be no interest in 
>*semantics* seriously speaking, so why bother with your long RDF-MT 
>document :-)

(Although we have all kind of agreed on things in the recent telecon, 
this point deserves a reply, as it is of larger importance.)

First, I said they were not interested in entailment or inference, 
not that semantics was irrelevant to them. But maybe that seems 
pedantic.

I agree that formal semantics is important and central to the SWeb, 
and that there is a semantics "behind" SW applications. However, I 
think that formal semantics is a tool to be used, not an idol to be 
worshipped. It is important that the languages used for publishing 
and distributing content on the SWeb have a formal semantics, because 
it is that semantics which provides the touchstone for meanings which 
are preserved between the (arbitrarily remotely separated) events of 
publication and access. It is exactly this separation between 
publication - assertion, if you like - and access - the drawing of a 
conclusion - which necessitates the use of a global semantic 
criterion of meaning, and hence the need for a model theory. It is 
far less clear, however, that there is such a need for a query 
language to be closely tied to a model theory, for two reasons: 
first, the entire business of querying and receiving a response is, 
by its very nature, not separated in this way. (The fact that there is a 
use case for allowing queries to use bnodes which are scoped to the 
'session' illustrates this closeness, this immediacy of the process 
of querying and responding.) And secondly, in my view it is precisely 
the fact that the assertional languages have a model theory, which 
makes it unnecessary for the query language to be defined in terms of 
it. There is no ambiguity about the meaning of some RDF or RDFS or 
OWL/RDF, after all, that needs to be resolved by a semantic 
specification of the operation of a query language: such a semantic 
specification does not fix meanings which would otherwise be 
indeterminate or unknown. The model theories (and RDF embeddings) of 
the languages themselves provide a complete semantic account, which 
frees up the query language to be more sensitive to syntactic and 
even ad-hoc nuances in the knowledge-bases themselves, and to the 
purpose of querying.  Take as an example the recent threads on the 
topic of whether a SPARQL query should give answers corresponding to 
RDF tautologies, such as

_:x rdf:type rdfs:ContainerMembershipProperty .

If one takes entailment as central, then the answer is clearly yes, 
since this is RDF-entailed by any graph, even the empty graph: end of 
story. But what utility is provided by such a stance? Surely, if you 
are querying a knowledge base in RDF, you already know RDF, so you 
already know RDF tautologies. The whole point of querying is to find 
out what *knowledge* is encoded in the knowledge source being 
queried. Ironically, a tautology provides no knowledge, no 
information at all, precisely by virtue of being a tautology within a 
publicly standard semantics. So it seems to me that to insist that a 
query like

?x rdf:type rdfs:ContainerMembershipProperty .

should give a response when tested against any RDF graph, is not a 
useful stance to take, in practice. In fact, it is so useless that I 
would wager that if we were to incorporate this into the standard 
then it would be widely ignored, and nobody would really care, and 
the SWeb will carry on doing part of the world's business quite 
happily, while technically getting the logic wrong. And I would also 
claim - and this I mean quite seriously - that in the event of this 
occurring, it would be the logicians who were in the wrong. All of 
our logics and semantics must eventually be judged by their pragmatic 
utility, rather than treated as an absolute standard against which 
the world must be judged. This is why I am rather unconvinced, and 
even rather repelled, by arguments along the lines that something 
must be done in a certain way because that way is 'elegant', or 
because it is the 'standard' way, or because to do it this way will 
enable us to do things just like we have always done them before.
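
To make the contrast concrete, here is a minimal sketch, not anything 
taken from the SPARQL drafts. It assumes triples are plain tuples and 
variables are strings starting with "?". The names match, 
match_with_entailment and AXIOMATIC are hypothetical, and AXIOMATIC is 
only a finite slice of the axiomatic triples (the real set runs over 
rdf:_1, rdf:_2, ... without end). Simple matching against the empty 
graph finds nothing; matching against the graph plus the axiomatic 
triples always finds the same answers, whatever graph is queried.

```python
# Illustrative sketch only: triples are tuples, variables start with "?".
RDF_TYPE = "rdf:type"
CMP = "rdfs:ContainerMembershipProperty"

# Assumption: a finite stand-in for the (infinite) axiomatic triples
# rdf:_1, rdf:_2, ... rdf:type rdfs:ContainerMembershipProperty.
AXIOMATIC = {("rdf:_%d" % i, RDF_TYPE, CMP) for i in range(1, 4)}


def match(pattern, graph):
    """Return variable bindings for one triple pattern by simple matching."""
    results = []
    for triple in graph:
        binding = {}
        ok = True
        for p, t in zip(pattern, triple):
            if p.startswith("?"):
                # A variable must bind consistently within the pattern.
                if binding.setdefault(p, t) != t:
                    ok = False
                    break
            elif p != t:
                ok = False
                break
        if ok:
            results.append(binding)
    return results


def match_with_entailment(pattern, graph):
    """Match against the graph extended with the axiomatic triples."""
    return match(pattern, set(graph) | AXIOMATIC)


pattern = ("?x", RDF_TYPE, CMP)
empty_graph = set()

simple_answers = match(pattern, empty_graph)
entailed_answers = match_with_entailment(pattern, empty_graph)
```

Under these assumptions the answers from match_with_entailment do not 
depend on the graph's contents, which is the sense in which they carry 
no information about the knowledge source being queried.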

I actually think - and I will admit immediately that this is a highly 
personal view - that the SWeb will in fact cause us to extend and 
modify our long-cherished notions of logical meaning, validity of 
inference and so on; in fact that it already has done so. (Datatyped 
literals, for example, are not a typical constituent of classical 
logics.) These notions are all rooted in a logical tradition which 
derives fairly directly from work in the foundations of mathematics, 
and which has almost nothing to do with the issues that arise 
centrally in the semantic web. It has already been extended and 
adapted in such areas as philosophical logic and formal linguistics 
in ways that go well beyond the rather narrow views of what counts as 
'logical' which SWeb discussions have so far used. I am concerned 
that we allow the world to experiment with the SWeb tools, and that 
we logicians should adopt a rather humble stance toward what might 
seem, given our existing methodologies and research guidelines, to be 
inappropriate or 'bad' usages. Almost all of our current 
methodologies for data manipulation, querying, inference and so on 
have been developed in other settings. The semantic web is genuinely 
new, and IMO we have a special responsibility, when defining 
standards, to make sure that we do not impose needless and possibly 
inappropriate restrictions, or 'push' the development of the 
technology in our favorite direction, but rather give the world some 
tools and see what it makes of them. A large part of our task right 
now is simply to give people tools to experiment with. They should be 
tools that are as well designed as we can make them, to be sure: but 
we should expect that people will put them to uses that we have not 
previously considered, and be ready to adapt to that situation, 
rather than try to outlaw it.  (This was the thought behind my use of 
the term "logic police", by the way, which was not intended to be 
mean-spirited: after all, I include myself as one of them. But I 
think we need to be more like those who watch at the edge of the 
demonstration to make sure it doesn't get out of hand, rather than 
taking on the job of directing the traffic. The fact is, we have no 
better idea of what is going to happen than anyone else does.)

>>If SPARQL gets used for a few years and then a completely different 
>>protocol is developed for 'logical' querying, I will be quite 
>>happy: it will have done its job. If SPARQL is warped or delayed 
>>just to provide it with a logically clean extension path, that 
>>warping or delay is doing far more harm than good, IMO.
>
>Our plan is not to delay or to change what has been already done. It 
>is just about giving a nice logic based semantics to what has been 
>done in a way that also opens new possibilities for well founded 
>extensions.

Well, if you insist. It seems to me that it has this nice logic based 
semantics already: it follows from the interpolation lemma in the RDF 
MT document.

>So, I agree with your concerns, but if we work together we can have 
>the cake and eat it too!
>
>>For myself, apart from the small group of logic police, whose 
>>objections I can script in advance, *nobody* I have chatted with 
>>has evinced the slightest interest in any formal semantic issues at 
>>all.
>>They tend to regard such matters as arcane academic baloney.
>
>So, why did you give us the nice RDF-MT document? This is a very 
>strong statement. I guess we are working with the same spirit. And, 
>by the way, it is not nice to use terms like "logic police"; I'd 
>like to have more politeness in our discussion.

It was intended to be light-hearted.

>>>). A virtual graph approach, suitably described, might work as 
>>>well for this audience. But the current document is *not* 
>>>adequate on this front.
>>
>>I wish to know why, and in what regard, it is inadequate. All the 
>>objections I have read so far have been to the effect that there 
>>MUST be a semantic story which makes querying into a species of 
>>entailment-checking: and this view is, IMO, both wrong and (in the 
>>SPARQL context) wrong-headed. It is false for SQL, probably the 
>>most widely used query language ever.
>
>See again above my comments on SQL.

Thanks for them, but they do not address my main point, which was 
that the bulk of SQL is concerned with the filtering that happens 
after the basic matching, and that there is no natural way to think 
of this as inference or entailment, and in fact it can be actively 
misleading to do so. I gather (from the telecon) that you agree with 
this point, however.
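
A rough sketch of the distinction, under the assumption that the rows 
below stand in for bindings already produced by basic matching (the 
data, the field names and the apply_filter helper are all invented for 
illustration): the filtering step is a computation over values in the 
bindings, and nothing in it is naturally read as an entailment check.

```python
# Illustrative only: rows stand in for bindings produced by basic matching.
matched = [
    {"name": "alice", "age": 34},
    {"name": "bob", "age": 27},
    {"name": "carol", "age": 41},
]


def apply_filter(bindings, predicate):
    """Keep only the bindings for which the predicate holds."""
    return [b for b in bindings if predicate(b)]


# Post-matching filter, comparable in spirit to a WHERE clause on a value.
over_30 = apply_filter(matched, lambda b: b["age"] > 30)
names = sorted(b["name"] for b in over_30)
```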

Pat
-- 
---------------------------------------------------------------------
IHMC		(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32502			(850)291 0667    cell
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Tuesday, 20 September 2005 19:41:37 UTC