- From: pat hayes <phayes@ihmc.us>
- Date: Mon, 6 Oct 2003 19:03:47 -0500
- To: w3c-rdfcore-wg@w3.org
- Cc: "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>
This is an attempt to give a brief explanatory account of the issue, subsequent to the LC comments pfps-04 and pfps-05, that Peter and I have been disagreeing about. This explanation has not been reviewed by Peter, who may wish to comment further.

First, some things it isn't. Peter (and others) found many bugs in the details of the entailment rules (formerly called closure rules), and some entire areas of the semantics that they failed to capture; all these details have now been dealt with, either by strengthening or modifying the rules, or by changing the semantics (and, in the case of XML literals, also the syntax) of RDF itself. So what remains isn't a case of 'there's a bug in the rules'; the completeness results in the text are now, I believe, pretty thoroughly checked. The issue that remains is more like a difference of opinion about the best way to present *any* system of rules for RDF and RDFS.

The two styles of presentation can be characterized as a 'classical' rule set vs. a 'reduction' rule set. In the classical style, a system of rules is given which is capable of generating all the valid entailments from a set of assumptions. It is easy to characterize this abstractly: S entails E if and only if you can derive E from S by means of the rules; derivability exactly mirrors entailment. (This is sometimes called 'strong completeness'.) If you want to check simple entailment, use rules for simple entailment; if you want to check RDF entailment, use the RDF rules, and so on. That is the style that Peter likes.

The other style assumes that the rules work by transforming one problem into a different problem: in our case, the RDF(S) rules transform RDF(S)-entailment into simple entailment, so that S rdf-entails E iff you can derive an S' from S by using the rdf rules such that S' simply entails E; but we don't give rules for simple entailment itself. This is the style that is used in the document, and that Peter does not like.

I used the reduction style rather than the classical style because I wanted the rules to be at least suggestive of how one might implement a simple-entailment checker, and a strongly complete set of rules for simple entailment is hopelessly inefficient. The problem is the bnodes. You can take a single triple and validly conclude a huge graph from it, consisting of nothing but bnode-generalizations of it. Of course this isn't a very useful thing to do, but any system of rules which claims to be strongly complete in the classical sense has to be able to draw those kinds of conclusions, so you have rules which, if actually run in the forward direction, would run on pointlessly generating new generalizations forever. One moral to draw, of course, is not to even try to run rules in the forward direction: but in fact, if we just keep bnode-generalization in check a little, the resulting rule sets *can* be run forwards, and will eventually terminate, and are still complete in the second, 'reduction' sense; and there is extant software based on this idea which (I admit, to my surprise) seems to work reasonably well on medium-sized RDF and RDFS graphs. (I had originally thought that I could avoid bnode-generalization altogether, but that turned out not to be possible, since one has to allow generalizations on terms which are introduced by other rules: that was a rule-set bug that was fixed a while ago.)
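To make the termination point concrete, here is a toy forward-closure sketch in Python (the representation and names are mine, purely for illustration; this is only suggestive of the bnode-generalization rules in the document, not a transcription of them). The thing to notice is the allocation discipline: each URI reference or literal gets at most one blank node allocated to it, and a blank node 'generalizes' to itself, so the forward closure is finite:

    # Toy forward closure for bnode-generalization (illustrative only).
    # Triples are tuples of strings; "_:"-prefixed strings are blank nodes.

    def generalization_closure(graph):
        alloc = {}   # term -> the single blank node allocated to it

        def bnode_for(term):
            if term.startswith("_:"):
                return term              # a blank node generalizes to itself
            if term not in alloc:
                alloc[term] = "_:g%d" % len(alloc)
            return alloc[term]

        closure = set(graph)
        changed = True
        while changed:
            changed = False
            for (s, p, o) in list(closure):
                # generalize the subject and the object, one at a time
                for t in ((bnode_for(s), p, o), (s, p, bnode_for(o))):
                    if t not in closure:
                        closure.add(t)
                        changed = True
        return closure

    print(sorted(generalization_closure({("ex:a", "ex:p", "ex:b")})))
    # -> four triples. Allocate a *fresh* blank node on every application
    #    instead, and the loop never terminates: that is the blowup a
    #    strongly complete classical rule set has to be able to produce.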
The general task of being complete when checking simple entailment (i.e. checking the validity of bnode-generalization between arbitrary graphs) is another kind of computational issue, since it is NP-complete, as Jeremy has noticed. On the other hand, does one really need to be complete in this sense in order to be useful? I suspect not, in practice.

The actual differences between the rule sets are apparently very small (to see them, contrast http://www.ihmc.us/users/phayes/RDF_Semantics_LC2.1_NFC.html#simpleRules with http://www.ihmc.us/users/phayes/RDF_Semantics_LC2.1_NFC.html#interplemmaprf , paying close attention to the notational conventions described at http://www.ihmc.us/users/phayes/RDF_Semantics_LC2.1_NFC.html#rules ), but the differences in usability by forward-searching software are considerable.

One other procedural point: the 'reduction' style of presenting the rules has been used consistently in all versions of the RDF semantics document, but was never explicitly addressed by any comments until well after the closure of the first last-call period. Peter and I disagree about this. Peter's view, as I understand it, is that since his LC comments pfps-04 and -05 refer to a lack of 'completeness' in the rules, and since the difference between the styles of presentation can be characterized in terms of strong completeness, this issue can be understood as implicit in those comments. I do not find this argument compelling, but reproduce it here for the sake of completeness. I would, however, note that while logic textbooks often give strongly complete sets of inference rules, the word 'complete' has not been used in this sense for about 30 years in the ATP or logic programming communities: binary resolution, for example, is not strongly complete.

Pat Hayes

--
---------------------------------------------------------------------
IHMC                    (850)434 8903 or (650)494 3973   home
40 South Alcaniz St.    (850)202 4416                    office
Pensacola               (850)202 4440                    fax
FL 32501                (850)291 0667                    cell
phayes@ihmc.us          http://www.ihmc.us/users/phayes
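PS. For anyone who wants to experiment, here is a toy backtracking checker for simple entailment, again in Python with graphs as sets of triples (the code and names are mine, not anything from the document). It searches for a mapping of the blank nodes of E into the terms of S under which every triple of E occurs in S; that subgraph-matching search is exactly what makes the general problem NP-complete, though it tends to be quick on small graphs:

    # Toy simple-entailment checker (illustrative only).

    def is_bnode(term):
        return term.startswith("_:")

    def simply_entails(s_graph, e_graph):
        e_triples = sorted(e_graph)      # fix an order for the search

        def search(i, binding):
            if i == len(e_triples):
                return True              # every triple of E matched
            for s_triple in s_graph:
                new = dict(binding)
                ok = True
                for e_term, s_term in zip(e_triples[i], s_triple):
                    if is_bnode(e_term):
                        # bind the bnode, or check an existing binding
                        if new.setdefault(e_term, s_term) != s_term:
                            ok = False
                            break
                    elif e_term != s_term:
                        ok = False
                        break
                if ok and search(i + 1, new):
                    return True
            return False                 # backtrack

        return search(0, {})

    S = {("ex:a", "ex:p", "ex:b"), ("ex:b", "ex:p", "ex:c")}
    print(simply_entails(S, {("_:x", "ex:p", "_:y"), ("_:y", "ex:p", "ex:c")}))  # True
    print(simply_entails(S, {("_:x", "ex:p", "_:x")}))                           # False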