Re: RDF Semantics / error in RDFS entailment lemma from Pat Hayes on 2005-10-28 (www-rdf-comments@w3.org from October to December 2005)

From: Pat Hayes <phayes@ihmc.us>
Date: Fri, 28 Oct 2005 14:07:15 -0500
To: Herman ter Horst <herman.ter.horst@philips.com>
Cc: www-rdf-comments@w3.org
Message-Id: <p06230914bf88134bf1df@[10.100.0.16]>
>Following the period during which I reviewed preliminary
>versions of the RDF Semantics spec (on behalf of the
>WebOnt WG) I investigated various aspects of the RDF
>semantics, which led to two papers [1] [2] that contain
>information that can be viewed as complementing the spec [3].
>In this message I include several comments relating to [3],
>derived from [1] [2].
>In particular I discovered an error that remains
>in the RDFS entailment lemma and that was missed earlier.

Thank you for discovering this error. I have a 
comment on it, and on your proposed solution, 
below. This also has some bearing on your other 
comments.

>===
>
>A) In [3] decidability and complexity of RDFS entailment are
>not considered.  The completeness proof in [3] does not prove
>decidability of RDFS entailment because the RDFS closure graphs
>used are infinite.
>In [2] it is proved that these infinite RDFS closure graphs
>can be replaced by finite, partial RDFS closure graphs that
>can be computed in polynomial time.
>  This is used in [2]
>to prove that RDFS entailment is decidable, NP-complete, and
>in P if the target RDF graph has no blank nodes.
>
>More specifically, a decision procedure for RDFS entailment
>can be described as follows.
>
>Given: a finite set S of finite RDF graphs, a finite RDF graph G:
>- form a merge M(S) of S
>- add to M(S) all RDF, RDFS axiomatic triples, except those
>including a container membership property rdf:_i
>(this is a finite set of axiomatic triples)
>- for each container membership property rdf:_i that appears
>in either S or G, add the four (RDF, RDFS) axiomatic triples
>that include rdf:_i;  if S and G do not include a container
>membership property, then add the four axiomatic triples
>for just one container membership property
>- apply rule lg to any triple containing a literal, in such
>a way that distinct, well-formed XML literals with the same
>value are associated with the same surrogate blank node
>- apply rules rdf2 and rdfs1 to each triple forming a
>well-formed XML literal or a plain literal, respectively
>- apply rule rdf1, rule gl and the remaining RDFS entailment
>rules until the graph is unchanged
>
>In [2] I call the graph H that is now obtained a partial
>RDFS closure: it is finite (see [2] for the proof).
>
>It can be decided whether S RDFS-entails G by checking whether
>H contains an instance of G as a subset or whether H
>contains an XML-clash (see [2] for the proof).
>
>In this decision procedure, the entailment rules are interpreted
>in a broader way than in [3]: blank nodes are allowed
>in predicate position.  See the following comment.
>
>===
>
>B)  There is an error in the RDFS entailment lemma [3].
>For example, the three RDF triples
>
>p subPropertyOf b .
>b domain u .
>v p w .
>
>(where b is a blank node) RDFS-entail the triple
>
>v type u .
>
>This RDFS-entailment cannot be found with the axiomatic
>triples and entailment rules for RDFS.
>These entailment rules are therefore not complete.

Indeed.

>
>In [2] it is proved that this problem can be solved by extending
>RDF graphs to allow blank nodes in predicate position.
>The origin of the problem seems to be a 'mismatch' between
>the notion of RDF graph and the semantics of RDF.
>The semantics allows blank nodes to refer to properties,
>while RDF graphs do not allow blank nodes in predicate
>position.
>It seems that if blank nodes would be allowed in predicate
>position in RDF graphs, then the abstract syntax (i.e.
>RDF graphs) would fit more closely with the semantics.

Agreed.

There is a similar 'mismatch', noted earlier by 
others, in that RDF syntax does not allow 
literals to occur as subjects, even though an RDF 
subject node may denote a literal value. That 
particular mismatch is what gave rise to the need 
for the lg and gl rules in the semantics 
document, as you know.

I believe that all these problems would be 
eliminated, and the rule sets simplified, if we 
were to define the closure rules on an extended 
RDF syntax consisting of e-graphs, in which both 
these restrictions were removed, i.e. sets of 
e-triples, where an e-triple is <s,p,o> with s 
and o both a URIref, literal or bnode, and p a 
URIref or bnode. (It would be harmless to allow p 
to be a literal, but it is not necessary.) There 
would then be no need for the existential 
generalization rules lg and gl to be used when 
defining a closure, and all consideration of 
existential generalization can be restricted to 
checking simple entailment.

These rules are needed at present only to provide 
a way to map from triples with literal objects:

a b "l" .

to put a 'corresponding' bnode in a subject position

_:l type Literal .
a b _:l .

In e-graphs, this can be done directly:

"l" type Literal .

An XML clash is then simply an assertion of the form

"l"^^XMLLiteral type Literal .

where 'l' is not a legal XML string.

Note that defining the closure rules on this 
extended syntax does not itself require extending 
RDF: the e-graph syntax could be used only 
internally to an inference engine, and theorems 
or deductions filtered to only allow correct RDF 
syntax, or modified by the addition of 
bnode-subject triples to express suitable 
conclusions. However, this generalization of the 
RDF syntax model does seem to be a sensible 
proposal, on independent grounds.

BTW, allow me to express my wholehearted 
endorsement of your pD* semantic style for OWL, 
described in [2], which seems to retain the 
useful parts of the language, removes a number of 
highly unintuitive entailments, and greatly 
improves efficiency. This represents the way that 
OWL should have been defined as an extension of 
RDFS, in my view, and is a sounder basis for 
future work than the current OWL-DL spec.

Pat Hayes

>
>[1] H. ter Horst, Extending the RDFS Entailment Lemma,
>Proceedings 3rd International Semantic Web Conference (ISWC2004),
>Springer LNCS 3298, pp. 77-91.
>
>The following is a revised and extended version of [1]
>
>[2] H. ter Horst, Completeness, decidability and complexity of
>entailment for RDF Schema and a semantic extension involving the
>OWL vocabulary, Journal of Web Semantics 3 (2005) pp. 79-115.
>http://www.sciencedirect.com/science?_ob=IssueURL&_tockey=%23TOC%2312925%232005%23999969997%23606589%23FLA%23&_auth=y&view=c&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=814973d32693889bd7e4cf071c815f01
>
>[3] http://www.w3.org/TR/rdf-mt/


-- 
---------------------------------------------------------------------
IHMC		(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32502			(850)291 0667    cell
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Friday, 28 October 2005 19:07:29 UTC