Re: RDF Semantics: rdfs entailment lemma / terHorst-proof-review from herman.ter.horst@philips.com on 2003-11-20 (www-rdf-comments@w3.org from October to December 2003)

From: <herman.ter.horst@philips.com>
Date: Thu, 20 Nov 2003 12:14:15 +0100
To: pat hayes <phayes@ihmc.us>
Cc: Brian_McBride <bwm@hplb.hpl.hp.com>, Dan Connolly <connolly@w3.org>, www-rdf-comments@w3.org
Message-ID: <OFAD64807A.B42A2E91-ONC1256DE4.003D15BD-C1256DE4.003DCA59@diamond.philips.com>

(I resend the mail that I sent yesterday, since it
was probably not clear that in the example
I meant Class with C.  In addition, I make two
small additions, marked with (A).)

When I went again over the proof of the rdfs entailment
lemma, to double-check all the details, I noticed that
there still seems to be an error in the lemma.
Perhaps I didn't see this earlier because the location
of the new error is extremely close to another, more
conspicuous error that I noticed last week and that
was solved with a change to the treatment of literals [1].

The new error also involves literals.
It can be described in terms of a testcase,
and there is a clear solution to the problem.

Testcase:
graph G containing only the triples
  v p l
  p range Class
where l is a literal.
Triple T:
  b subClassOf l
where b is a blank node.

This gives a counterexample to the rdfs entailment lemma.
According to the semantics, G rdfs-entails T.
   (Proof: Let I be an arbitrary rdfs-interpretation that 
   satisfies G.  Then <I(v),I(l)> in IE(I(p))
   and <I(p),I(Class)> in IE(I(range)).  By the semantic 
   condition on range, we get I(l) in CEXT(I(Class))=IC.
   Since IEXT(I(subClassOf)) is reflexive on IC, it follows
   that <I(l),I(l)> in IEXT(I(subClassOf)).
   Therefore, I satisfies T.)
However, the rdfs entailment rules do not derive T from G.
   (ls derives        v p b    -blank node b allocated to l
    rdfs3 derives     b t Class
    rdfs10 derives    b subClassOf b
    Here it stops.)

Proposed solution:
add new RDFS entailment rule:
"rdfs14" if E contains
   uuu aaa _:nnn where _:nnn is allocated to lll by lg
then add:
   uuu aaa lll
In the construction of the RDFS closure, rule "rdfs14" 
should be freely applicable at all times.

This new rule would be a kind of counterpart to rule lg.
It would also solve the problem with the proof.
The problem with the proof is the following step, near
the end:
   given "D contains a triple sur(SH+A(s)) p sur(SH+A(o))"
   conclude "this is an instance of" the triple spo
   "under the instantiation mapping x ->sur(A(x))".
The problem with this conclusion is that if o is a literal
and if b is the blank node allocated to o by rule lg,
then  sur(SH(o)) is b instead of o.
So this is *not* an instance in this case.  Calling the
instantiation mapping M, it can only be concluded that
the triple
   M(s) p b 
is in D.  When rule "rdfs14" would be present, we have
   M(s) p o
in D, as desired.

I confirm that I consider the rdfs entailment lemma to be
proved when this change would be made.

(A) The validity of "rdfs14" is clear.  There is no need
to make such an addition for the rdf entailment lemma.

===

I am afraid that with the comment above I express the
need to re-open the LC2 topic 'terHorst-proof-review'.
I hope that this error could be corrected before the 
PR publication.  Fortunately, it could easily be 
handled with the indicated solution.

===

I should also mention that the change to the treatment of
literals that was introduced last week seems to have made 
the Skolemization lemma false.

Since plain literals now also count as vocabulary, and
since plain literals cannot be "freely interpreted",
the interpretation I' cannot be constructed in general
in the way indicated in the proof.

There is a straightforward solution to this problem, by
sharpening the definition of Skolemization, for example
in the following way:
A Skolemization of an RDF graph G is an instance of
G with respect to an 1-1 instance mapping that maps
each blank node in G to a URI reference that does not appear
in G.
(A) (I omitted the word ground before instance since 
this is implied by the rest of the definition.)

The Skolemization lemma plays only a peripheral role
in the RDF Semantics document, since it is not on the
path to the main completeness results.
Still, the lemma is relevant since, as the document notes
"Skolemization is routinely used in automatic inference
systems".


Herman ter Horst

[1] http://lists.w3.org/Archives/Public/www-rdf-comments/2003OctDec/0161.html

Received on Thursday, 20 November 2003 06:15:05 UTC