Re: RDF Semantics: a partial review from pat hayes on 2003-10-17 (www-rdf-comments@w3.org from October to December 2003)

From: pat hayes <phayes@ihmc.us>
Date: Fri, 17 Oct 2003 18:20:29 -0400
To: herman.ter.horst@philips.com
Cc: www-rdf-comments@w3.org
Message-Id: <p06001f09bbb593f2701c@[192.168.0.38]>
>Sorry - I sent this mail to rdfcore instead of rdfcomments.
>Resending it below, I make a few additions, marked by (A)
>and based on further reading in the document.
>
>
>Partial review of yesterday's version of the RDF Semantics
>document at
>   http://www.ihmc.us/users/phayes/RDF_Semantics_LC2.4.html
>
>Although I have not yet looked in detail at the RDFS entailment
>lemma proof, it seems safe to say that the new proof appendix
>is an improvement.
>(A) It is very nice that there is now such a short proof of
>the interpolation lemma, that the proofs of the rdf and
>rdfs entailment lemmas follow a more direct pattern than
>before, and that more details are provided for the rdf and
>rdfs entailment lemmas.
>
>==
>
>The current definition of simple entailment reads:
>>a set S of RDF graphs (simply) entails a graph E if every
>>interpretation [of the vocabulary of (S union E)] which
>>satisfies every member of S also satisfies E
>(and similarly for rdf and rdfs entailment).
>The part between [] is added in the LC2 version.
>Without the part between [], transitivity of entailment
>is trivial: if S entails H entails G, then S entails G.
>
>I assume that each type of entailment (simple, rdf, rdfs)
>is still intended to be transitive.

That would certainly be best. See below.

>This is not made explicit in the document,
>and there seem to be several complications in the proofs
>that result from the addition between [], and which are
>not noticed in the text.

<snip>

>Wouldn't it be much simpler to go back to the LC definition of
>entailment, i.e. without the addition between []?

Yes, it would.

The history of the reasons for having this rather odd vocabulary 
restriction are complicated, but essentially have to do with the 
desire to keep the 'closure' of a finite graph finite, so that a 
simple rule engine can work by applying the rules in a forward 
direction to exhaustion. However, this is really now no longer of 
interest since the closures are only defined as mathematical 
abstractions in the completeness proofs.

So I will make the following changes:

remove the vocabulary condition on entailment, as you suggest, ie the 
[ ] words (in four places);

add the word 'finite' to the statements of the entailment lemmas, so 
that they read: "...iff there is a finite graph which can be 
derived...."

in the proofs, remove the words [which contain vocabulary from 
(vocab(S) union vocab(E) union crdfV)] from the definition of rdf(s) 
-closure., and the bracketted comment about vocabularies.

and add the following to the end of the proof:

'....so a subgraph of C is an instance of E, so C simply entails E. By the
compactness lemma, there is a finite subgraph of C which simply 
entails E and therefore satisfies the conditions of the lemma.  QED "

and similarly for the rdfs lemma (the wording is more delicate there).


>Moreover, in order to prevent complicated interactions with
>axiomatic triples and their vocabulary, at several points the
>proofs seem to become simpler when each rdf-interpretation is
>assumed to interpret all of rdfV (and similarly for rdfs-interpretations).
>(A) I should explain this further.  When the RDF(S) closure
>of a graph G is constructed by first adding *all* RDF(S) axiomatic
>triples, and then by applying the entailment rules, then
>the analogue of the above lemma for RDF(S) entailment
>is immediately obtained from the RDF(S) Herbrand interpretations,
>assuming that the part between [] is omitted from the
>definition of entailment
>(the entailment rules do not introduce further vocabulary):
>Lemma. If H RDF(S) entails G, then
>        Voc(G) subsetOf Voc(H) u rdfV ( u rdfsV ).
>RDF(S) entailment is transitive: statement similar to that
>above.
>(For rdfs-entailment, the assumption that the RDFS closure
>of G does not contain an XML clash should be included.)

Since the entailment definitions now make no explicit reference to 
vocabularies, transitivity is trivial, so I do not see why there is 
any need to state or prove any further lemmas here.  The lemma only 
says that a graph is derivable by the rules which entails E, so if 
the graph were to contain any extra vocabulary then the entailment 
from that graph would encompass that extra vocabulary by definition 
of 'entail'.

Am I missing something?

>==
>
>Near the end of the proof of the RDF entailment lemma, it
>seems that a conclusion is not justified:
>
>Given a mapping A from blank(E) to IRRH such that
>RH+A satisfies every triple spo in E, it does
>not follow that <RH+A(s)',RH+A(o)'> is in IERH(p).
>It does follow that <RH+A(s),RH+A(o)> is in IERH(p).
>To conclude the proof, a function A':blank(E)->IRRH
>can be defined by A'(b)=b' if A(b)=xml(l) for some
>well-typed XML literal l, where the blank node b'
>is allocated to l, and by A'(b)=A(b) otherwise.

That is the mapping I intended to convey. Perhaps the prime notation 
is ambiguous. Suppose it read:

.... [RH+A] satisfies every triple
s p o .
in E, i.e. IEXTRH(p) contains <([RH+A](s))',([RH+A](o))'> ....

would that be clearer? I have made this change, anyway.

>Then, for each triple spo in E, I believe that by
>considering the possible cases for s and o (i.e.,
>whether they are URI references, well-typed
>XML literals or other literals, or allocated or
>non-allocated blank nodes)
>it follows that
><RH+A'(s)',RH+A'(o)'> is in IERH(p).
>This gives the desired subgraph of C that is an instance
>of E.
>
>==
>
>I do not see the reason to make use of the literal generalization
>rule.
>The proof of the rdf-entailment lemma uses rule lg only, in the
>construction of C, for triples with well-typed XML literals
>in object position, and for this case the same newly generated triple
>can also be obtained with rule rdf2, which would then generate two
>triples.
>Of course, the if side of the lemma doesn't need it either.
>Or am I overlooking something?

No, you are correct. The only cases of rule lg that are needed in the 
RDF proof are those which support the rule rdf2. I have phrased the 
rules this way as it seemed more elegant to have rules with a single 
conclusion; and the RDFS case requires all literal generalizations to 
be done, as even ill-typed literals must be mappable into subject 
position in order to exposed.

>(A) I now see that the RDFS entailment lemma uses lg in a way
>not subsumed by other rules.  It seems that the RDF
>entailment lemma could omit the use of lg.
>
>==
>
>In the proof of the RDF entailment lemma, it would be appropriate
>to say explicitly what the domain and range of the mapping
>x |-> x' is:
>nodes(C) respectively IR.

True, I will insert appropriate text.

>It seems that the last case of this definition ("if x is
>allocated to any other literal lll by rule lg then x' is
>lll itself") could be omitted, since lg (or rdf2, see previous
>remark) is only applied to triples with well-typed XML literals
>in object position.

True. That was left-over from a previous edit. I will remove it.

>
>==
>
>In the definition of the RDF Herbrand interpretation RH,
>in the definition of IEXTRH(x), it seems that the 'otherwise'
>case can be omitted, as well as the text between brackets
>that follows this table, since the domain of IEXTRH is
>IPRH.  In other words, there is no 'otherwise'.

Ah, yes, of course. That was left over from an earlier version where 
this condition was needed. I will remove this.

>==
>
>It seems that in the definition of RH, the set IR needs to
>include only the URI references in node position.

True, but the current definition is simpler and is harmless.

>==
>
>The table in Section 4 with the RDFS semantic conditions seems
>to have an error:
>If x is in IC, then it should be
><x,I(rdfs:Resource)> is in IEXT(I(rdfs:subClassOf))
>instead of
><x,IR> is in IEXT(I(rdfs:subClassOf))

True, thanks for noticing that. In fact Graham Klyne had already 
noted it but I neglected to make the correction.

>==
>
>The proof of the interpolation lemma uses a subscript Herb(G)
>which should be Herb(S).
>
>==
>
>Proof of rdf entailment lemma: instead of
>"RDF subinterpretation", it should be RDF interpretation.
>

Thanks, now fixed.

The corrections can be viewed in the version at

  http://www.ihmc.us/users/phayes/RDF_Semantics_LC2.5.html

Does this adequately respond to your comments?

Pat

-- 
---------------------------------------------------------------------
IHMC	(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32501			(850)291 0667    cell
phayes@ihmc.us       http://www.ihmc.us/users/phayes
Received on Friday, 17 October 2003 18:19:05 UTC