- From: pat hayes <phayes@ihmc.us>
- Date: Thu, 6 Nov 2003 15:57:30 -0600
- To: herman.ter.horst@philips.com
- Cc: www-rdf-comments@w3.org
>This is a review of part of the RDF Semantics
>document, editorial version LC2.5.
>In this message I mainly focus on the rdfs entailment lemma.
>
>The proof of this lemma is based on the claim that
>the RDFS Herbrand interpretation of an RDF graph is an
>RDFS interpretation.
>This claim seems to be false: the first condition for
>RDF interpretations is not satisfied.
>In order to show this, note that this condition amounts,
>in this case, to the equivalence
> v in IP iff <v,Property> in IEXT(type).
>Suppose that the graph G has triples
> v p l
>and
> p range Property
>where l is a plain literal.
>By rule lg, the RDFS closure D of G contains the triple
> v p b
>where b is allocated to l and where b = sur(l).
>By rule rdfs3, D contains the triple
> b type Property = sur(l) type Property.
>Therefore, <l,Property> in IEXT(type).
>However, we cannot have l in IP, since that would mean
>that D contains the triple
> l type Property.
Yes. The SH definition of IP should refer to the surrogate, just as
the IEXT definition does, ie should read:
IP<SH> = {x: D contains a triple sur(x) rdf:type rdf:Property . }
Then l can (indeed will) be in IP. That is, all the semantic
conditions should be 'read off' from the graph via the surrogates.
I will make this change.
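
A minimal Python sketch of the revised IP<SH> definition, applied to the example graph from the comment above. The tuple encoding of triples, the blank node name _:b1 and the helper names are assumptions made only for illustration; D here is just the fragment of the closure discussed above, not the full RDFS closure.

```python
# Triples are (subject, predicate, object) tuples of strings.
# Literals are written with surrounding double quotes (illustrative encoding).
RDF_TYPE = "rdf:type"
RDF_PROPERTY = "rdf:Property"

def ip_sh(closure, sur, universe):
    """IP<SH> = {x : D contains the triple  sur(x) rdf:type rdf:Property}."""
    return {x for x in universe if (sur(x), RDF_TYPE, RDF_PROPERTY) in closure}

literal = '"l"'
bnode = "_:b1"  # hypothetical name for the blank node allocated to "l" by rule lg

# Fragment of the closure D of G = { v p "l" ,  p rdfs:range rdf:Property }.
D = {
    ("v", "p", literal),
    ("p", "rdfs:range", RDF_PROPERTY),
    ("v", "p", bnode),                 # added by rule lg
    (bnode, RDF_TYPE, RDF_PROPERTY),   # added by rule rdfs3
}

def sur(x):
    # Surrogate of the literal is its allocated blank node; identity otherwise.
    return bnode if x == literal else x

universe = {"v", "p", literal, RDF_PROPERTY}
ip = ip_sh(D, sur, universe)
```

With this reading the literal lands in IP even though D contains no triple with the literal in subject position.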
>===
>
>It should be made explicit what the domain and range of the
>function sur are: I assume that these sets are both IR.
The domain is IR and the range is a subset of IR consisting of
vocabulary items. So the range is a subset of IR.
>When this assumption is made explicit, there seems to be a
>circularity in the definition of LV for the RDFS Herbrand
>interpretation:
>the definition of LV depends on sur, the definition of sur
>depends on IR, the definition of IR depends on LV.
>In view of this circularity, the definition of LV becomes
>incomprehensible. I believe that the definition of LV should
>be made explicit.
OK, though I don't accept that it is ambiguous at present.
>From the given definition, I would guess that the intention
>is that LV is the union of five sets:
> strings
> pairs of strings and language tags
ie plain literals, yes...
> XML values of well-typed XML literals in D
> {v in voc(D): the triple v type Literal is in D }
> {v in voc(D): v a typed, non-XML literal such that
> b type Literal is in D, where b is the blank node allocated
> to v by rule lg }
I do not think that this way of phrasing it is appropriate. The central
intuition is that in a simple Herbrand interpretation, IR consists of
the vocabulary items (including bnodes) in the graph, and the
interpretation is simply read off the graph. Here we need to modify
this by treating some vocabulary items as surrogates for the more
special values required by the semantic conditions, and adding
required items (all plain literals) which may not be in the graph;
otherwise, the construction should mirror the simple Herbrand
construction.
I propose to rephrase the definition as follows, modeled on the
definition used in the RDF lemma:
-------
If lll is a well-formed XML literal, let xml(lll) be the XML value of
lll; and for each XML value xml(lll) of any well-formed XML literal
lll in D, let sur(xml(lll)) be the blank node allocated to lll by
rule lg; for any other literal lll in D, let sur(lll) be the blank
node allocated to lll by rule lg, and extend sur to be the identity
mapping on URI references and blank nodes in D. The domain of this
mapping is the universe IR<SH>, defined below, and the range contains
only URI references and blank nodes which occur in D.
-------
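
The rephrased definition of sur can be sketched in Python as a dictionary. This is only an illustration under stated assumptions: the function name, the string encoding of literals and the tuple used to stand for an XML value are all hypothetical, and the sketch covers only the entries that arise from D.

```python
def make_sur(allocated, xml_value, names):
    """Build sur as a dict.

    allocated: literal in D -> blank node allocated to it by rule lg
    xml_value: well-formed XML literal -> its XML value (hypothetical encoding)
    names:     URI references and blank nodes occurring in D
    """
    sur = {}
    for lit, bnode in allocated.items():
        # A well-formed XML literal contributes its XML value as the key;
        # any other literal contributes itself.
        key = xml_value[lit] if lit in xml_value else lit
        sur[key] = bnode
    for n in names:
        sur[n] = n  # identity on URI references and blank nodes
    return sur

xml_lit = '"<a/>"^^rdf:XMLLiteral'
allocated = {xml_lit: "_:b1", '"foo"': "_:b2"}
xml_value = {xml_lit: ("xml", "<a/>")}
names = {"ex:p", "_:b1", "_:b2", "_:x"}
sur = make_sur(allocated, xml_value, names)
```

Every value of the mapping is a URI reference or blank node occurring in D, which is the point of the last sentence of the definition.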
>===
>
>"Define B(x) as before, then clearly [SH+B] satisfies D ..."
>There seems to be a problem with this conclusion.
>Making this explicit, it seems that B:blank(D)->IR
>needs to be defined by
>B(v)=xml(l) if v is a blank node allocated to the well-formed
>XML literal l,
>B(v)=l if v is a blank node allocated to a typed, non-XML
>literal l,
>otherwise B(v)=v.
The wording is careless, forgive me. The intention was that the
second case would include ALL other literals, ie non-well-typed XML,
other typed and plain, if v has been allocated to that literal. I
will spell this out more carefully:
-----
Define B(x) as follows: if x is a blank node allocated to a
well-formed XML literal lll in D then B(x) = xml(lll); if it is
allocated to any other literal lll in D then B(x)=lll; and otherwise
B(x)=x.
-----
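
As a sketch only, the three cases of B can be written in Python as follows. The encodings of literals and XML values and the name make_B are assumptions for illustration, not notation from the document.

```python
def make_B(allocated, xml_value):
    """Return B for a given allocation of blank nodes to literals.

    allocated: literal in D -> blank node allocated to it by rule lg
    xml_value: well-formed XML literal -> its XML value (hypothetical encoding)
    """
    inverse = {bnode: lit for lit, bnode in allocated.items()}

    def B(x):
        if x in inverse:
            lit = inverse[x]
            # Well-formed XML literal: B(x) = xml(lll); any other literal: B(x) = lll.
            return xml_value.get(lit, lit)
        return x  # otherwise B(x) = x

    return B

xml_lit = '"<a/>"^^rdf:XMLLiteral'
allocated = {xml_lit: "_:b1", '"foo"': "_:b2"}
xml_value = {xml_lit: ("xml", "<a/>")}
B = make_B(allocated, xml_value)
```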
I am not sure if this point solves your next comment, because I am
not sure what the force of the comment is.
>(The second case is not exactly as before, but seems to be
>needed to develop a complete proof of the condition
>LV = ICEXT(Literal).)
>
>Given a triple vpw in D, rule rdf1 shows that D contains the
>triple
> p type Property
>so that p in IP. In order to prove that SH+B satisfies vpw,
>i.e. that <SH+B(v),SH+B(w)> in IE(p), it is sufficient to
>prove that D contains the triple
> * sur(SH+B(v)) p sur(SH+B(w)).
>Note that
> sur(SH+B(v)) = v (when v in nodes(D) - literals)
> sur(SH+B(v)) = b (when b is the blank node allocated to
> v in nodes(D) intersection literals)
>(this can be checked for each of many different cases).
>So it can be concluded that D contains the triple * when
>lg can be applied in each step of the construction of D.
>However, rule lg is only used as the first step.
What is the problem? Once the surrogate blank node has been
introduced, all the entailment rules apply to triples containing it
whenever they would have applied to the similar triple containing the
literal (and of course to some new triples which would have been
illegal using the literal in subject position). So if any triple
containing a literal is in D, then so is the similar triple
containing its surrogate. So, in effect, once the surrogate has been
introduced and the rule applied in every possible way once (so as to
reproduce the entire sub-graph of all triples which contain that
literal, with the literal replaced by the surrogate), the literals
can in effect be ignored completely, and the closure can proceed
using only blank nodes and URI references. Provided we take care
to then map allocated blank nodes back to their appropriate literal
values, and all other blank nodes to themselves, everything works out
fine.
Your point would be well taken if there were rules which introduced
new literals which did not occur in the original graph, but there are
no such rules.
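
A small Python sketch may make the argument concrete. It applies rule lg once and then closes under a single further rule, rdfs7 (subproperty inheritance), chosen here only because it can derive a new triple containing the literal after lg has already been used. The tuple encoding, the blank node naming and the function names are assumptions for illustration, and this is not the full RDFS closure.

```python
def is_literal(term):
    # Illustrative encoding: literals start with a double quote.
    return term.startswith('"')

def rule_lg(graph):
    """Apply lg once: for each triple s p lit, add s p b, b allocated to lit."""
    alloc = {}
    out = set(graph)
    for s, p, o in sorted(graph):
        if is_literal(o):
            b = alloc.setdefault(o, f"_:lg{len(alloc)}")
            out.add((s, p, b))
    return out, alloc

def rule_rdfs7(graph):
    """aaa rdfs:subPropertyOf bbb ,  uuu aaa yyy  =>  uuu bbb yyy."""
    sub = {(s, o) for s, p, o in graph if p == "rdfs:subPropertyOf"}
    return {(u, b, y) for a, b in sub for u, p, y in graph if p == a}

def close(graph):
    d, alloc = rule_lg(graph)          # lg is used only as the first step
    while True:
        new = rule_rdfs7(d) - d
        if not new:
            return d, alloc
        d |= new

G = {("ex:v", "ex:p", '"l"'), ("ex:p", "rdfs:subPropertyOf", "ex:q")}
D, alloc = close(G)

# Every triple of D containing a literal has its surrogate counterpart in D.
mirrored = all((s, p, alloc[o]) in D for s, p, o in D if is_literal(o))
```

Here the triple ex:v ex:q "l" is derived only after lg has been applied, yet its counterpart with the surrogate is derived by the same rule from the surrogate triple.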
>It seems that this problem would be solved when rule lg can
>always be used in the construction of D
>
>===
>
>There seem to be problems with the proof of the condition
>IR = ICEXT(Resource).
>It only needs to be proved that if x is in IR, then
><x,Resource> in IE(type), as the opposite is trivial.
>(Note that the document states the opposite.
Good point, I will change that.
>Note also that for the proof of LV=ICEXT(Literal), the
>document only states an if statement instead of two
>statements.)
I will change that also. The case of interest, again, is the 'if'
case, since the other case is trivial by construction.
>However, there are many cases. The proof is not clear.
Can you say which parts are unclear? I believe that the table covers
all the cases that are syntactically possible: URIs in subject,
predicate and object position: bnodes in subject and object, and
literals in object (2 cases). I have rephrased the entries in the
table slightly so as to make the connection with the notation used in
the proof more evident.
>
>It seems that the proof uses and needs to use the triple
>** Literal subClassOf Resource,
>which however is not an axiomatic triple, to my surprise.
>Shouldn't this triple ** be made into an axiomatic triple?
It follows from the use of Literal as a range, so its being a Class,
so its being a subClass of Resource by rdfs8. The derivation is
given in the table (first row, second sub-row).
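
One possible derivation of this kind, sketched in Python and not necessarily identical to the one in the table: it assumes that the axiomatic triples include rdfs:range rdfs:range rdfs:Class and rdfs:comment rdfs:range rdfs:Literal, and that rdfs3 and rdfs8 have the forms given in the comments below. The tuple encoding and function names are illustrative.

```python
# Two axiomatic triples assumed for this sketch.
AXIOMS = {
    ("rdfs:range", "rdfs:range", "rdfs:Class"),
    ("rdfs:comment", "rdfs:range", "rdfs:Literal"),
}

def rdfs3(g):
    """aaa rdfs:range xxx ,  uuu aaa vvv  =>  vvv rdf:type xxx."""
    ranges = {(a, x) for a, p, x in g if p == "rdfs:range"}
    return {(v, "rdf:type", x) for a, x in ranges for u, p, v in g if p == a}

def rdfs8(g):
    """uuu rdf:type rdfs:Class  =>  uuu rdfs:subClassOf rdfs:Resource."""
    return {(u, "rdfs:subClassOf", "rdfs:Resource")
            for u, p, o in g if p == "rdf:type" and o == "rdfs:Class"}

# Literal is used as a range, so it is a Class (rdfs3) ...
step1 = AXIOMS | rdfs3(AXIOMS)
# ... and so it is a subClass of Resource (rdfs8).
step2 = step1 | rdfs8(step1)
```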
>
>The last lines of the four proof parts consist of the
>triple
> x type Resource
>If x is a URI this suffices to prove that
><x,Resource> in IE(type), however when x is a blank node
>or a literal this is not sufficient.
>
It cannot be a literal, in subject position. Why is it not sufficient
for a blank node? The argument is exactly similar to the standard
case for a simple Herbrand interpretation: the surrogate for a blank
node is itself. (Its potential role as itself being a surrogate for a
literal is irrelevant at this point.)
Pat
PS. The changes described above are now visible in the copy on my website:
http://www.ihmc.us/users/phayes/RDF_Semantics_LC2.5.html
I have also added an explanatory paragraph just before the proof of
the RDF entailment lemma, and some explanatory prose in the proof of
the RDFS entailment lemma concerning the role of literal surrogates.
Please let me know if this response adequately deals with the issues you raise.
Pat
>
>
>Herman ter Horst
--
---------------------------------------------------------------------
IHMC (850)434 8903 or (650)494 3973 home
40 South Alcaniz St. (850)202 4416 office
Pensacola (850)202 4440 fax
FL 32501 (850)291 0667 cell
phayes@ihmc.us http://www.ihmc.us/users/phayes
Received on Thursday, 6 November 2003 16:57:33 UTC