- From: pat hayes <phayes@ihmc.us>
- Date: Thu, 6 Nov 2003 15:57:30 -0600
- To: herman.ter.horst@philips.com
- Cc: www-rdf-comments@w3.org
>This is a review of part of the RDF Semantics
>document, editorial version LC2.5.
>In this message I mainly focus on the rdfs entailment lemma.
>
>The proof of this lemma is based on the claim that
>the RDFS Herbrand interpretation of an RDF graph is an
>RDFS interpretation.
>This claim seems to be false: the first condition for
>RDF interpretations is not satisfied.
>In order to show this, note that this condition amounts,
>in this case, to the equivalence
>  v in IP iff <v,Property> in IEXT(type).
>Suppose that the graph G has triples
>  v p l
>and
>  p range Property
>where l is a plain literal.
>By rule lg, the RDFS closure D of G contains the triple
>  v p b
>where b is allocated to l and where b = sur(l).
>By rule rdfs3, D contains the triple
>  b type Property = sur(l) type Property.
>Therefore, <l,Property> in IEXT(type).
>However, we cannot have l in IP, since that would mean
>that D contains the triple
>  l type Property.

Yes. The SH definition of IP should refer to the surrogate, just as the IEXT definition does, ie should read:

IP<SH> = {x: D contains a triple sur(x) rdf:type rdf:Property . }

then l can (indeed will) be in IP. That is, all the semantic conditions should be 'read off' from the graph via the surrogates. I will make this change.

>===
>
>It should be made explicit what the domain and range of the
>function sur are: I assume that these sets are both IR.

The domain is IR and the range is a subset of IR consisting of vocabulary items. So the range is a subset of IR.

>When this assumption is made explicit, there seems to be a
>circularity in the definition of LV for the RDFS Herbrand
>interpretation:
>the definition of LV depends on sur, the definition of sur
>depends on IR, the definition of IR depends on LV.
>In view of this circularity, the definition of LV becomes
>incomprehensible. I believe that the definition of LV should be made explicit.

OK, though I don't accept that it is ambiguous at present.
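As a rough illustration of the counterexample and of the proposed change to IP, here is a minimal Python sketch. It is not the construction from the document: it applies only rules lg and rdfs3, once each, rather than computing the full RDFS closure, and the names `Literal`, `apply_lg`, `apply_rdfs3`, `ip_old` and `ip_new`, as well as the `ex:` vocabulary and the blank node labels, are assumptions made up for the example.

```python
RDF_TYPE = "rdf:type"
RDFS_RANGE = "rdfs:range"
RDF_PROPERTY = "rdf:Property"


class Literal(str):
    """Hypothetical marker type: a plain literal, as opposed to a URI or bnode."""


def apply_lg(graph):
    """Rule lg: copy each triple with a literal object, replacing the literal
    by the blank node allocated to it. Returns the new graph and the allocation."""
    allocated = {}
    out = set(graph)
    for s, p, o in sorted(graph):
        if isinstance(o, Literal):
            b = allocated.setdefault(o, f"_:b{len(allocated)}")
            out.add((s, p, b))
    return out, allocated


def apply_rdfs3(graph):
    """Rule rdfs3: from (p rdfs:range c) and (s p o) infer (o rdf:type c),
    for o a URI reference or blank node (literals cannot be subjects)."""
    ranges = [(s, o) for s, p, o in graph if p == RDFS_RANGE]
    out = set(graph)
    for prop, cls in ranges:
        for s, p, o in graph:
            if p == prop and not isinstance(o, Literal):
                out.add((o, RDF_TYPE, cls))
    return out


def ip_old(graph, universe):
    """IP read off the graph directly, without surrogates."""
    return {x for x in universe if (x, RDF_TYPE, RDF_PROPERTY) in graph}


def ip_new(graph, universe, sur):
    """IP read off the graph via the surrogate mapping sur."""
    return {x for x in universe if (sur(x), RDF_TYPE, RDF_PROPERTY) in graph}


# The graph G from the review: v p l and p range Property, l a plain literal.
l = Literal("hello")
G = {("ex:v", "ex:p", l), ("ex:p", RDFS_RANGE, RDF_PROPERTY)}

D, allocated = apply_lg(G)
D = apply_rdfs3(D)


def sur(x):
    # Surrogate: the allocated blank node for a literal, identity otherwise.
    return allocated.get(x, x)


universe = {"ex:v", "ex:p", l}
print(ip_old(D, universe))
print(ip_new(D, universe, sur))
```

Under these assumptions, the triple `sur(l) rdf:type rdf:Property` is in `D`, so `l` is in `ip_new` but not in `ip_old`, which is the mismatch the review points out and the surrogate-based definition removes.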
>From the given definition, I would guess that the intention
>is that LV is the union of five sets:
>  strings
>  pairs of strings and language tags

ie plain literals, yes...

>  XML values of well-typed XML literals in D
>  {v in voc(D): the triple v type Literal is in D }
>  {v in voc(D): v a typed, non-XML literal such that
>  b type Literal is in D, where b is the blank node allocated
>  to v by rule lg }

I do not think that this way of phrasing it is appropriate. The central intuition is that in a simple Herbrand interpretation, IR consists of the vocabulary items (including bnodes) in the graph, and the interpretation is simply read off the graph. Here we need to modify this by treating some vocabulary items as surrogates for the more special values required by the semantic conditions, and adding required items (all plain literals) which may not be in the graph; otherwise, the construction should mirror the simple Herbrand construction. I propose to rephrase the definition as follows, modeled on the definition used in the RDF lemma:

-------
If lll is a well-formed XML literal, let xml(lll) be the XML value of lll; and for each XML value xml(lll) of any well-formed XML literal lll in D, let sur(xml(lll)) be the blank node allocated to lll by rule lg; for any other literal lll in D, let sur(lll) be the blank node allocated to lll by rule lg, and extend sur to be the identity mapping on URI references and blank nodes in D. The domain of this mapping is the universe IR<SH>, defined below, and the range contains only URI references and blank nodes which occur in D.
-------

>===
>
>"Define B(x) as before, then clearly [SH+B] satisfies D ..."
>There seems to be a problem with this conclusion.
>Making this explicit, it seems that B:blank(D)->IR
>needs to be defined by
>B(v)=xml(l) if v is a blank node allocated to the well-formed
>XML literal l,
>B(v)=l if v is a blank node allocated to a typed, non-XML
>literal l,
>otherwise B(v)=v.

The wording is careless, forgive me.
The intention was that the second case would include ALL other literals, ie non-well-typed XML, other typed and plain, if v has been allocated to that literal. I will spell this out more carefully:

-----
Define B(x) as follows: if x is a blank node allocated to a well-formed XML literal lll in D then B(x) = xml(lll); if it is allocated to any other literal lll in D then B(x)=lll; and otherwise B(x)=x.
-----

I am not sure if this point solves your next comment, because I am not sure what the force of the comment is.

>(The second case is not exactly as before, but seems to be
>needed to develop a complete proof of the condition
>LV = ICEXT(Literal).)
>
>Given a triple vpw in D, rule rdf1 shows that D contains the
>triple
>  p type Property
>so that p in IP. In order to prove that SH+B satisfies vpw,
>i.e. that <SH+B(v),SH+B(w)> in IE(p), it is sufficient to
>prove that D contains the triple
>  * sur(SH+B(v) p sur(SH+B(w)).
>Note that
>  sur(SH+B(v)) = v (when v in nodes(D) - literals)
>  sur(SH+B(v)) = b (when b is the blank node allocated to
>  v in nodes(D) intersection literals)
>(this can be checked for each of many different cases).
>So it can be concluded that D contains the triple * when
>lg can be applied in each step of the construction of D.
>However, rule lg is only used as the first step.

What is the problem? Once the surrogate blank node has been introduced, all the entailment rules apply to triples containing it whenever they would have applied to the similar triple containing the literal (and of course to some new triples which would have been illegal using the literal in subject position). So if any triple containing a literal is in D, then so is the similar triple containing its surrogate.
So, in effect, once the surrogate has been introduced and the rule applied in every possible way once (so as to reproduce the entire sub-graph of all triples which contain that literal, with the literal replaced by the surrogate), the literals can be ignored completely, and the closure can proceed using only blank nodes and URI references. Provided we then take care to map allocated blank nodes back to their appropriate literal values, and all other blank nodes to themselves, everything works out fine. Your point would be well taken if there were rules which introduced new literals which did not occur in the original graph, but there are no such rules.

>It seems that this problem would be solved when rule lg can
>always be used in the construction of D
>
>===
>
>There seem to be problems with the proof of the condition
>IR = ICEXT(Resource).
>It only needs to be proved that if x is in IR, then
><x,Resource> in IE(type>, as the opposite is trivial.
>(Note that the document states the opposite.

Good point, I will change that.

>Note also that for the proof of LV=ICEXT(Literal), the
>document only states an if statement instead of an two
>statements.)

I will change that also. The case of interest, again, is the 'if' case, since the other case is trivial by construction.

>However, there are many cases. The proof is not clear.

Can you say which parts are unclear? I believe that the table covers all the cases that are syntactically possible: URIs in subject, predicate and object position; bnodes in subject and object; and literals in object (2 cases). I have rephrased the entries in the table slightly so as to make the connection with the notation used in the proof more evident.

>
>It seems that the proof uses and needs to use the triple
>** Literal subClassOf Resource,
>which however is not an axiomatic triple, to my surprise.
>Shouldn't this triple ** be made into an axiomatic triple?
It follows from the use of Literal as a range, so its being a Class, so its being a subClass of Resource by rdfs8. The derivation is given in the table (first row, second sub-row).

>
>The last lines of the four proof parts consist of the
>triple
>  x type Resource
>If x is a URI this suffices to prove that
><x,Resource> in IE(type>, however when x is a blank node
>or a literal this is not sufficient.
>

It cannot be a literal, in subject position. Why is it not sufficient for a blank node? The argument is exactly similar to the standard case for a simple Herbrand interpretation: the surrogate for a blank node is itself. (Its potential role as itself being a surrogate for a literal is irrelevant at this point.)

Pat

PS. The changes described above are now visible in the copy on my website:

http://www.ihmc.us/users/phayes/RDF_Semantics_LC2.5.html

I have also added an explanatory paragraph just before the proof of the RDF entailment lemma, and some explanatory prose in the proof of the RDFS entailment lemma concerning the role of literal surrogates.

Please let me know if this response adequately deals with the issues you raise.

Pat

>
>
>Herman ter Horst

--
---------------------------------------------------------------------
IHMC                    (850)434 8903 or (650)494 3973   home
40 South Alcaniz St.    (850)202 4416   office
Pensacola               (850)202 4440   fax
FL 32501                (850)291 0667   cell
phayes@ihmc.us          http://www.ihmc.us/users/phayes
Received on Thursday, 6 November 2003 16:57:33 UTC