- From: <herman.ter.horst@philips.com>
- Date: Thu, 13 Nov 2003 17:20:04 +0100
- To: pat hayes <phayes@ihmc.us>
- Cc: Dan Connolly <connolly@w3.org>, www-rdf-comments@w3.org
>>>>This is a review of part of the RDF Semantics >>>>document, editorial version LC2.5. >>>>In this message I mainly focus on the rdfs entailment lemma. >>>> >>>>The proof of this lemma is based on the claim that >>>>the RDFS Herbrand interpretation of an RDF graph is an >>>>RDFS interpretation. >>>>This claim seems to be false: the first condition for >>>>RDF interpretations is not satisfied. >>>>In order to show this, note that this condition amounts, >>>>in this case, to the equivalence >>>> v in IP iff <v,Property> in IEXT(type). >>>>Suppose that the graph G has triples >>>> v p l >>>>and >>>> p range Property >>>>where l is a plain literal. >>>>By rule lg, the RDFS closure D of G contains the triple >>>> v p b >>>>where b is allocated to l and where b = sur(l). >>>>By rule rdfs3, D contains the triple >>>> b type Property = sur(l) type Property. >>>>Therefore, <l,Property> in IEXT(type). >>>>However, we cannot have l in IP, since that would mean >>>>that D contains the triple >>>> l type Property. >>> >>>Yes. The SH definition of IP should refer to the surrogate, just as >>>the IEXT definition does, ie should read: >>> >>>IP<SH> = {x: D contains a triple sur(x) rdf:type rdf:Property . } >>> >>>then l can (indeed will) be in IP. that is, all the semantic >>>conditions should be 'read off' from the graph via the surrogates. >>> >>>I will make this change. >> >>Clear. This was the problem here. >>Now the first condition on rdf-interpretations holds. >> >>> >>>>=== >>>> >>>>It should be made explicit what the domain and range of the >>>>function sur are: I assume that these sets are both IR. >>> >>>The domain is IR and the range is a subset of IR consisting of >>>vocabulary items. So the range is a subset of IR. >>> >>>>When this assumption is made explicit, there seems to be a >>>>circularity in the definition of LV for the RDFS Herbrand >>>>interpretation: >>>>the definition of LV depends on sur, the definition of sur >>>>depends on IR, the definition of IR depends on LV. >>>>In view of this circularity, the definition of LV becomes >>>>incomprehensible. I believe that the definition of LV should >>>be made explicit. >>> >>>OK, though I don't accept that it is ambiguous at present. >>> >>>>From the given definition, I would guess that the intention >>>>is that LV is the union of five sets: >>>> strings >>>> pairs of strings and language tags >>> >>>ie plain literals, yes... >>> >>>> XML values of well-typed XML literals in D >>>> {v in voc(D): the triple v type Literal is in D } >>>> {v in voc(D): v a typed, non-XML literal such that >>>> b type Literal is in D, where b is the blank node allocated >>>> to v by rule lg } >>> >>>I do not think that this way of phrasing it appropriate. The central >> >intuition is that in a simple Herbrand interpretation, IR consists of >>>the vocabulary items (including bnodes) in the graph, and the >>>interpretation is simply read off the graph. Here we need to modify >>>this by treating some vocabulary items as surrogates for more the >>>special values required by the semantic conditions, and adding >>>required items (all plain literals) which may not be in the graph; >>>otherwise, the construction should mirror the simple Herbrand >>>construction. >> > >>>I propose to rephrase the definition as follows, modeled on the >>>definition used in the RDF lemma: >> > >>>------- >>>If lll is a well-formed XML literal, let xml(lll) be the XML value of >>>lll; and for each XML value xml(lll) of any well-formed XML literal >>>lll in D, let sur(xml(lll)) be the blank node allocated to lll by >>>rule lg; for any other literal lll in D, let sur(lll) be the blank >>>node allocated to lll by rule lg, and extend sur to be the identity >>>mapping on URI references and blank nodes in D. The domain of this >> >mapping is the universe IR<SH>, defined below, and the range contains >>>only URI references and blank nodes which occur in D. >>(Actually the document augments this with the definition that >>sur is also the identity mapping on "other plain literals") >>I still believe that the text contains a circularity that makes the >>construction hard to understand: >>sur is defined with reference to the universe IR "*defined below*", >>then LV is defined in terms of sur, then IR is defined in terms >>of LV. >>It seems that your last paragraph I cite here contains the >>solution to the problem: >>Wouldn't it be clearer to >>- first define IR, as you indicated, in four parts, >>- then define sur: IR -> nodes(D) intersect (URIs u blankNodes) >>- then define LV = plainLiterals u {x in IR: sur(x) type Literal in D} >>- then IP = {x in IR: sur(x) type Property in D} >>etc.? > >I have rewritten the text along these lines but without altering the >actual table (except as noted below). I hope this is now sufficiently >clear. > The text now includes all literals in IR (= the domain of sur). It seems that the well-formed XML literals themselves should not be in; only their values are. Note, for example, that sur is not defined for them. >>See further below. >> >>>------- >>> >>>>=== >>>> >>>>"Define B(x) as before, then clearly [SH+B] satisfies D ..." >>>>There seems to be a problem with this conclusion. >>>>Making this explicit, it seems that B:blank(D)->IR >>>>needs to be defined by >>>>B(v)=xml(l) if v is a blank node allocated to the well-formed >>>>XML literal l, >>>>B(v)=l if v is a blank node allocated to a typed, non-XML >>>>literal l, >>>>otherwise B(v)=v. >>> >>>The wording is careless, forgive me. The intention was that the >>>second case would include ALL other literals, ie non-well-typed XML, >>>other typed and plain, if v has been allocated to that literal. I >>>will spell this out more carefully: >>>----- >>>Define B(x) as follows: if x is a blank node allocated to a >>>well-formed XML literal lll in D then B(x) = xml(lll); if it is >>>allocated to any other literal lll in D then B(x)=lll; and otherwise >>>B(x)=x. >>>----- >>> >>>I am not sure if this point solves your next comment, because I am >>>not sure what the force of the comment is. >> >>This indeed seems to be the required definition of B. >> >>[...] >> >>I have skipped here a rather long sequence of earlier remarks by me >>and comments on those by you. Much becomes clear. >> >>It seems that there remains only one problem in the proof of the >>rdfs entailment lemma, which can be clearly localized. >> >>The claim on which the proof is based, that the RDFS Herbrand >>interpretation of an RDF graph is an RDFS interpretation, >>still seems to be false. >> >>To give an example, let A be a URI reference and let G >>be the RDF graph consisting of just the two triples >> type domain A >> A type Class >>Let l be a plain literal. > >OK. > >>Then A in IC, type in IP, <type,A> in IE(domain) and >><l,Literal> in IEXT(type). >>The semantic condition on domain shows that >>l in CEXT(A), or <l,A> in IEXT(type) >>Therefore D should contain the triple >> sur(l) type A >>Since sur(l)=l, D should contain the triple >> l type A >>which is clearly nonsense. >> >>It seems that similar small counterexamples to the claim >>can be given with other vocabulary instead of domain: >>range, subClassOf, subPropertyOf. >> >>Moreover, the last part of the proof of the rdfs entailment >>lemma has a conclusion that cannot be justified: >>"IEXTI(p) contains <SH+A(s),SH+A(o)> >>i.e. D contains a triple >> sur(SH+A(s)) p sur(SH+A(o))" >>The problem with this conclusion "i.e." is that p can be type >>and <SH+A(s),SH+A(o)> can be <l,Literal> or <l,Resource> >>for some plain literal l. > >not in D, that is, I presume is your point. Yes, point taken. > >>The proof of semantic conditions for domain, range, subClassOf >>and subPropertyOf make a similar unjustified step. >> >>It seems that these problems can be solved by slightly >>generalizing the notion of simple interpretations and >>thereby also the other kinds of interpretations >>(rdf, rdfs, D). >>The assumption that LV contains all plain literals might >>be dropped. > >Indeed. This does greatly simplify everything and, as you say, is not >actually used anywhere. At the time it was written there seemed to be >an expository value in considering plain literals to be 'outside' the >vocabulary, but this small value, if any, is clearly outweighed by >the formal awkwardness that results. I will refrain from drawing the >obvious moral, since it resounds to my disadvantage. > >On reflection I have made the following changes: > >1. A plain literal is considered a name (so vocabularies may contain >plain literals). >2. The definition of simple interpretation of V requires that LV >contain plain literals *in V* >3. The first two semantic conditions on plain literals in a simple >interpretation refer to plain literals *in V*. >4. The definitions of LV in the three Herbrand constructions are >modified appropriately to refer to plain literals in the appropriate >graph (respectively in G, C and D) Not in the third one: there LV is {x in IR : sur(x) t Literal in D} as we noted >5. As you suggest, the special case of IEXT<SH> is dropped. > >I also noted a slip in the definition of the mapping B in the RDF >lemma proof and corrected it: > >Define a mapping B on blank nodes in C as follows: B(x)=xml(lll) if x >is allocated to a well-formed XML literal lll, otherwise B(x)=x Correct. I had already understood it in this way. > >> It seems that the only thing that is essential >>to assume about plain literals is that they denote themselves. >>It seems that nowhere in the entire document the assumption >>is used that an arbitrary string or pair consisting of string >>and literal is actually contained in LV. XMLLiterals and >>other typed literals are not given this priviliged treatment. >>The assumption does not seem match well with a clean and correct >>development of the Herbrand construction in the case of RDFS. > >Agreed. > >>I consider the RDFS entailment lemma to be proved when >>this change would be made to the definition, and when, >>moreover, the following naturally following adjustments >>are made to the construction of the RDFS Herbrand >>interpretation: >>- drop the literals outside the vocabulary of D >>from IR (this is the second part of the definition mentioned >>above, each of these literals is plain) >>- drop the plain literals from LV, the definition would >>become completely analogous to that of IP: >> LV = {x in IR : sur(x) type Literal in D} >>- in the definition of IEXT drop the special case IEXT(type), >>which causes the horrible problems I just mentioned. >> > >Done. So it is proved :-) > >Pat > >> >Pat >>> >>>PS. the changes made described above are now visible in the copy on my >>website: >>> >>>http://www.ihmc.us/users/phayes/RDF_Semantics_LC2.5.html >>> >>>I have also added an explanatory paragraph just before the proof of >>>the RDF entailment lemma, and some explanatory prose in the proof of >>>the RDFS entailment lemma concerning the role of literal surrogates. >>> >>>Please let me know if this response adequately deals with the issues you >>raise. >>> >>>Pat >>> >>>> >>>> >>>>Herman ter Horst >>> >>> >>>-- >>>--------------------------------------------------------------------- >>>IHMC (850)434 8903 or (650)494 3973 home >>>40 South Alcaniz St. (850)202 4416 office >>>Pensacola (850)202 4440 fax >>>FL 32501 (850)291 0667 >>cell >>>phayes@ihmc.us http://www.ihmc.us/users/phayes >>> >>> >> >>Herman > > >-- >--------------------------------------------------------------------- >IHMC (850)434 8903 or (650)494 3973 home >40 South Alcaniz St. (850)202 4416 office >Pensacola (850)202 4440 fax >FL 32501 (850)291 0667 cell >phayes@ihmc.us http://www.ihmc.us/users/phayes > >
Received on Thursday, 13 November 2003 11:22:54 UTC