Re: RDF Semantics: corrections from Peter F. Patel-Schneider on 2004-01-13 (www-rdf-comments@w3.org from January to March 2004)

From: Peter F. Patel-Schneider <pfps@research.bell-labs.com>
Date: Tue, 13 Jan 2004 14:47:35 -0500 (EST)
To: phayes@ihmc.us
Cc: herman.ter.horst@philips.com, jjc@hpl.hp.com, hendler@cs.umd.edu, schreiber@swi.psy.uva.nl, connolly@w3.org, sandro@w3.org, www-rdf-comments@w3.org, bwm@hplb.hpl.hp.com
Message-Id: <20040113.144735.36854689.pfps@research.bell-labs.com>
From: pat hayes <phayes@ihmc.us>
Subject: Re: RDF Semantics: corrections(was: Re: RDF Semantics: two issues, connected to OWL)
Date: Tue, 13 Jan 2004 12:41:33 -0600

> Gentlemen:
> 
> Herman has suggested making a change to the wording of the definition 
> of D-interpretation for the RDF spec., and I would like to make the 
> change he suggests. The only substantive aspect of this change which 
> may effect OWL is that in RDF, D-interpretations would no longer be 
> required to interpret the class extension of the datatype name as 
> being identical to the value space of the datatype (instead, it could 
> be a subset of that class extension.) This would allow a datatype 
> completeness lemma to be proved for RDFS with datatypes. Herman 
> suggests that a single sentence can be added to the OWL spec so as to 
> ensure that this proposed change will not affect the OWL semantics. I 
> am anxious to ensure that this would be acceptable to the OWL mavens. 
> Peter and Jeremy, would this proposed change be acceptable to you? 

Not without some good indication of why the change is being proposed for
RDF and what effect it would have on RDF.  The comment about being able to
prove a new lemma for RDF with the change indicates that there will be some
substantive effect to the change. 

> Jim and Guus, can this additional sentence be inserted into the OWL 
> text so as to ensure compatibility? Note, this is not a change to the 
> content or substance of OWL, only to the text so as to make it secure 
> against a proposed editing of the RDF text. See end of this message 
> for the proposed change to the OWL text in more detail.

A proposed editing of the RDF text?  If there is some noticeable change to
OWL from this proposed changed to RDF, I view the change as a substantive
change to RDF during a period when RDF is not supposed to change.

> As you can appreciate, we need to move very quickly on this decision. 
> I would be extremely grateful if you could give me a quick OK/not-OK 
> on this so that the RDFWG can make a firm decision this Friday. 

I do not have sufficient information to determine whether this change is
benign, and thus would vote against it, if I had a vote.

> Obviously any responsibility for any problems must be mine, but I 
> would be grateful for your feedback. If you say not-OK then I will 
> not make any of these changes, but I think that would be a pity as 
> Herman has indeed identified a place where the RDF spec would be 
> considerably better if his change were made.  

Better in what way?  What are the consequences of this change?

This is literally our 
> last possible chance to get this done. Thanks for your help.
> 
> Pat

peter

> PS. A version of the RDF spec with this change made can be viewed at
> 
> http://www.ihmc.us/users/phayes/RDF_Semantics_2004.html
> particularly at
> http://www.ihmc.us/users/phayes/RDF_Semantics_2004.html#defDinterp

This doesn't help very much without an indication of what the change is.

> --------
> 
> >A few comments in addition to my last message (to which there has
> >been no response).
> 
> Sorry, I have been travelling. Back now.
> 
> >Summary:
> >
> >- It seems that I added too much weight to my 'second amendment'
> >to the proposed correction text.  The generalized completeness
> >lemma (for D-entailment) holds with or without this amendment.
> >Still, this amendment seems to be a simplification of the proposed
> >correction text.
> >
> >- The difference between the PR text and the proposed correction
> >text is, essentially, that the PR definition of D-interpretations
> >assumes, in addition, that the class extension of each datatype d in
> >D equals the value space of d.
> 
> Yes, quite. In effect, the correction weakens the semantics to fit a 
> plausible set of inference principles, hence the completeness. I 
> confess to being slightly worried about this change, although it does 
> indeed make possible a D-completeness result and is, as you point 
> out, very much in the spirit of the rest of the RDFS specification 
> (which makes no other 'iff' assumptions about class extensions).
> 
> >I am proposing to add an additional sentence to the OWL semantics
> >document to ensure that the proposed correction to the RDF Semantics
> >document does not lead to a problem for WebOnt.
> >(This additional sentence is true now for the OWL semantics, and it
> >ensures that OWL S&AS remains correct when RDF Core
> >decides to accept the proposed correction text.)
> 
> I am anxious to ensure that any  corrections I make in the RDF spec 
> do not cause problems for OWL. Is it likely that your proposed 
> changes will be acceptable to the Webont group at this stage, thus 
> reconciling the semantics appropriately?
> 
> 
> >- I have catched an additional, small, slip in the correction text.
> >
> >===
> >
> >>>>On Fri, 2003-12-19 at 17:59, pat hayes wrote:
> >>>>>   [...] So I accept that this addition to
> >>>>>   the text would be an improvement, I do not feel that this is worth,
> >>>>>   as it were, stopping the presses for (even if the presses could be
> >>>>>   stopped, which they cannot at this stage.)
> >>>>
> >>>>The Proposed Rec has gone out, but review of that document is
> >>>>in order thru 19Jan.
> >>>
> >>>Ah, I had not realized that. OK....
> >>>
> >>>>Does anybody feel energized to start a new thread,
> >>>
> >>>Take this as the new thread.
> >>>
> >>>-----
> >>>
> >>>My understanding at present is that there is one outright error in
> >>>the document, an editing slip: the statement of the  RDFS entailment
> >>>lemma in appendix A should read the same as the one in the text:
> >>>"rule lg" -> "rules lg, gl"
> >>>with appropriate links, of course.
> >>
> >>Indeed.
> >>
> >>>
> >>>The remaining comments from Herman are concerned with the way that
> >>>D-interpretations are defined.  After re-reading this correspondence
> >>>I think that the best way to proceed is to adopt Herman's suggested
> >>>rewording (with slight changes) for the RDFS semantic conditions.
> >>(correction to your email, not your proposed text: semantic conditions
> >>for D-interpretations rather than RDFS semantic conditions)
> >>>
> >>>The amended text reads as follows (beginning at the anchor
> >>>http://www.w3.org/TR/rdf-mt/#defDinterp and proceeding for 4
> >>>paragraphs past the table. The text has been changed to properly
> >>>organize the references to 'first condition', 'second condition' and
> >>>so on, but the substance has not been changed.)
> >>
> >>I have two small amendments to the amended text, which I describe
> >>inline below.
> >>
> >>>
> >>>----------
> >>>If D is a datatype map, a D-interpretation of a vocabulary V is any
> >>>rdfs-interpretation I of V union {aaa :<aaa,x> in D} which satisfies
> >>
> >>Here the phrase 'union ...' is a definite improvement to the correction
> >>that I described.  As you mentioned in an earlier message, this is
> >>in line with how the 'logical vocabulary' is dealt with in the
> >>definitions and RDF and RDFS interpretations.
> >>My first amendment to your amended text would be to add,
> >>between {}, "for some x", to make the definition completely explicit,
> >>and to prevent that readers have to make a guess about the definition
> >>of this set.  Then the last line would become:
> >>--- ...of V union {aaa :<aaa,x> in D for some x} ...
> 
> OK
> 
> >  >
> >>>the following extra conditions for every pair <aaa, x> in D:
> >>>
> >>>General semantic conditions for datatypes.
> >>><table>
> >>><firstrow>
> >>>
> >>>I(aaa) = x and I satisfies the triples
> >  >>
> >>>aaa rdf:type rdfs:Datatype .
> >>>aaa rdfs:subClassOf rdfs:Literal .
> >>>
> >>></firstrow>
> >>><secondrow>
> >>>
> >>>if "sss"^^ddd is in V and <aaa, I(ddd)> in D and sss is in the
> >>
> >>There seems to be a point in connection with the addition
> >>"and <aaa, I(ddd)> in D" that you made here.
> >>Since D, as a datatype map, is a function, and since the pair
> >><aaa, x> is already assumed to be in D, the new assumption
> >>"<aaa, I(ddd)> in D" made here is equivalent to I(ddd)=x.
> 
> Yes, that is correct; but I had tried to state the conditions without 
> introducing 'x', so that each condition could be read and understood 
> by itself.
> 
> >  >
> >>It seems to me that nothing essential is lost, and some
> >>simplification is obtained, by replacing the last line by:
> >>
> >>---if "sss"^^aaa is in V and sss is in the
> >>
> 
> This would not then apply to a case like
> 
> ex:foo owl:sameAs xsd:integer .
> ex:a ex:p "35"^^ex:foo .
> 
> which of course does not arise in RDF, but does in OWL.
> 
> There was considerable discussion of cases involving such identity 
> between datatypes, and Peter in particular insisted that the 
> semantics should support this; and I came to agree with him.  I 
> therefore propose to keep the condition as it stands.
> 
> >  >See the last part of this message for additional motivation for
> >>this change.
> >>This second amendment to your amended text leads to several
> >>naturally corresponding changes in the rest of the definition:
> >>---replace  "sss"^^ddd  by  "sss"^^aaa
> >>---replace  I(ddd)  by  x
> >
> >Note for clarity: I did not make these two corresponding
> >changes inline in the text below.
> >
> >>and one line is replaced in the same way as the line above,
> >>see below inline.
> >>
> >>>lexical space of I(ddd) then:
> >>>
> >>>IL("sss"^^ddd) = L2V(x)(sss);
> >>--
> >>>IL("sss"^^ddd) is in LV;
> >>>IEXT(I(rdf:type)) contains <IL("sss"^^ddd), ddd>
> >
> >Here the last 'ddd' should be 'x', just like I noted below in
> >a similar occasion.
> 
> Or I(ddd); yes, thanks.
> 
> >
> >>>
> >>></secondrow>
> >>><thirdrow>
> >>>
> >>>if "sss"^^ddd is in V and <aaa, I(ddd)> in D and sss is not in the
> >>replace by---if "sss"^^aaa is in V and sss is not in the
> >>>lexical space of I(ddd) then:
> >>>
> >>>IL("sss"^^ddd) is not in LV;
> >>>IEXT(I(rdf:type)) does not contain<IL("sss"^^ddd), ddd>
> >>Here the last 'ddd' seems to be a slip; it should have been
> >>I(ddd), and it would be replaced by x.
> >>>
> >>></thirdrow>
> >>></table>
> >>>
> >>>The first condition ensures that I interprets the URI reference
> >>>according to the datatype map provided. Note that this does not
> >>>prevent other URI references from also denoting the same datatype.
> >>>
> >>>The second condition ensures that typed literals in the vocabulary
> >>>respect the datatype lexical-to-value mapping. For example, if I is
> >>>an XSD-interpretation then I("15"^^xsd:decimal) must be the number
> >>>fifteen. The third condition requires that an ill-typed literal,
> >>>where the literal string is not in the lexical space of the datatype,
> >>>not denote any literal value. Intuitively, such a name does not
> >>>denote any value, but in order to avoid the semantic complexities
> >>>which arise from empty names, the semantics requires such a typed
> >>>literal to denote an 'arbitrary' non-literal value. Thus for example,
> >>>if I is an XSD-interpretation, then all that can be concluded about
> >>>I("arthur"^^xsd:decimal) is that it is not in LV, i.e. not in
> >>>ICEXT(I(rdfs:Literal)). An ill-typed literal does not in itself
> >>>constitute an inconsistency, but a graph which entails that an
> >>>ill-typed literal has rdf:type rdfs:Literal, or that an ill-typed XML
> >>>literal has rdf:type rdf:XMLLiteral, would be inconsistent.
> >>>
> >>>Note that the second and third conditions apply only to datatypes in
> >>>the range of D. Typed literals whose type is not in the datatype map
> >>>of the interpretation are treated as before, i.e. as denoting some
> >>>unknown thing. These conditions does not require that the URI
> >>>reference in the typed literal be the same as the associated URI
> >>>reference of the datatype; this allows semantic extensions which can
> >>>express identity conditions on URI references to draw appropriate
> >>>conclusions.
> >>>
> >>>The first condition also ensures that the class rdfs:Datatype
> >>>contains the datatypes used in any satisfying D-interpretation.
> >>>Notice that this is a necessary, but not a sufficient, condition; it
> >>>allows the class I(rdfs:Datatype) to contain other datatypes.
> >>>
> >>>------
> >>>
> >>>If this is considered acceptable to everyone, I propose to make this
> >>>change.  I will in any case undertake to produce a corrected version
> >>>of the text, with anchors etc. in place.
> >>>
> >>>Pat
> >>>
> >>>PS to Herman: I am not so optimistic as you are about proving a
> >>>version of the entailment lemma for D-entailment. The issue here has
> >  >>always been that datatypes are inherently idiosyncratic. For example,
> >>>xsd:boolean has only 2 items in its value space, so for example the
> >>>following is a valid XSD-entailment:
> >>>
> >>>a p "true"^^xsd:boolean .
> >>>a p "false"^^xsd:boolean .
> >>>b type xsd:boolean .
> >>>|-
> >>>a p b .
> >>
> >>This is an entailment when it is assumed that ICEXT(I(boolean))
> >>equals the value space of I(boolean).  Since this assumption is not
> >>made in the above definition of D-interpretations, this XSD-entailment
> >>does not seem to be a D-entailment with the above definition when
> >>boolean is in the datatype map.  See further below.
> >>
> >>>
> >>>but I despair of writing a general set of rules which would be
> >>>sensitive to all possible value-space cardinality conditions.
> >>>Similarly, there are many valid XSD entailments arising from ordering
> >>>constraints on particular value spaces; and of course who knows what
> >>>entailments might arise from yet-to-be-defined datatypes? The real
> >>>issue here is that the L2V(d)(x) constraint is really arbitrarily
> >  >>powerful, even to the point of going beyond first-order (ie R.E.)
> >>>expressivity; and it is up to the particular datatype how much of
> >>>that power it chooses to wield: so I do not think that we can
> >>>possibly prove a general completeness lemma for datatype entailment.
> >>
> >>I agree that a general completeness lemma that incorporates all the
> >>intricacies of existing datatypes does not seem possible.  However,
> >>D-entailment in the above form is a rather weak semantics for datatypes,
> >>which captures a common denominator, and which does not capture
> >>all the idiosyncracies of existing datatypes.
> >>As I remarked in [1], it seems that D-entailment in this form
> >>can be characterized with a generalization of the RDFS entailment
> >>lemma.  See the last part of [1] for more details about this
> >>generalized completeness lemma.
> >>This works because the above conditions on D-interpretations directly
> >>mirror the conditions on XMLLiteral in the definition of RDFS
> >>interpretations.  Entailment rule rdf2 is replaced (augmented)
> >>with a similar rule for each datatype in D.
> >>The assumption on D is that the value spaces are
> >>disjoint and that the lexical-to-value mappings are injective.
> >>For example, the generalized lemma would work with a datatype map
> >>with the four datatypes Boolean, Integer, String, XMLLiteral.
> >>
> >>Note: my second amendment above to your amended text seems to be
> >>necessary for this "RDFS-D entailment lemma" to hold.
> >
> >See below.
> >
> >>Instead of the function 'xml' in the proof of the RDFS entailment
> >>lemma, the proof of the generalized lemma works with a function
> >>that can be called 'val-D', which gives the values for all well-typed
> >>literals for datatypes in D.  With this replacement, an 'RDFS-D
> >>Herbrand interpretation' and a surrogate function sur etc.
> >>can be constructed along the same lines as in the current proof.
> >>
> >>Even though it may be too late to include such a generalized
> >>completeness lemma in the RDF Semantics document, it seems to be
> >>valuable that the definition of D-interpretations is arranged
> >>so that it is true.  Like the current conditions for XMLLiteral,
> >>this semantics is mainly restricted to well/ill typed literals,
> >>and it can be used to extend the current XMLLiteral-related reasoning
> >>to many other datatypes.
> >>As was remarked earlier, semantic extensions can always
> >>make stronger assumptions on datatypes to capture a richer
> >>reasoning strategy, for example to capture more idiosyncracies of
> >>datatypes.  This seems to be in line with the general philosophy
> >>of the RDF Semantics document.
> >
> >Having done a closer inspection of the proof, it turns out that
> >the generalized entailment lemma holds not only when the above
> >'second amendment' to the definition of D-interpretions is assumed,
> >but also without this amendment to your amended text.
> >For the more difficult direction of the proof (completeness),
> >the additional assumption that you introduce (where a literal
> >"sss"^^^ddd is present with ddd unequal aaa and I(ddd)=x )
> >is vacuously satisfied by the 'RDFS-D Herbrand interpretation'
> >mentioned above.
> 
> It is in RDFS. In OWL, however, the presence of equality complicates 
> a Herbrand-style argument,and the more elaborate condition becomes 
> necessary.
> 
> >  The easy direction of the proof (validity of
> >entailment rules) does not use the extra assumption.
> 
> True.
> 
> >So the difference expressed by my second amendment seems semantically
> >neutral: it does not change any entailments.
> >
> >However, this amendment does seem to simplify the definition.
> >Do you have a particular reason in mind to mention the ddd's unequal
> >to aaa in the definition, in the way you indicate?  Where would
> >this be used?
> 
> See above.
> 
> >
> >By the way, the 'RDFS-D Herbrand interpretation' uses the
> >surrogate construction not only for literals but also for
> >datatypes in D:
> >if <aaa, x> is in D, then I(aaa)=x and sur(x)=aaa.
> >
> >==
> >
> >A few details to describe more precisely part of what I said
> >in connection with OWL semantics, in my summary in the first
> >lines of this message:
> >
> >An rdfs-interpretation I of a vocabulary V
> >is a 'PR'-D-interpretation of V if and only if
> >it is a 'corrected'-D-interpretation of V
> >such that for each pair <aaa,x> in D we have
> >ICEXT(x)  =  (value space of x)  subsetOf  LV.
> >
> >This holds with or without the 'second amendment' mentioned above.
> >This follows quickly using only a few defining properties of
> >rdfs-interpretations.  These properties also show
> >that the last part (subsetOf LV) may be left out.
> >
> >>
> >>>
> >>>Pat
> >>>
> >>
> >>Herman
> >>
> >>[1]
> >http://lists.w3.org/Archives/Public/www-rdf-comments/2003OctDec/0233.html
> >
> >Herman
> 
> 
> -- 
> ---------------------------------------------------------------------
> IHMC	(850)434 8903 or (650)494 3973   home
> 40 South Alcaniz St.	(850)202 4416   office
> Pensacola			(850)202 4440   fax
> FL 32501			(850)291 0667    cell
> phayes@ihmc.us       http://www.ihmc.us/users/phayes
Received on Tuesday, 13 January 2004 14:51:38 UTC