Re: RDF Semantics: corrections(was: Re: RDF Semantics: two issues, connected to OWL) from herman.ter.horst@philips.com on 2004-01-13 (www-rdf-comments@w3.org from January to March 2004)

From: <herman.ter.horst@philips.com>
Date: Tue, 13 Jan 2004 17:11:53 +0100
To: pat hayes <phayes@ihmc.us>
Cc: Dan Connolly <connolly@w3.org>, Sandro Hawke <sandro@w3.org>, www-rdf-comments@w3.org
Message-ID: <OF650A9B57.8E356B2F-ONC1256E1A.0051FFE4-C1256E1A.0059015A@diamond.philips.com>
A few comments in addition to my last message (to which there has
been no response).

Summary:

- It seems that I added too much weight to my 'second amendment'
to the proposed correction text.  The generalized completeness
lemma (for D-entailment) holds with or without this amendment.
Still, this amendment seems to be a simplification of the proposed
correction text.

- The difference between the PR text and the proposed correction
text is, essentially, that the PR definition of D-interpretations 
assumes, in addition, that the class extension of each datatype d in
D equals the value space of d.
I am proposing to add an additional sentence to the OWL semantics
document to ensure that the proposed correction to the RDF Semantics
document does not lead to a problem for WebOnt. 
(This additional sentence is true now for the OWL semantics, and it 
ensures that OWL S&AS remains correct when RDF Core
decides to accept the proposed correction text.)

- I have catched an additional, small, slip in the correction text.

===

>>>On Fri, 2003-12-19 at 17:59, pat hayes wrote:
>>>>  [...] So I accept that this addition to
>>>>  the text would be an improvement, I do not feel that this is worth,
>>>>  as it were, stopping the presses for (even if the presses could be
>>>>  stopped, which they cannot at this stage.)
>>>
>>>The Proposed Rec has gone out, but review of that document is
>>>in order thru 19Jan.
>>
>>Ah, I had not realized that. OK....
>>
>>>Does anybody feel energized to start a new thread,
>>
>>Take this as the new thread.
>>
>>-----
>>
>>My understanding at present is that there is one outright error in 
>>the document, an editing slip: the statement of the  RDFS entailment 
>>lemma in appendix A should read the same as the one in the text:
>>"rule lg" -> "rules lg, gl"
>>with appropriate links, of course.
>
>Indeed.
>
>>
>>The remaining comments from Herman are concerned with the way that 
>>D-interpretations are defined.  After re-reading this correspondence 
>>I think that the best way to proceed is to adopt Herman's suggested 
>>rewording (with slight changes) for the RDFS semantic conditions.
>(correction to your email, not your proposed text: semantic conditions
>for D-interpretations rather than RDFS semantic conditions)
>>
>>The amended text reads as follows (beginning at the anchor 
>>http://www.w3.org/TR/rdf-mt/#defDinterp and proceeding for 4 
>>paragraphs past the table. The text has been changed to properly 
>>organize the references to 'first condition', 'second condition' and 
>>so on, but the substance has not been changed.)
>
>I have two small amendments to the amended text, which I describe
>inline below.
>
>>
>>----------
>>If D is a datatype map, a D-interpretation of a vocabulary V is any 
>>rdfs-interpretation I of V union {aaa :<aaa,x> in D} which satisfies 
>
>Here the phrase 'union ...' is a definite improvement to the correction
>that I described.  As you mentioned in an earlier message, this is 
>in line with how the 'logical vocabulary' is dealt with in the 
>definitions and RDF and RDFS interpretations. 
>My first amendment to your amended text would be to add,
>between {}, "for some x", to make the definition completely explicit,
>and to prevent that readers have to make a guess about the definition
>of this set.  Then the last line would become:
>--- ...of V union {aaa :<aaa,x> in D for some x} ...
>
>>the following extra conditions for every pair <aaa, x> in D:
>>
>>General semantic conditions for datatypes.
>><table>
>><firstrow>
>>
>>I(aaa) = x and I satisfies the triples
>>
>>aaa rdf:type rdfs:Datatype .
>>aaa rdfs:subClassOf rdfs:Literal .
>>
>></firstrow>
>><secondrow>
>>
>>if "sss"^^ddd is in V and <aaa, I(ddd)> in D and sss is in the 
>
>There seems to be a point in connection with the addition
>"and <aaa, I(ddd)> in D" that you made here.
>Since D, as a datatype map, is a function, and since the pair 
><aaa, x> is already assumed to be in D, the new assumption
>"<aaa, I(ddd)> in D" made here is equivalent to I(ddd)=x.
>
>It seems to me that nothing essential is lost, and some 
>simplification is obtained, by replacing the last line by:
>
>---if "sss"^^aaa is in V and sss is in the
>
>See the last part of this message for additional motivation for
>this change.
>This second amendment to your amended text leads to several 
>naturally corresponding changes in the rest of the definition:
>---replace  "sss"^^ddd  by  "sss"^^aaa
>---replace  I(ddd)  by  x

Note for clarity: I did not make these two corresponding
changes inline in the text below.

>and one line is replaced in the same way as the line above,
>see below inline. 
>
>>lexical space of I(ddd) then:
>>
>>IL("sss"^^ddd) = L2V(x)(sss);
>--
>>IL("sss"^^ddd) is in LV;
>>IEXT(I(rdf:type)) contains <IL("sss"^^ddd), ddd>

Here the last 'ddd' should be 'x', just like I noted below in
a similar occasion.

>>
>></secondrow>
>><thirdrow>
>>
>>if "sss"^^ddd is in V and <aaa, I(ddd)> in D and sss is not in the 
>replace by---if "sss"^^aaa is in V and sss is not in the 
>>lexical space of I(ddd) then:
>>
>>IL("sss"^^ddd) is not in LV;
>>IEXT(I(rdf:type)) does not contain<IL("sss"^^ddd), ddd>
>Here the last 'ddd' seems to be a slip; it should have been
>I(ddd), and it would be replaced by x.
>>
>></thirdrow>
>></table>
>>
>>The first condition ensures that I interprets the URI reference 
>>according to the datatype map provided. Note that this does not 
>>prevent other URI references from also denoting the same datatype.
>>
>>The second condition ensures that typed literals in the vocabulary 
>>respect the datatype lexical-to-value mapping. For example, if I is 
>>an XSD-interpretation then I("15"^^xsd:decimal) must be the number 
>>fifteen. The third condition requires that an ill-typed literal, 
>>where the literal string is not in the lexical space of the datatype, 
>>not denote any literal value. Intuitively, such a name does not 
>>denote any value, but in order to avoid the semantic complexities 
>>which arise from empty names, the semantics requires such a typed 
>>literal to denote an 'arbitrary' non-literal value. Thus for example, 
>>if I is an XSD-interpretation, then all that can be concluded about 
>>I("arthur"^^xsd:decimal) is that it is not in LV, i.e. not in 
>>ICEXT(I(rdfs:Literal)). An ill-typed literal does not in itself 
>>constitute an inconsistency, but a graph which entails that an 
>>ill-typed literal has rdf:type rdfs:Literal, or that an ill-typed XML 
>>literal has rdf:type rdf:XMLLiteral, would be inconsistent.
>>
>>Note that the second and third conditions apply only to datatypes in 
>>the range of D. Typed literals whose type is not in the datatype map 
>>of the interpretation are treated as before, i.e. as denoting some 
>>unknown thing. These conditions does not require that the URI 
>>reference in the typed literal be the same as the associated URI 
>>reference of the datatype; this allows semantic extensions which can 
>>express identity conditions on URI references to draw appropriate 
>>conclusions.
>>
>>The first condition also ensures that the class rdfs:Datatype 
>>contains the datatypes used in any satisfying D-interpretation. 
>>Notice that this is a necessary, but not a sufficient, condition; it 
>>allows the class I(rdfs:Datatype) to contain other datatypes.
>>
>>------
>>
>>If this is considered acceptable to everyone, I propose to make this 
>>change.  I will in any case undertake to produce a corrected version 
>>of the text, with anchors etc. in place.
>>
>>Pat
>>
>>PS to Herman: I am not so optimistic as you are about proving a 
>>version of the entailment lemma for D-entailment. The issue here has 
>>always been that datatypes are inherently idiosyncratic. For example, 
>>xsd:boolean has only 2 items in its value space, so for example the 
>>following is a valid XSD-entailment:
>>
>>a p "true"^^xsd:boolean .
>>a p "false"^^xsd:boolean .
>>b type xsd:boolean .
>>|-
>>a p b .
>
>This is an entailment when it is assumed that ICEXT(I(boolean))
>equals the value space of I(boolean).  Since this assumption is not
>made in the above definition of D-interpretations, this XSD-entailment
>does not seem to be a D-entailment with the above definition when 
>boolean is in the datatype map.  See further below.
>
>>
>>but I despair of writing a general set of rules which would be 
>>sensitive to all possible value-space cardinality conditions. 
>>Similarly, there are many valid XSD entailments arising from ordering 
>>constraints on particular value spaces; and of course who knows what 
>>entailments might arise from yet-to-be-defined datatypes? The real 
>>issue here is that the L2V(d)(x) constraint is really arbitrarily 
>>powerful, even to the point of going beyond first-order (ie R.E.) 
>>expressivity; and it is up to the particular datatype how much of 
>>that power it chooses to wield: so I do not think that we can 
>>possibly prove a general completeness lemma for datatype entailment.
>
>I agree that a general completeness lemma that incorporates all the
>intricacies of existing datatypes does not seem possible.  However,
>D-entailment in the above form is a rather weak semantics for datatypes,
>which captures a common denominator, and which does not capture
>all the idiosyncracies of existing datatypes.
>As I remarked in [1], it seems that D-entailment in this form
>can be characterized with a generalization of the RDFS entailment
>lemma.  See the last part of [1] for more details about this
>generalized completeness lemma.
>This works because the above conditions on D-interpretations directly
>mirror the conditions on XMLLiteral in the definition of RDFS
>interpretations.  Entailment rule rdf2 is replaced (augmented)
>with a similar rule for each datatype in D.
>The assumption on D is that the value spaces are 
>disjoint and that the lexical-to-value mappings are injective.
>For example, the generalized lemma would work with a datatype map
>with the four datatypes Boolean, Integer, String, XMLLiteral.
>
>Note: my second amendment above to your amended text seems to be
>necessary for this "RDFS-D entailment lemma" to hold.

See below.

>Instead of the function 'xml' in the proof of the RDFS entailment
>lemma, the proof of the generalized lemma works with a function
>that can be called 'val-D', which gives the values for all well-typed
>literals for datatypes in D.  With this replacement, an 'RDFS-D
>Herbrand interpretation' and a surrogate function sur etc.
>can be constructed along the same lines as in the current proof.
>
>Even though it may be too late to include such a generalized
>completeness lemma in the RDF Semantics document, it seems to be
>valuable that the definition of D-interpretations is arranged
>so that it is true.  Like the current conditions for XMLLiteral,
>this semantics is mainly restricted to well/ill typed literals,
>and it can be used to extend the current XMLLiteral-related reasoning 
>to many other datatypes.
>As was remarked earlier, semantic extensions can always
>make stronger assumptions on datatypes to capture a richer
>reasoning strategy, for example to capture more idiosyncracies of
>datatypes.  This seems to be in line with the general philosophy
>of the RDF Semantics document.

Having done a closer inspection of the proof, it turns out that 
the generalized entailment lemma holds not only when the above
'second amendment' to the definition of D-interpretions is assumed,
but also without this amendment to your amended text.
For the more difficult direction of the proof (completeness),
the additional assumption that you introduce (where a literal
"sss"^^^ddd is present with ddd unequal aaa and I(ddd)=x )
is vacuously satisfied by the 'RDFS-D Herbrand interpretation'
mentioned above.  The easy direction of the proof (validity of
entailment rules) does not use the extra assumption.
So the difference expressed by my second amendment seems semantically 
neutral: it does not change any entailments.

However, this amendment does seem to simplify the definition.
Do you have a particular reason in mind to mention the ddd's unequal
to aaa in the definition, in the way you indicate?  Where would
this be used?

By the way, the 'RDFS-D Herbrand interpretation' uses the
surrogate construction not only for literals but also for 
datatypes in D: 
if <aaa, x> is in D, then I(aaa)=x and sur(x)=aaa.

==

A few details to describe more precisely part of what I said
in connection with OWL semantics, in my summary in the first 
lines of this message:

An rdfs-interpretation I of a vocabulary V
is a 'PR'-D-interpretation of V if and only if 
it is a 'corrected'-D-interpretation of V
such that for each pair <aaa,x> in D we have
ICEXT(x)  =  (value space of x)  subsetOf  LV.

This holds with or without the 'second amendment' mentioned above.
This follows quickly using only a few defining properties of
rdfs-interpretations.  These properties also show
that the last part (subsetOf LV) may be left out.

>
>>
>>Pat
>>
>
>Herman
>
>[1] 
http://lists.w3.org/Archives/Public/www-rdf-comments/2003OctDec/0233.html

Herman
Received on Tuesday, 13 January 2004 11:12:17 UTC