Re: RDF Semantics: corrections(was: Re: RDF Semantics: two issues, connected to OWL) from herman.ter.horst@philips.com on 2004-01-08 (www-rdf-comments@w3.org from January to March 2004)

From: <herman.ter.horst@philips.com>
Date: Thu, 8 Jan 2004 13:17:39 +0100
To: pat hayes <phayes@ihmc.us>
Cc: Dan Connolly <connolly@w3.org>, Sandro Hawke <sandro@w3.org>, www-rdf-comments@w3.org
Message-ID: <OF9506D6A5.CBFC976A-ONC1256E15.0031CB0A-C1256E15.00439030@diamond.philips.com>
>>On Fri, 2003-12-19 at 17:59, pat hayes wrote:
>>>  [...] So I accept that this addition to
>>>  the text would be an improvement, I do not feel that this is worth,
>>>  as it were, stopping the presses for (even if the presses could be
>>>  stopped, which they cannot at this stage.)
>>
>>The Proposed Rec has gone out, but review of that document is
>>in order thru 19Jan.
>
>Ah, I had not realized that. OK....
>
>>Does anybody feel energized to start a new thread,
>
>Take this as the new thread.
>
>-----
>
>My understanding at present is that there is one outright error in 
>the document, an editing slip: the statement of the  RDFS entailment 
>lemma in appendix A should read the same as the one in the text:
>"rule lg" -> "rules lg, gl"
>with appropriate links, of course.

Indeed.

>
>The remaining comments from Herman are concerned with the way that 
>D-interpretations are defined.  After re-reading this correspondence 
>I think that the best way to proceed is to adopt Herman's suggested 
>rewording (with slight changes) for the RDFS semantic conditions.
(correction to your email, not your proposed text: semantic conditions
for D-interpretations rather than RDFS semantic conditions)
>
>The amended text reads as follows (beginning at the anchor 
>http://www.w3.org/TR/rdf-mt/#defDinterp and proceeding for 4 
>paragraphs past the table. The text has been changed to properly 
>organize the references to 'first condition', 'second condition' and 
>so on, but the substance has not been changed.)

I have two small amendments to the amended text, which I describe
inline below.

>
>----------
>If D is a datatype map, a D-interpretation of a vocabulary V is any 
>rdfs-interpretation I of V union {aaa :<aaa,x> in D} which satisfies 

Here the phrase 'union ...' is a definite improvement on the correction
that I described.  As you mentioned in an earlier message, this is 
in line with how the 'logical vocabulary' is dealt with in the 
definitions of RDF and RDFS interpretations. 
My first amendment to your amended text would be to add "for some x"
between the braces {}, to make the definition completely explicit,
so that readers do not have to guess how this set is defined.
The last line would then become:
--- ...of V union {aaa :<aaa,x> in D for some x} ...
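
To make the intended set concrete: {aaa :<aaa,x> in D for some x} is
just the domain of the datatype map.  A minimal sketch, assuming D is
represented as a Python dict from URI references to datatypes (the
names extended_vocabulary, D_example and V_example are illustrative
only, not from the document):

```python
def extended_vocabulary(V, D):
    """Return V union {aaa : <aaa, x> in D for some x}.

    V is a set of names; D is a dict standing in for the datatype map,
    so its keys are exactly the URI references aaa paired with some x.
    """
    return set(V) | {aaa for (aaa, _x) in D.items()}


# Hypothetical toy inputs; the datatype objects themselves are irrelevant here.
D_example = {"xsd:boolean": object(), "xsd:integer": object()}
V_example = {"ex:a", "ex:p"}
V_extended = extended_vocabulary(V_example, D_example)
```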

>the following extra conditions for every pair <aaa, x> in D:
>
>General semantic conditions for datatypes.
><table>
><firstrow>
>
>I(aaa) = x and I satisfies the triples
>
>aaa rdf:type rdfs:Datatype .
>aaa rdfs:subClassOf rdfs:Literal .
>
></firstrow>
><secondrow>
>
>if "sss"^^ddd is in V and <aaa, I(ddd)> in D and sss is in the 

There is a point to be made about the addition
"and <aaa, I(ddd)> in D" that you made here.
Since D, as a datatype map, is a function, and since the pair 
<aaa, x> is already assumed to be in D, the new assumption
"<aaa, I(ddd)> in D" made here is equivalent to I(ddd)=x.

It seems to me that nothing essential is lost, and some 
simplification is obtained, by replacing the last line by:

---if "sss"^^aaa is in V and sss is in the

See the last part of this message for additional motivation for
this change.
This second amendment to your amended text leads to several 
naturally corresponding changes in the rest of the definition:
---replace  "sss"^^ddd  by  "sss"^^aaa
---replace  I(ddd)  by  x
and one further line is replaced in the same way as the line above;
see the inline note below. 

>lexical space of I(ddd) then:
>
>IL("sss"^^ddd) = L2V(x)(sss);
--
>IL("sss"^^ddd) is in LV;
>IEXT(I(rdf:type)) contains <IL("sss"^^ddd), ddd>
>
></secondrow>
><thirdrow>
>
>if "sss"^^ddd is in V and <aaa, I(ddd)> in D and sss is not in the 
replace by:
---if "sss"^^aaa is in V and sss is not in the 
>lexical space of I(ddd) then:
>
>IL("sss"^^ddd) is not in LV;
>IEXT(I(rdf:type)) does not contain <IL("sss"^^ddd), ddd>
Here the last 'ddd' seems to be a slip; it should have been
I(ddd), and it would be replaced by x.
>
></thirdrow>
></table>
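
The second and third conditions, in the amended form with "sss"^^aaa
and x, can be sketched as follows.  This is only an illustration under
stated assumptions: a datatype is modelled as a lexical-space test plus
an L2V function, the xsd:decimal lexical space is approximated by a
simplified regular expression, and the names Datatype, NonLiteral,
Denotation and denote_typed_literal are hypothetical.

```python
import re
from dataclasses import dataclass
from decimal import Decimal
from typing import Callable, NamedTuple


@dataclass(frozen=True)
class Datatype:
    in_lexical_space: Callable[[str], bool]  # is sss in the lexical space of x?
    l2v: Callable[[str], object]             # the mapping L2V(x)


class NonLiteral:
    """Stand-in for the 'arbitrary' non-literal value of an ill-typed literal."""

    def __init__(self, label: str) -> None:
        self.label = label


class Denotation(NamedTuple):
    value: object
    in_LV: bool            # is IL("sss"^^aaa) in LV?
    has_type_x: bool       # does IEXT(I(rdf:type)) contain <IL("sss"^^aaa), x>?


def denote_typed_literal(D: dict, sss: str, aaa: str) -> Denotation:
    """Apply the second/third condition to "sss"^^aaa, for <aaa, x> in D."""
    x = D[aaa]  # D is a function, so aaa determines x uniquely
    if x.in_lexical_space(sss):
        # Second condition: the value is L2V(x)(sss), it is in LV, and it has type x.
        return Denotation(x.l2v(sss), True, True)
    # Third condition: the value is not in LV and does not have type x.
    return Denotation(NonLiteral(f'"{sss}"^^{aaa}'), False, False)


# Simplified approximation of xsd:decimal (assumption, not the full XSD grammar).
decimal_dt = Datatype(
    in_lexical_space=lambda s: re.fullmatch(r"[+-]?\d+(\.\d+)?", s) is not None,
    l2v=Decimal,
)
D_toy = {"xsd:decimal": decimal_dt}

well_typed = denote_typed_literal(D_toy, "15", "xsd:decimal")
ill_typed = denote_typed_literal(D_toy, "arthur", "xsd:decimal")
```

Because the lookup uses aaa directly, no separate test on I(ddd) is
needed, which is the simplification the second amendment aims at.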
>
>The first condition ensures that I interprets the URI reference 
>according to the datatype map provided. Note that this does not 
>prevent other URI references from also denoting the same datatype.
>
>The second condition ensures that typed literals in the vocabulary 
>respect the datatype lexical-to-value mapping. For example, if I is 
>an XSD-interpretation then I("15"^^xsd:decimal) must be the number 
>fifteen. The third condition requires that an ill-typed literal, 
>where the literal string is not in the lexical space of the datatype, 
>not denote any literal value. Intuitively, such a name does not 
>denote any value, but in order to avoid the semantic complexities 
>which arise from empty names, the semantics requires such a typed 
>literal to denote an 'arbitrary' non-literal value. Thus for example, 
>if I is an XSD-interpretation, then all that can be concluded about 
>I("arthur"^^xsd:decimal) is that it is not in LV, i.e. not in 
>ICEXT(I(rdfs:Literal)). An ill-typed literal does not in itself 
>constitute an inconsistency, but a graph which entails that an 
>ill-typed literal has rdf:type rdfs:Literal, or that an ill-typed XML 
>literal has rdf:type rdf:XMLLiteral, would be inconsistent.
>
>Note that the second and third conditions apply only to datatypes in 
>the range of D. Typed literals whose type is not in the datatype map 
>of the interpretation are treated as before, i.e. as denoting some 
>unknown thing. These conditions do not require that the URI 
>reference in the typed literal be the same as the associated URI 
>reference of the datatype; this allows semantic extensions which can 
>express identity conditions on URI references to draw appropriate 
>conclusions.
>
>The first condition also ensures that the class rdfs:Datatype 
>contains the datatypes used in any satisfying D-interpretation. 
>Notice that this is a necessary, but not a sufficient, condition; it 
>allows the class I(rdfs:Datatype) to contain other datatypes.
>
>------
>
>If this is considered acceptable to everyone, I propose to make this 
>change.  I will in any case undertake to produce a corrected version 
>of the text, with anchors etc. in place.
>
>Pat
>
>PS to Herman: I am not so optimistic as you are about proving a 
>version of the entailment lemma for D-entailment. The issue here has 
>always been that datatypes are inherently idiosyncratic. For example, 
>xsd:boolean has only 2 items in its value space, so for example the 
>following is a valid XSD-entailment:
>
>a p "true"^^xsd:boolean .
>a p "false"^^xsd:boolean .
>b type xsd:boolean .
>|-
>a p b .

This is an entailment when it is assumed that ICEXT(I(boolean))
equals the value space of I(boolean).  Since this assumption is not
made in the above definition of D-interpretations, this XSD-entailment
does not seem to be a D-entailment with the above definition when 
boolean is in the datatype map.  See further below.
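
A small countermodel may make this concrete.  The sketch below is a
toy, finite fragment of an interpretation, not a full
rdfs-interpretation; the representation (the dicts denotes, iext and
icext, and the helper holds) is an assumption made for illustration.
The class extension of xsd:boolean contains one extra literal value,
which the definition above does not forbid as long as it stays inside
LV.

```python
VALUE_SPACE = {True, False}  # value space of xsd:boolean


def holds(triple, denotes, iext, icext):
    """Truth of one ground triple in the toy interpretation."""
    s, p, o = triple
    if p == "rdf:type":
        return denotes[s] in icext[o]
    return (denotes[s], denotes[o]) in iext[p]


premises = [
    ("a", "p", '"true"^^xsd:boolean'),
    ("a", "p", '"false"^^xsd:boolean'),
    ("b", "rdf:type", "xsd:boolean"),
]
conclusion = ("a", "p", "b")

# Countermodel: ICEXT of xsd:boolean is a proper superset of the value space.
denotes = {
    "a": "A",
    "b": "extra",  # a plain string, hence a value in LV
    '"true"^^xsd:boolean': True,
    '"false"^^xsd:boolean': False,
}
iext = {"p": {("A", True), ("A", False)}}
icext = {"xsd:boolean": VALUE_SPACE | {"extra"}}

premises_hold = all(holds(t, denotes, iext, icext) for t in premises)
conclusion_holds = holds(conclusion, denotes, iext, icext)

# With the stronger assumption ICEXT = value space, b must denote True or
# False, and the conclusion then holds in either case.
strict_icext = {"xsd:boolean": VALUE_SPACE}
strict_results = []
for b_value in VALUE_SPACE:
    strict_denotes = dict(denotes, b=b_value)
    ok = all(holds(t, strict_denotes, iext, strict_icext) for t in premises)
    strict_results.append(ok and holds(conclusion, strict_denotes, iext, strict_icext))
```

Here the premises hold and the conclusion fails, so the inference is
not valid unless the class extension is required to equal the value
space.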

>
>but I despair of writing a general set of rules which would be 
>sensitive to all possible value-space cardinality conditions. 
>Similarly, there are many valid XSD entailments arising from ordering 
>constraints on particular value spaces; and of course who knows what 
>entailments might arise from yet-to-be-defined datatypes? The real 
>issue here is that the L2V(d)(x) constraint is really arbitrarily 
>powerful, even to the point of going beyond first-order (ie R.E.) 
>expressivity; and it is up to the particular datatype how much of 
>that power it chooses to wield: so I do not think that we can 
>possibly prove a general completeness lemma for datatype entailment.

I agree that a general completeness lemma that incorporates all the
intricacies of existing datatypes does not seem possible.  However,
D-entailment in the above form is a rather weak semantics for datatypes,
which captures a common denominator, and which does not capture
all the idiosyncrasies of existing datatypes.
As I remarked in [1], it seems that D-entailment in this form
can be characterized with a generalization of the RDFS entailment
lemma.  See the last part of [1] for more details about this
generalized completeness lemma.
This works because the above conditions on D-interpretations directly
mirror the conditions on XMLLiteral in the definition of RDFS
interpretations.  Entailment rule rdf2 is replaced (augmented)
with a similar rule for each datatype in D.
The assumption on D is that the value spaces are 
disjoint and that the lexical-to-value mappings are injective.
For example, the generalized lemma would work with a datatype map
with the four datatypes Boolean, Integer, String, XMLLiteral.
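
As a rough illustration of what "a similar rule for each datatype in D"
could look like, here is a sketch of rule rdf2 generalized from
XMLLiteral to any datatype in D.  The exact formulation is in [1]; the
code below is an assumption-laden approximation, and the names
TypedLiteral, rdf2_generalized, lexical_spaces and allocated are
hypothetical.  The dict allocated plays the role of the blank nodes
that rule lg allocates to literals.

```python
from typing import NamedTuple


class TypedLiteral(NamedTuple):
    lexical: str    # the string sss
    datatype: str   # the URI reference aaa


def rdf2_generalized(graph, lexical_spaces, allocated):
    """For each well-typed "sss"^^aaa in object position with aaa in D,
    derive  _:n rdf:type aaa , where _:n is the blank node allocated to
    that literal (as rule lg would do).

    graph: set of (subject, predicate, object) triples.
    lexical_spaces: dict from datatype URI to a lexical-space test.
    allocated: dict from literal to blank node, updated in place.
    """
    derived = set()
    for (_s, _p, o) in graph:
        if not isinstance(o, TypedLiteral):
            continue
        test = lexical_spaces.get(o.datatype)
        if test is None or not test(o.lexical):
            continue  # datatype not in D, or literal ill-typed: no conclusion
        bnode = allocated.setdefault(o, f"_:n{len(allocated)}")
        derived.add((bnode, "rdf:type", o.datatype))
    return derived


toy_lexical_spaces = {"xsd:boolean": lambda s: s in {"true", "false", "1", "0"}}
toy_graph = {
    ("ex:a", "ex:p", TypedLiteral("true", "xsd:boolean")),   # well-typed
    ("ex:a", "ex:q", TypedLiteral("maybe", "xsd:boolean")),  # ill-typed
    ("ex:a", "ex:r", TypedLiteral("1", "ex:unknown")),       # datatype not in D
}
toy_allocated = {}
toy_derived = rdf2_generalized(toy_graph, toy_lexical_spaces, toy_allocated)
```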

Note: my second amendment above to your amended text seems to be
necessary for this "RDFS-D entailment lemma" to hold.
Instead of the function 'xml' in the proof of the RDFS entailment
lemma, the proof of the generalized lemma works with a function
that can be called 'val-D', which gives the values for all well-typed
literals for datatypes in D.  With this replacement, an 'RDFS-D
Herbrand interpretation' and a surrogate function sur etc.
can be constructed along the same lines as in the current proof.

Even though it may be too late to include such a generalized
completeness lemma in the RDF Semantics document, it seems
valuable to arrange the definition of D-interpretations
so that the lemma holds.  Like the current conditions for XMLLiteral,
this semantics is mainly restricted to well/ill typed literals,
and it can be used to extend the current XMLLiteral-related reasoning 
to many other datatypes.
As was remarked earlier, semantic extensions can always
make stronger assumptions on datatypes to capture a richer
reasoning strategy, for example to capture more idiosyncrasies of
datatypes.  This seems to be in line with the general philosophy
of the RDF Semantics document.

>
>Pat
>

Herman

[1] http://lists.w3.org/Archives/Public/www-rdf-comments/2003OctDec/0233.html
Received on Thursday, 8 January 2004 07:18:07 UTC