- From: Graham Klyne <GK@NineByNine.org>
- Date: Thu, 31 Jan 2002 21:14:43 +0000
- To: Jeremy Carroll <jjc@hplb.hpl.hp.com>, Sergey Melnik <melnik@db.stanford.edu>, Dan Connolly <connolly@w3.org>, Pat Hayes <phayes@ai.uwf.edu>, Brian McBride <bwm@hplb.hpl.hp.com>
- Cc: www-archive <www-archive@w3.org>
This is a follow-up to my previous response to Pat, in which I run the (slightly tidied) revised MT sketch through the various idioms.

(Pat: I agree that the previous version was too weak, but I don't see that it broke self-entailment.)


1. Sketch of model theory (revised)
-------------------------

Starting from RDFS-interpretation per Pat's 30-Jan model theory [1].

1. We have an interpretation consisting of IR, IEXT, IS, ICEXT. I propose to drop the global mapping LX from literals to LV.

2. A datatype interpretation (DT-interpretation) is defined with respect to a specified set of datatypes, DT. DT is a subset of IC, having an additional, externally fixed mapping DTEXT from literals to members of ICEXT(DT):

     DTEXT = { <literal,value> | value in ICEXT(DT) }

2a. Define a relation ILOBJ on IPxDT. Informally, this indicates DT values that can be used to "interpret" literal values used in the object position of the corresponding property.

3. An interpretation does not assign a specific denotation to literals. Instead, literals are treated like blank nodes with some additional constraints.

4. The interpretation of a statement of the form

     aaa bbb "foo" .

   is defined thus:

     If there exists v in IR and d in DT such that
        <"foo",v> in DTEXT(d) AND
        ILOBJ(I(bbb),d), AND
        <I(aaa),v> in IEXT(I(bbb))
     THEN True, otherwise False.

This basic scaffolding means that an interpretation can arbitrarily restrict the members of DT that can be used to interpret a literal object of any property.

Also, extend the RDFS-interpretation rules for DT-interpretation so that:

     If ILOBJ(x,y) then <x,y> is in IEXT(I(rdfs:range))

Thus, to be a valid DT-interpretation, the member of DT used to interpret object literals for property x must be in the range of x.

This gives us a way of evaluating the truth of a graph that contains literals, without actually saying what the literals denote.


2. Apply model theory to idioms
-------------------------------

For the purposes of these examples, I shall assume the existence of four members of DT (dtdate, dtusdate, dtukdate and dtstring), the first three of which correspond to date values:

   DTEXT(dtdate) =
     { :
       <14-Jul-2001,"2001-07-14">,
       <15-Jul-2001,"2001-07-15">,
       <16-Jul-2001,"2001-07-16">,
       : }

   DTEXT(dtusdate) =
     { :
       <07-Jun-2001,"06/07/2001">,
       :
       <06-Jul-2001,"07/06/2001">,
       :
       <14-Jul-2001,"07/14/2001">,
       <15-Jul-2001,"07/15/2001">,
       <16-Jul-2001,"07/16/2001">,
       : }

   DTEXT(dtukdate) =
     { :
       <07-Jun-2001,"07/06/2001">,
       :
       <06-Jul-2001,"06/07/2001">,
       :
       <14-Jul-2001,"14/07/2001">,
       <15-Jul-2001,"15/07/2001">,
       <16-Jul-2001,"16/07/2001">,
       : }

   DTEXT(dtstring) =
     { :
       <"foo","foo">
       :
       <"14/07/2001","14/07/2001">,
       <"15/07/2001","15/07/2001">,
       <"16/07/2001","16/07/2001">,
       : }


2.1 Idiom A (per [2]):

   person:Jenny ex:birthDate _:a .
   _:a ex:date "2001-07-15" .

This is satisfied if
   the node _:a denotes 15-Jul-2001, and
   IEXT(I(ex:birthDate)) contains <I(person:Jenny),15-Jul-2001>, and
   IEXT(I(ex:date)) contains <15-Jul-2001,"2001-07-15">, and
   ILOBJ(I(ex:date),dtstring).

This is consistent with I(ex:date) == dtdate (described above), where IEXT(dtdate) == DTEXT(dtdate).

However, other interpretations could be contrived that also satisfy this. An interpretation in which the string "2001-07-15" is an arithmetic expression and _:a denotes the number 1979 would also satisfy this graph as given. (Does this really leave us any worse off than we were with untyped literals?)

If the range of ex:date is specified to be dtstring [**], the scope for creative interpretation is somewhat reduced: in conjunction with the DT-interpretation requirement on ILOBJ this would mean that only dtstring can be used to interpret object literals of ex:date.

[**] suggests that a DT-interpretation also needs to indicate a reserved vocabulary for the members of DT?


2.2 Idiom B (per [2])

   person:Jenny ex:birthDate "2001-07-15" .
   ex:birthDate rdfs:range ex:date .

This is satisfied if
   the node "2001-07-15" denotes 15-Jul-2001, and
   IEXT(I(ex:birthDate)) contains <I(person:Jenny),15-Jul-2001>, and
   ICEXT(I(ex:date)) contains 15-Jul-2001, and
   ILOBJ(I(ex:birthDate),I(ex:date))

This is consistent with I(ex:date) == dtdate.

The range specification on ex:birthDate prevents any other member of DT being used to interpret the literal unless it also maps the string "2001-07-15" to a value related to I(person:Jenny) by I(ex:birthDate).


2.3 Idiom D (per [2]) (also P per [3])

   person:Jenny ex:birthDate _:d .
   _:d rdf:value "2001-07-15" .
   _:d rdf:type ex:Date .

This is satisfied if
   the node _:d denotes 15-Jul-2001, and
   IEXT(I(ex:birthDate)) contains <I(person:Jenny),15-Jul-2001>, and
   the node "2001-07-15" denotes 15-Jul-2001, and
   IEXT(I(rdf:value)) contains <15-Jul-2001,15-Jul-2001> and
   ICEXT(I(ex:date)) contains 15-Jul-2001.

This is consistent with I(ex:date) == dtdate, and

   IEXT(I(rdf:value)) = <v,v> forall v in IR ?

However, in this case I can see no way to disambiguate, say:

   _:d rdf:value "06/07/2001" .

and

   _:d rdf:value "07/06/2001" .

because (assuming rdf:value is a generic property) there is no obvious way to make the graph restrict the datatype used to interpret the literals.


2.4 Idiom E (per [2])

   person:Jenny ex:birthDate _:e .
   _:e rdf:type ex:Date .
   _:e ex:ISO8601 "2001-07-15" .

This is satisfied if
   the node _:e denotes 15-Jul-2001, and
   IEXT(I(ex:birthDate)) contains <I(person:Jenny),15-Jul-2001>, and
   ICEXT(I(ex:date)) contains 15-Jul-2001, and
   the node "2001-07-15" denotes 15-Jul-2001, and
   IEXT(I(ex:ISO8601)) contains <15-Jul-2001,15-Jul-2001>, and
   ILOBJ(I(ex:ISO8601),dtdate)

This is consistent with I(ex:date) == dtdate.

On the surface, this is no different from idiom D, but a range constraint on the definition of ex:ISO8601 could be used to restrict the satisfying literals. Suppose the range is dtISO8601, a member of DT. The value space of dtISO8601 would be the same as that of dtdate, but the mapping may be more restricted; DTEXT(dtISO8601) a subset of DTEXT(dtdate); e.g. DTEXT(dtdate) might contain

     { <15-Jul-2001,"15/07/2001">
       <15-Jul-2001,"07/15/2001">
       <15-Jul-2001,"2001-07-15">
       <15-Jul-2001,"20010715">
       : }

but DTEXT(dtISO8601) might contain just

     { <15-Jul-2001,"2001-07-15">
       <15-Jul-2001,"20010715">
       : }


2.5 Conclusion from fitting idioms

All of the above idioms are consistent with a single interpretation of ex:birthDate and ex:Date (the main argument against proposal S):

   IEXT(I(ex:birthDate)) contains <I(person:Jenny),15-Jul-2001>,
   i.e. relates Jenny to the date value in ICEXT(dtdate) that is her birth date, and
   I(ex:date) == dtdate

In response to Pat's comments, I've tried to think about the extent to which nonsensical interpretations can be made to satisfy the graphs -- it seems to me that being able to use an rdfs:range to restrict the applicable literal mappings leaves us at least as well off as we were under any of the other proposals.


3. Entailments
--------------

I think it's intuitively clear from section 1 that any graph entails itself, without depending on literals being tidy. There's no way to say that a literal means one thing in one instance of a graph, and something different in another instance. Roughly, a literal means any "conforming" value in any graph in which it appears, where "conforming" is defined in terms of the set DT with respect to which an interpretation is defined, which does not change between instances of a graph under the same interpretation.

[I'm not sure I know how to prove this formally.]


4. Other issues
---------------

Values without literal representations. One of my (lesser) objections to DTL was that it didn't account well for values with no literal representation. By having literals denote values, not pairs, I think that objection disappears.
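
As an illustration only (not part of the formal sketch), the truth condition in section 1, item 4, and the "any conforming value" reading in section 3, might be rendered in Python roughly as below. Everything in it is an assumption made for the example: the function name, the use of plain strings to stand in for members of IR, the hypothetical property ex:usDate, and the toy contents of DTEXT, ILOBJ and IEXT. Pairs in DTEXT follow the <literal,value> order of the definition in section 1.

```python
# Illustrative sketch only: all names and data below are hypothetical.
# Strings stand in for members of IR; a property name stands in for I(property).

# DTEXT[d]: set of (literal, value) pairs for datatype d (section 1 ordering).
DTEXT = {
    "dtdate": {("2001-07-15", "15-Jul-2001")},
    "dtusdate": {("06/07/2001", "07-Jun-2001"), ("07/06/2001", "06-Jul-2001")},
    "dtukdate": {("06/07/2001", "06-Jul-2001"), ("07/06/2001", "07-Jun-2001")},
}

# ILOBJ: (property, datatype) pairs; which datatypes may interpret
# literal objects of which property.
ILOBJ = {("ex:birthDate", "dtdate"), ("ex:usDate", "dtusdate")}

# IEXT[p]: set of (subject, value) pairs in the extension of property p.
IEXT = {
    "ex:birthDate": {("person:Jenny", "15-Jul-2001")},
    "ex:usDate": {("person:Jenny", "07-Jun-2001")},
}


def literal_triple_is_true(subject, prop, literal):
    """Return True if some datatype d and value v satisfy item 4:
    <literal,v> in DTEXT(d), ILOBJ(prop,d), and <subject,v> in IEXT(prop)."""
    for datatype, pairs in DTEXT.items():
        if (prop, datatype) not in ILOBJ:
            continue  # this datatype may not interpret objects of prop
        for lit, value in pairs:
            if lit == literal and (subject, value) in IEXT.get(prop, set()):
                return True
    return False
```

With this toy data, literal_triple_is_true("person:Jenny", "ex:usDate", "06/07/2001") holds, while the same call with "07/06/2001" does not: dtukdate would map that string to 07-Jun-2001, but ILOBJ does not allow dtukdate for ex:usDate. The literal itself is never assigned a single denotation; the check only asks whether some permitted value exists.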

This whole approach leaves open the matter of query semantics, other than allowing that (adapted from [4]):

   _:f <dc:Title> "10" .
   <mary> <age> "10" .

entails:

   _:x <dc:Title> _:y .
   _:z <age> _:y .

in the absence of further type constraints, and assuming that there exists a member of DT which relates "10" to some value.

What is less clear is what answers one might expect such a query to actually return, because there is no defined denotation for the literals. One (reasonable) answer would be to simply return the literal (string) and say nothing about its denotation: I think that would correspond to the query semantics that Dan is assuming. I think other answers are possible and reasonable (and out of scope for this group).

Backward compatibility with "untyped" RDF. If the set DT always includes a type (say) dtstring (described above), where (say) DTEXT(I(rdfs:Literal)) == dtstring, I think this provides a basis for the kinds of string-based entailment that Dan expects. In the absence of any specific typing information, a literal can always be interpreted as itself.


5. References
-------------

[1] Pat Hayes, RDF Model Theory, Jan-2002
    http://www.coginst.uwf.edu/users/phayes/w3-rdf-mt-current-draft.html

[2] Graham Klyne, RDF Datatyping Desiderata, 25-Jan-2002
    http://lists.w3.org/Archives/Public/www-archive/2002Jan/0139.html

[3] Sergey Melnik, RDF Datatyping, 18-Jan-2002
    http://www-db.stanford.edu/~melnik/rdf/datatyping-20020118/

[4] Dan Connolly, note on datatyping and query-as-entailment, 30-Jan-2002
    http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Jan/0440.html

--------------------------
Graham Klyne
(GK@ACM.ORG)
Received on Thursday, 31 January 2002 17:26:05 UTC