- From: Graham Klyne <GK@NineByNine.org>
- Date: Thu, 31 Jan 2002 11:06:48 +0000
- To: Jeremy Carroll <jjc@hplb.hpl.hp.com>, Sergey Melnik <melnik@db.stanford.edu>, Dan Connolly <connolly@w3.org>, Pat Hayes <phayes@ai.uwf.edu>, Brian McBride <bwm@hplb.hpl.hp.com>
- Cc: www-archive <www-archive@w3.org>
I think I may have a datyping approach that satisfies all objections...
I've been thinking about S and TDL, and think there may be a way of dealing
with literals that (a) doesn't depend on tidiness in the RDF graph (or
meets TDL goals while allowing tidiness on literals), and (b) satisfies
Dan's concerns about entailment, etc. I'm sending this off-list simply to
keep the list-noise level down if the idea turns out to be worse than
half-baked. Feel free to forward this (or the WWW-archive reference) to
any interested party.
Basically, it's my attempt to fit a model theory to TDL, but it is not TDL
because the denotation of a literal is a simple value, not a pair...
1. Sketch of model theory
-------------------------
Starting from RDFS-interpretation per Pat's 28-Jan model theory [1].
1. We have an interpretation consisting of IR, IEXT, IS, ICEXT. I propose
to drop the global mapping LX from literals to LV.
2. A datatype interpretation (DT-interpretation) is specified with respect
to a specified set of datatypes, DT. DT is a subset of IC, having an
additional, externally fixed mapping DTEXT from literals to members of
ICEXT(DT):
DEXT = { <literal,value> | value in ICEXT(DT) }
3. [Big change from current MT] An interpretation does not assign a
specific denotation to literals. Instead, literals are treated like blank
nodes with some additional constraints.
4. The interpretation of a statement of the form:
aaa bbb "foo" .
is defined thus:
if there exists v such that <"foo",v> in DTEXT(d) for some d in DT, AND
<i(aaa),v> in IEXT(I(bbb)) THEN True, otherwise False.
This gives us a way of evaluating the truth of a graph that contains
literals, without actually saying what the literals denote. I think this
resolves the various entailment issues, but still leaves some work to do
for queries. Let's try it out on some idioms and examples:
2. Apply model theory to idioms
-------------------------------
For the purposes of these examples, dtdate is the member of DT
corresponding to date values, and 15-Jun-2001 is a date value corresponding
to Jenny's birthDate, so we have <15-Jul-2001,"2001-07-15"> in
DTEXT(dtdate). I shall assume below these values all exist in IR for a
satisfying interpretation.
2.1 Idiom A (per [2]):
person:Jenny exA:birthDate _:a .
_:a exA:date "2001-07-15" .
This is satisfied if the node _:a denotes 15-Jul-2001,
and IEXT(I(exA:birthDate)) contains <I(person:Jenny),15-Jul-2001>,
and IEXT(I(exA:date)) contains <15-Jul-2001,"2001-07-15">.
This is consistent with I(exA:date) == dtdate (described above),
where IEXT(dtdate) == DTEXT(dtdate).
2.2 Idiom B (per [2])
person:Jenny exB:birthDate "2001-07-15" .
exB:birthDate rdfs:range exB:date .
This is satisfied if the node "2001-07-15" denotes 15-Jul-2001,
and IEXT(I(exB:birthDate)) contains <I(person:Jenny),15-Jul-2001>,
and ICEXT(I(exB:date)) contains 15-Jul-2001.
This is consistent with I(exB:date) == dtdate.
2.3 Idiom D (per [2]) (also P per [3])
person:Jenny exD:birthDate _:d .
_:d rdf:value "2001-07-15" .
_:d rdf:type exD:Date .
This is satisfied if the node _:d denotes 15-Jul-2001,
and IEXT(I(exD:birthDate)) contains <I(person:Jenny),15-Jul-2001>,
and the node "2001-07-15" denotes 15-Jul-2001,
and IEXT(I(rdf:value)) contains <15-Jul-2001,15-Jul-2001>
and ICEXT(I(exD:date)) contains 15-Jul-2001.
This is consistent with I(exD:date) == dtdate,
and IEXT(I(rdf:value)) = <v,v> forall v in IR ?
2.4 Idiom E (per [2])
person:Jenny exE:birthDate _:e .
_:d rdf:type exE:Date .
_:d exE:ISO8601 "2001-07-15" .
This is satisfied if the node _:e denotes 15-Jul-2001,
and IEXT(I(exE:birthDate)) contains <I(person:Jenny),15-Jul-2001>,
and ICEXT(I(exE:date)) contains 15-Jul-2001,
and the node "2001-07-15" denotes 15-Jul-2001,
and IEXT(I(exE:ISO8601)) contains <15-Jul-2001,15-Jul-2001>
This is consistent with I(exE:date) == dtdate.
On the surface, this is no different from idiom D, but a range constraint
on the definition of exE:ISO8601 could be used to restrict the satisfying
literals. Suppose the range is dtISO8601, a member of DT. The value space
of dtISO8601 would be the same as that of dtdate, but the mapping may be
more restricted; DTEXT(dtIS8601) a subset of DTEXT(dtdate); e.g.
DTEXT(dtdate) might contain
{ <15-Jul-2001,"15/07/2001">
<15-Jul-2001,"07/15/2001">
<15-Jul-2001,"2001-07-15">
<15-Jul-2001,"20010715">
: }
but DTEXT(dtISO8601) might contain just
{ <15-Jul-2001,"2001-07-15">
<15-Jul-2001,"20010715">
: }
(By suggesting US and non-US date literals, I've just slipped in the idea
that DTEXT may not be functional from literals to values -- nothing I've
done so far depends on that, I think.)
2.5 Conclusion from fitting idioms
All of the above idioms are consistent with a single interpretation of
ex?:birthDate and ex?:Date (the main argument against proposal S):
IEXT(I(ex:birthDate)) contains <I(person:Jenny),15-Jul-2001>, i.e. relates
Jenny to the date value in ICEXT(dtdate) that is her birth date, and
I(ex:date) == dtdate
3. Entailments
--------------
I think it's intuitively clear from section 1 this that any graph entails
itself, without depending on literals being tidy. There's no way to say
that a literal means one thing in one instance of a graph, and something
different in another instance.
Roughly, a literal means any "conforming" value in any graph in which it
appears, where "conforming" is defined in terms of the set DT with respect
to which an interpretation is defined, which does not change between
instances of a graph under the same interpretation.
[I'm not sure I know how to prove this formally.]
4. Other issues
---------------
Values without literal representations. One of my (lesser) objections to
DTL was that it didn't account well for values with no literal
representation. By having literals denote values, not pairs, I think that
objection disappears.
This whole approach leaves open the matter of query semantics, other than
allowing that (adapted from [4]):
_:f <dc:Title> "10" .
<mary> <age> "10" .
entails:
_:x <dc:Title> _:y .
_:z <age> _:y .
in the absence of further type constraints, and assuming that there exists
a member of DT which relates "10" to some value. What is less clear is
what answers one might such a query to actually return, because there is no
defined denotation for the literals. One (reasonable) answer would be to
simply return the literal (string) and say nothing about its denotation: I
think that would correspond to the query semantics that Dan is assuming. I
think other answers are possible and reasonable (and out of scope for this
group).
5. References
-------------
[1] Pat Hayes, RDF Model Theory, Jan-2002
http://www.coginst.uwf.edu/users/phayes/w3-rdf-mt-current-draft.html
[2] Graham Klyne, RDF Datatyping Desiderata, 25-Jan-2002
http://lists.w3.org/Archives/Public/www-archive/2002Jan/0139.html
[3] Sergey Melnik, RDF Datatyping, 18-Jan-2002
http://www-db.stanford.edu/~melnik/rdf/datatyping-20020118/
[4] Dan Connolly, note on datatyping and query-as-entailment, 30-Jan-2002
http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Jan/0440.html
--------------------------
__
/\ \ Graham Klyne
/ \ \ (GK@ACM.ORG)
/ /\ \ \
/ / /\ \ \
/ / /__\_\ \
/ / /________\
\/___________/
Received on Thursday, 31 January 2002 06:10:00 UTC