Re: Input sought on datatyping tradeoff from Geoff Chappell on 2002-07-16 (www-rdf-logic@w3.org from July 2002)

From: Geoff Chappell <geoff@sover.net>
Date: Mon, 15 Jul 2002 20:18:25 -0400
To: <www-rdf-logic@w3.org>, "Brian McBride" <bwm@hplb.hpl.hp.com>
Message-ID: <013201c22c5e$4db8f470$825ec6d1@goat1>
----- Original Message -----
From: "Brian McBride" <bwm@hplb.hpl.hp.com>
To: "Geoff Chappell" <geoff@sover.net>; <www-rdf-logic@w3.org>
Sent: Monday, July 15, 2002 4:46 PM
Subject: Re: Input sought on datatyping tradeoff


>
> At 09:20 15/07/2002 -0400, Geoff Chappell wrote:
>
> >I have a question about datatyping used with untidy literals. Given test
> >case D:
> >
> >Test D:
> >
> >    <Jenny>      <ageInYears> "10" .
> >    <ageInYears> rdfs:range xsd:decimal .
> >
> >    <John>  <ageInYears>   _:a .
> >    _:a     xsdr:decimal   "10" .
> >
> >
> >My understanding is that in a world of untidy literals, literals are
> >(potentially) ambiguous names. Not only can many literals refer to one
> >thing, but the same literal can refer to many things (as opposed to uris
> >which are supposedly unambiguous names - i.e. a uri can only identify one
> >thing though many uris could refer to the same thing). With this
> >understanding a datatype identifies by uri a black-box that performs name
> >resolution - i.e. the datatype is able to functionally identify a
> >thing/object/value based solely upon its
> >(potentially-ambiguous-wrt-the-world-at-large-but-not-wrt-the-datatype)
> >name. A datatype has a set of names that it is able to resolve and a
> >corresponding set of things/values.  The members of the datatype class (when
> >the datatype is used as a class) are simply the things/values it is able to
> >resolve names to.
>
> That is a pretty good summary.  I think you have that right, though there
> was one place  where I wanted to wordsmith a bit, and there are others
> where the logicians might.  But those would be too picky for our purpose
> here, I think.
>
>
> >But what specifically is the meaning of the datatype when used as a
> >property?
>
> Associated with the datatype is a property extension which consists of a
> set of pairs, e.g.
>
>    { (1, "1"), (2, "2"), ... }
>
> This is the way the current model theory works, so there is nothing special
> in this aspect about datatype properties.
>
> >Clearly in test D above the first "10" is meant to denote the
> >decimal value 10, as is node _:a. But what does the second "10" (the object
> >of xsdr:decimal) denote?
>
> Ignoring complexities referred to by Peter for now, the second "10" denotes
> a string.
> We know it's a string because we know xsdr:decimal is a datatype property
> and all datatype properties take strings as their values.
>
> I may be glossing over some technical details here, but this is the basic
> idea.
>
> >  One possibility is that it is also the decimal 10.
> >Then a datatype used as a property states the equality under the datatype of
> >the subject and object (which would be enough in this instance for a
> >datatype-aware processor to figure out that _:a denotes the decimal 10).
> >Another possibility might be that it is referring to the name itself (which
> >I guess would make use of a datatype property some sort of a quoting
> >mechanism?). But if that is the case, how is the rdf processor to know that?
>
> Somewhere we have an assertion which I didn't show:
>
>    xsd:decimal rdf:type rdfd:datatype .
>
> >what range constraint on the datatype property would indicate that? just
> >rdfs:Literal? does rdfs:Literal become a "built-in" datatype that maps
> >string values to themselves? (I often confuse myself here because in the
> >whole discussion of tidiness vs untidiness I understand the term "literal"
> >as used to talk about the name/label of the graph node while "rdfs:Literal"
> >obviously is referring to the type of the value - little difference I guess
> >in the world of tidy literals).
>
> Just so.
>
> Have I done enough to convince you this is possible, or do I need to call
> in the cavalry?

Thanks, you've answered most of my questions. I do have a remaining
question - let me try to restate it.

In the untidy world, unlike the tidy world, a literal does not have a fixed
meaning. Since literals cannot be subjects of statements, the only way to
constrain the meaning of a literal is to attach range constraints to the
property to which the literal is attached (maybe not entirely true? I guess
in some non-XML/RDF syntax you might also be able to terminate more than one
arc on a literal node). A range constraint naming an rdfs class is not
sufficient to license an rdf processor to "know" the value to which the
literal refers; it only constrains that value to be one of the members of
the class. A datatype constraint does provide enough information (to a
processor intimate with that datatype) to actually fix the value, since the
datatype provides a mapping between literals and the members of the datatype
class.
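
To make that distinction concrete, here is a rough sketch in Python of how I
picture it. Everything in it is hypothetical (the Datatype class, the RANGES
table, the denotation function); it is not any real RDF API, just an
illustration of a datatype as a black box that resolves names to values.

```python
from decimal import Decimal, InvalidOperation


class Datatype:
    """Hypothetical black box that resolves a lexical form to a value."""

    def __init__(self, uri, parse):
        self.uri = uri
        self._parse = parse

    def resolve(self, lexical):
        # Return the value the literal names under this datatype, or None
        # if the lexical form is not one the datatype can resolve.
        try:
            return self._parse(lexical)
        except (InvalidOperation, ValueError):
            return None


# Assumption: xsd:decimal is modelled with Python's Decimal parser.
XSD_DECIMAL = Datatype("xsd:decimal", Decimal)

# Assumed range declarations: property name -> datatype.
# A property whose range is only a plain rdfs class has no entry here.
RANGES = {"ageInYears": XSD_DECIMAL}


def denotation(prop, literal):
    """Value fixed for an untidy literal in object position of prop, if any."""
    datatype = RANGES.get(prop)
    if datatype is None:
        return None  # no datatype range: the value stays undetermined
    return datatype.resolve(literal)


jenny_age = denotation("ageInYears", "10")    # fixed by the datatype range
unknown = denotation("shoeSize", "10")        # nothing fixes this literal
```

With a datatype range the processor can go from the name "10" to a single
value; without one, the same name stays ambiguous.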

I guess my remaining question boils down to this: what licenses an rdf
processor to conclude that the literal on the object side of a datatype
property is referring to itself, while all other literals are (potentially)
referring to entities other than themselves? Is that just built in to the
definition of a datatype? Is it because of an implicit or explicit datatype
range? Or is it left to "extra-rdf" inference to make that conclusion?

Geoff

>
> Brian
Received on Monday, 15 July 2002 19:48:54 UTC