Re: Input sought on datatyping tradeoff from Geoff Chappell on 2002-07-20 (www-rdf-logic@w3.org from July 2002)

From: Geoff Chappell <geoff@sover.net>
Date: Sat, 20 Jul 2002 09:36:48 -0400
To: "pat hayes" <phayes@ai.uwf.edu>
Cc: <www-rdf-logic@w3.org>
Message-ID: <01d901c22ff2$7ffe2b70$825ec6d1@goat1>
----- Original Message -----
From: "pat hayes" <phayes@ai.uwf.edu>
To: "Geoff Chappell" <geoff@sover.net>
Cc: <www-rdf-logic@w3.org>
Sent: Friday, July 19, 2002 5:43 PM
Subject: Re: Input sought on datatyping tradeoff


>
> >----- Original Message -----
> >From: "Brian McBride" <bwm@hplb.hpl.hp.com>
> >To: "Geoff Chappell" <geoff@sover.net>; <www-rdf-logic@w3.org>
> >Sent: Monday, July 15, 2002 4:46 PM
> >Subject: Re: Input sought on datatyping tradeoff
> >
> >
> >>
> >>  At 09:20 15/07/2002 -0400, Geoff Chappell wrote:
> >>
> >>  >I have a question about datatyping used with untidy literals. Given
> >>  >test case D:
> >>  >
> >>  >Test D:
> >>  >
> >>  >    <Jenny>      <ageInYears> "10" .
> >>  >    <ageInYears> rdfs:range xsd:decimal .
> >>  >
> >>  >    <John>  <ageInYears>   _:a .
> >>  >    _:a     xsdr:decimal   "10" .
> >>  >
> >>  >
> >>  >My understanding is that in a world of untidy literals, literals are
> >>  >(potentially) ambiguous names. Not only can many literals refer to
> >>  >one thing, but the same literal can refer to many things (as opposed
> >>  >to uris which are supposedly unambiguous names - i.e. a uri can only
> >>  >identify one thing though many uris could refer to the same thing).
> >>  >With this understanding a datatype identifies by uri a black-box that
> >>  >performs name resolution - i.e. the datatype is able to functionally
> >>  >identify a thing/object/value based solely upon its
> >>  >(potentially-ambiguous-wrt-the-world-at-large-but-not-wrt-the-datatype)
> >>  >name. A datatype has a set of names that it is able to resolve and a
> >>  >corresponding set of things/values.  The members of the datatype
> >>  >class (when the datatype is used as a class) are simply the
> >>  >things/values it is able to resolve names to.
> >>
> >>  That is a pretty good summary.  I think you have that right, though
> >>  there was one place where I wanted to wordsmith a bit, and there are
> >>  others where the logicians might.  But those would be too picky for
> >>  our purpose here, I think.
> >>
> >>
> >>  >But what specifically is the meaning of the datatype when used as a
> >>  >property?
> >>
> >>  Associated with the datatype is a property extension which consists
> >>  of a set of pairs, e.g.
> >>
> >>     { (1, "1"), (2, "2"), ... }
> >>
> >>  This is the way the current model theory works, so there is nothing
> >>  special in this aspect about datatype properties.
> >>
> >>  >Clearly in test D above the first "10" is meant to denote the
> >>  >decimal value 10, as is node _:a. But what does the second "10" (the
> >>  >object of xsdr:decimal) denote?
> >>
> >>  Ignoring complexities referred to by Peter for now, the second "10"
> >>  denotes a string.
> >>  We know it's a string because we know xsdr:decimal is a datatype
> >>  property and all datatype properties take strings as their values.
> >>
> >>  I may be glossing over some technical details here, but this is the
> >>  basic idea.
> >>
> >>  >  One possibility is that it is also the decimal 10.
> >>  >Then a datatype used as a property states the equality under the
> >>  >datatype of the subject and object (which would be enough in this
> >>  >instance for a datatype-aware processor to figure out that _:a
> >>  >denotes the decimal 10).
> >>  >Another possibility might be that it is referring to the name itself
> >>  >(which I guess would make use of a datatype property some sort of a
> >>  >quoting mechanism?). But if that is the case, how is the rdf
> >>  >processor to know that?
> >>
> >>  Somewhere we have an assertion which I didn't show:
> >>
> >>     xsd:decimal rdf:type rdfd:datatype .
> >>
> >>  >what range constraint on the datatype property would indicate that?
> >>  >just rdfs:Literal? does rdfs:Literal become a "built-in" datatype
> >>  >that maps string values to themselves? (I often confuse myself here
> >>  >because in the whole discussion of tidyness vs untidyness I
> >>  >understand the term "literal" as used to talk about the name/label of
> >>  >the graph node while "rdfs:Literal" obviously is referring to the
> >>  >type of the value - little difference I guess in the world of tidy
> >>  >literals).
> >>
> >>  Just so.
> >>
> >>  Have I done enough to convince you this is possible, or do I need to
> >>  call in the cavalry?
> >
> >Thanks, you've answered most of my questions. I do have a remaining
> >question - let me try to restate it.
>
> Cavalry arriving late:
>
> >In the untidy world, unlike the tidy world, a literal does not have a
> >fixed meaning. Since literals can not be subjects of statements, the
> >only way to constrain the meaning of a literal is to attach range
> >constraints to the property to which the literal is attached
>
> Let's say yes, though there might be some subtle tricks to do it in
> other ways.
>
> >(maybe not entirely true? I guess
> >in some non-XML/RDF syntax you might also be able to terminate more than
> >one arc on a literal node). A range constraint of an rdfs class is not
> >sufficient to license an rdf processor to "know" the value to which the
> >literal refers, only to constrain it to one of the members of the class.
> >A datatype constraint does provide enough information (to a processor
> >intimate with that datatype) to actually fix the value since the
> >datatype provides a mapping between literals and the members of the
> >datatype class.
> >
> >I guess my remaining question boils down to what licenses an rdf
> >processor to conclude that the literal on the object side of a datatype
> >property is referring to itself while all other literals are
> >(potentially) referring to entities other than themselves?
>
> In the untidy option, it never can make that determination. A 'lone'
> literal in the untidy world is like a bnode with a meaningless label
> attached to it: it just says that something exists which has this as
> its lexical rendering; but until you know more about the
> lexical-to-value rules (ie the datatype) that doesn't tell you
> anything.

Sorry to beat what is probably a dead horse at this point, but...

assuming untidy, if

    (1) <Jenny>      <ageInYears> "10" .
    (2) <ageInYears> rdfs:range xsd:decimal .

lets us infer

    (3) <John>  <ageInYears>   _:a .
    (4) _:a     xsd:decimal   "10" .

and given your response (as I understood it), "10" in line (4) is
meaningless without something like:

    (5) <xsd:decimal> rdfs:range xsd:string .
            or
    (5) <xsd:decimal> rdfs:range rdfs:Literal .
        (assuming Literal is redefined as a datatype that maps a
        literal/string to itself)

then can't we infer:

    (6) _:a  xsd:decimal   _:b .
    (7) _:b  xsd:string   "10" .

and then

    (8) _:b  xsd:string   _:c .
    (9) _:c  xsd:string   "10" .

etc, etc.?

Is (5) not needed for the model theory to do its magic? Or is there a
special terminating case (i.e. _:x  xsd:string   "nnn" doesn't imply any
additional triples)?
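
To make the regress concrete, here is a rough Python sketch of the rewrite
as I read it. Everything in it is an assumption for illustration only: the
function name expand, the tuple encoding of triples, the convention that a
literal is a string starting with a double quote, the max_rounds cut-off,
and the extra range statement on xsd:string that steps (8) and (9) seem to
need. It is not taken from the model theory or from any RDF toolkit.

```python
from itertools import count


def expand(triples, ranges, terminal=frozenset(), max_rounds=3):
    """Rewrite (s, p, "lit") into (s, p, _:bN) and (_:bN, D, "lit")
    whenever property p has datatype range D.

    ranges      maps a property to its rdfs:range datatype (hypothetical encoding)
    terminal    properties whose literal objects are never expanded
    max_rounds  hard stop, since the rewrite need not terminate on its own
    """
    fresh = (f"_:b{i}" for i in count())  # supply of new blank node labels
    known = list(triples)
    frontier = list(triples)
    for _ in range(max_rounds):
        new = []
        for s, p, o in frontier:
            is_literal = o.startswith('"')
            if not is_literal or p in terminal or p not in ranges:
                continue
            node = next(fresh)
            new.append((s, p, node))          # like (6): _:a xsd:decimal _:b
            new.append((node, ranges[p], o))  # like (7): _:b xsd:string "10"
        if not new:
            break  # nothing left to expand: the rewrite has terminated
        known.extend(new)
        frontier = new
    return known


# Triple (4), plus statement (5) and the assumed range on xsd:string itself.
start = [("_:a", "xsd:decimal", '"10"')]
with_5 = {"xsd:decimal": "xsd:string", "xsd:string": "xsd:string"}

unbounded = expand(start, with_5, max_rounds=3)
with_stop = expand(start, with_5, terminal={"xsd:string"}, max_rounds=3)
without_5 = expand(start, {}, max_rounds=3)

print(len(unbounded), len(with_stop), len(without_5))
```

With (5) in place, each round adds two more triples for as long as
max_rounds allows. Marking xsd:string as a terminating case stops the
rewrite after (6) and (7), and dropping (5) leaves (4) untouched.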

--geoff

>
> In the tidy option, literals *always* denote themselves, and any
> datatyping information doesn't change that: at best, it can be used to
> fix the interpretations of some bnode suitably 'linked' to the
> literal.
>
> There are several 'hybrid' options on the table that try to get the
> best of both worlds. Brian's questions were partly designed to elicit
> intuitions which might have supported one of these. In spite of
> having invented a few of them, I now think they cause more smoke than
> light.
>
> Pat Hayes
>
>
> --
> ---------------------------------------------------------------------
> IHMC (850)434 8903   home
> 40 South Alcaniz St. (850)202 4416   office
> Pensacola,  FL 32501 (850)202 4440   fax
> phayes@ai.uwf.edu
> http://www.coginst.uwf.edu/~phayes
Received on Saturday, 20 July 2002 09:07:24 UTC