Re: RDF datatyping from Graham Klyne on 2002-01-10 (w3c-rdfcore-wg@w3.org from January 2002)

From: Graham Klyne <GK@NineByNine.org>
Date: Thu, 10 Jan 2002 21:57:33 +0000
To: Patrick Stickler <patrick.stickler@nokia.com>
Cc: RDF Core <w3c-rdfcore-wg@w3.org>, Sergey Melnik <melnik@db.stanford.edu>
Message-Id: <5.1.0.14.2.20020110210546.0389f060@joy.songbird.com>
Patrick,

I'm going to focus on your examples, because they stand the best chance of 
bringing us to some kind of common understanding...

NOTE:  for those who are not following this exchange in detail, this 
message suggests a possible formal interpretation for rdf:value.


At 05:56 PM 1/10/02 +0200, Patrick Stickler wrote:
>If you mean allowing literal nodes to be subjects, per P+, then I
>fully agree. But certainly the D idiom is far easier to contract
>to an eventual P+ representation when that time comes. I.e. going
>to P+
>
>    Bob ex:age _:1:"30" .
>    _:1 rdf:type xsd:integer .
>
>from D
>
>    Bob ex:age _:1 .
>    _:1 rdf:value "30" .    (simnply move to anon-node label)
>    _:1 rdf:type xsd:integer .
>
>is far more straightforward than from S

Well, the first of those could be pure S (if literal subjects are allowed):

   xsd:integer would be understood to be a class containing the lexical 
space of integers.

And the second of those is entirely consistent with S if we also allow a 
very simple definition of rdf:value to be understood;  i.e. that it is the 
identity predicate.  Thus:

   FORALL v: <v,v> in IEXT(I(rdf:value))

(The rdf:value relational extension could be limited to literal strings, 
but why bother?)


>    Bob ex:age _:1 .
>    _:1 xsd:integer.map "30" .
>    xsd:integer.map rdfs:range xsd:integer.lex .
>    xsd:integer.map rdfs:domain xsd:integer.val .

Well, I think that for "ordinary" use, it wouldn't be necessary to include the
rdfs:range and rdfs:domain statements;  i.e.  just say:

    Bob ex:age _:1 .
    _:1 xsd:integer.map "30" .

Since, at some level, the meaning of xsd:integer.map must be understood, 
ex-RDF, by communicating applications.  (Just as the meaning of xsd:integer 
must be understood in your example.)

The purpose of including range/domain statements in the examples I gave 
previously was to make the equivalence between the different idioms 
explicit in the RDF.  For example, this:

    Bob ex:age _:1 .
    _:1 xsd:integer.map "30" .
    xsd:integer.map rdfs:domain xsd:integer.val .

RDFS-entails this:

    _:1 rdf:type xsd:integer.val .

without needing to refer to any special understanding of xsd:integer.map or 
xsd:integer.val.

>and presumably
>
>    xsd:integer.lex rdf:type s:LexicalSpace .
>    xsd:integer.val rdf:type s:ValueSpace .
>    xsd:integer.map rdf:type s:Mapping .
>    xsd:integer.cmap rdf:type s:CanonicalMapping .
>    xsd:integer s:lexicalSpace xsd:integer.lex .
>    xsd:integer s:valueSpace xsd:integer.val .
>    xsd:integer s:mapping xsd:integer.map .
>    xsd:integer s:canonicalMapping xsd:integer.cmap .
>
>and we still haven't actually said anything about
>the data type xsd:integer itself, but have to "know"
>about how to parse the special names and remove the
>'.lex', '.val', and '.map' suffixes.

Whoa!!  Who said anything about having to *parse* the special names?  I see 
them as just a convention for introducing datatype names so that the 
discussion is easier to follow.  *Any* name could be used, as long as a 
common meaning is understood by applications that exchange information.  So 
some name has to be picked.

As for all the "extra" statements that you list:  these are just some 
relationships between the datatype-related names that *can* be expressed in 
RDF if required, not additional mechanism that *must* be employed.  The 
value of these are that they allow RDFS-entailments between the various 
data typing idioms that have been used.

>And again, with S we get into the fun stuff with the
>ability to say
>
>    Bob ex:age "30" .
>    ex:age rdfs:subPropertyOf xsd:integer.map .
>
>which given the above range and domain statements
>thereby declares Bob to be an instance of
>xsd:integer.val, etc. etc.

Yes, and I don't believe that any useful solution to the data typing issue 
can prevent people from making statements with silly consequences - i.e. 
that don't correspond to our understanding of reality.  As more expressive 
constructs are layered on RDF that allow us to more fully describe our 
understanding of reality, such statements will likely result in 
contradictions in the knowledge base;  as we know, basic RDF isn't 
expressive enough to express contradictions, but that doesn't mean it's not 
capable of expressing things with silly interpretations.

> >From what I have understood, none of P, P+, D, or U make literals
>anything other than they are as defined in the Dec14 draft of the MT.

Well, P has the denotation of a literal-labelled graph node be in the 
datatype value space, and further allows that different graph nodes labeled 
with the same literal may denote different values.  That is a fair amount 
of extra complexity in the model theory.

Anyway, my point of focusing on your examples was to illustrate that, in 
terms of RDF usage, the various idioms discussed are not so dissimilar in 
what they express.  Further, the choice of S as a basis does not preclude 
the idiom you appear to prefer.  (I think much of the complexity you see in 
S is a misunderstanding of its basic approach, which is very simply to say 
that the denotation of a literal-labelled node is the literal label value, 
and RDF itself is used as required to describe nodes that denote the 
corresponding value.)  The proposals differ more fundamentally in the 
model-theoretic treatment of the denotation of literal-labelled nodes, and 
here S has a clear virtue of simplicity.

#g


--------------------------
        __
       /\ \    Graham Klyne
      /  \ \   (GK@ACM.ORG)
     / /\ \ \
    / / /\ \ \
   / / /__\_\ \
  / / /________\
  \/___________/
Received on Thursday, 10 January 2002 18:33:33 UTC