Re: datatypes and MT from Sergey Melnik on 2001-11-02 (w3c-rdfcore-wg@w3.org from November 2001)

From: Sergey Melnik <melnik@db.stanford.edu>
Date: Fri, 02 Nov 2001 11:22:43 -0800
To: Pat Hayes <phayes@ai.uwf.edu>
CC: Brian McBride <bwm@hplb.hpl.hp.com>, w3c-rdfcore-wg@w3.org
Message-ID: <3BE2F283.5F35C212@db.stanford.edu>

Pat Hayes wrote:
> ...
> >>Now, this seems to me to have a fatal flaw, which arises from the
> >>fact that the value spaces of two different datatypes might
> >>overlap. For example, suppose that there are datatypes xxd:octal
> >>and xxd:decimal, then the following would seem to be perfectly true:
> >>
> >>_:1 rdf:type xxd:octal
> >>_:1 rdf:type xxd:decimal
> >>_:1 rdf:value "32"
> >>_:1 rdf:value "26"
> >
> >
> >But that is not how Sergey would write it.  He is proposing:
> >
> >   _:1 xxd:octal   "32" .
> >   _:1 xxd:decimal "26" .
> 
> Oh, I see. That does indeed avoid this problem, but it also throws
> away the advantages of the bnode way of doing things, since now it is
> impossible to be neutral about datatypes.

Why? If you want to be neutral, use a bNode w/o any arcs attached to it
(if I understand what you mean by neutral).

> This forces the datatyping
> information to be attached directly to the literal;

Right.
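
To make the quoted octal/decimal example concrete, here is a minimal sketch of how two datatype-labelled arcs can pin the same bNode to one value. The dictionary of lexical-to-value mappings and the helper name values_of are assumptions made for illustration; xxd:octal and xxd:decimal are the hypothetical datatypes from the example, not a real vocabulary.

```python
# Hypothetical lexical-to-value mappings for the two example datatypes.
DATATYPES = {
    "xxd:octal": lambda lexical: int(lexical, 8),
    "xxd:decimal": lambda lexical: int(lexical, 10),
}

# The two statements from the example: (subject, datatype property, literal).
triples = [
    ("_:1", "xxd:octal", "32"),
    ("_:1", "xxd:decimal", "26"),
]

def values_of(node, graph):
    """Collect the values that the datatype-labelled arcs assign to `node`."""
    return {DATATYPES[p](o) for s, p, o in graph if s == node and p in DATATYPES}

values = values_of("_:1", triples)
# Both arcs agree on a single value, so the two statements are consistent.
consistent = len(values) == 1
```

Because each arc carries its own datatype, the literal "32" is never read without knowing that it is octal, which is the point of attaching the datatype to the link.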

> the only place
> literals can occur in an RDF graph is at the object end of links
> labelled with a datatype.

True, although I personally do not see any problem with allowing

"cat"  base64  "Y2F0"

either.
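
As a rough check of what such a statement would assert, the sketch below tests whether a base64 arc between two literals holds. It assumes the subject string is encoded as ASCII bytes before base64 encoding; the function name base64_holds is hypothetical.

```python
import base64

def base64_holds(subject_literal, object_literal):
    """True if `object_literal` is the base64 form of `subject_literal`.

    Assumption: the subject string is turned into bytes using ASCII.
    """
    encoded = base64.b64encode(subject_literal.encode("ascii")).decode("ascii")
    return encoded == object_literal

holds = base64_holds("cat", "Y2F0")
```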

> This seems to me to be simply a variation
> on the idea of incorporating the datatype label into the literal
> itself, eg by having literals be pairs of a datatype and a string.

Not quite. Notice that we can refer both to the literal "typed" in such
a way, and its type by means of arcs in RDF graphs (and, for example,
provide additional information about the datatype or describe how the
literal/string is represented using the base64 encoding, if that's all
we know).

> Like that proposal, it forces datatype information to be given
> explicitly and locally,

Yes, and I believe that's a strength rather than a weakness. I think it
is essential for the snippets of information dispersed on the Semantic
Web to be as precise as possible and as self-contained as possible.

> and makes it impossible to infer datatyping
> information from other information in the graph, eg range information.

I think it is still possible, although to a limited extent. You are
right in that we'd always need to link typed thingies to literals in
some form. However, range information and inference could still be quite
useful. For example, take a built-in xxd:decimal datatyping property, as
above. You could say:

xxd:decimal rdfs:domain MyReal

thereby naming the value space of xxd:decimal explicitly. Now MyReal can
be used for typing in a conventional way as in 

_x rdf:type MyReal

Of course, _x in the above could be any real, i.e., we still do not know
*which* decimal is meant. The link between _x and, say, (real)5 can be
established in many ways, if we have powerful schema languages. You
could, for example, describe somehow (using dedicated vocabulary) that
_x must be a prime between (real)4 and (real)6... Or you can write a
statement

_x xxd:int "5"

which would also disambiguate the interpretation of _x.
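
The rdfs:domain statement above already supports one simple inference: any node that appears as the subject of an xxd:decimal arc can be typed as MyReal. The following is a minimal sketch of that single step over triples held as Python tuples. The function name apply_domain and the sample graph are assumptions for illustration, not part of any RDF toolkit.

```python
def apply_domain(graph):
    """Add (s rdf:type C) for every (s P o) where (P rdfs:domain C) is stated."""
    domains = {(s, o) for s, p, o in graph if p == "rdfs:domain"}
    inferred = set(graph)
    for prop, cls in domains:
        for s, p, o in graph:
            if p == prop:
                inferred.add((s, "rdf:type", cls))
    return inferred

graph = {
    ("xxd:decimal", "rdfs:domain", "MyReal"),
    ("_:1", "xxd:decimal", "26"),
}
closure = apply_domain(graph)
```

The inference runs from the datatyped arc to the class, not the other way around: knowing only that a node is a MyReal does not tell us which literal denotes it.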

I hate going back to rdf:value, since it caused so much harm and
misunderstanding, but you could even use rdf:value along with a rule
like

(X rdf:value Y) && (X rdf:type MyReal)
  -> X xxd:decimal Y

subject to inconsistencies when rdf:value is used improperly, of course.
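
The rule above can be sketched as one forward-chaining step over a set of triples. This is only an illustration under the assumption that the graph is a set of Python tuples; apply_value_rule is a hypothetical name, and the sketch makes no attempt to detect improper uses of rdf:value.

```python
def apply_value_rule(graph):
    """(X rdf:value Y) && (X rdf:type MyReal) -> (X xxd:decimal Y)."""
    typed = {s for s, p, o in graph if p == "rdf:type" and o == "MyReal"}
    derived = {
        (s, "xxd:decimal", o)
        for s, p, o in graph
        if p == "rdf:value" and s in typed
    }
    return set(graph) | derived

graph = {
    ("_x", "rdf:type", "MyReal"),
    ("_x", "rdf:value", "5"),
    ("_y", "rdf:value", "7"),  # not typed as MyReal, so the rule does not fire
}
result = apply_value_rule(graph)
```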

Sergey

Received on Friday, 2 November 2001 14:09:18 UTC