That is an excellent RDFCore braindump Jeremy, thanks!

Even at that time N3 was really helpful, at least to me, for nailing
things down. I mean attempts such as

http://eulersharp.sourceforge.net/2003/03swap/rdfs-rules.n3

http://eulersharp.sourceforge.net/2003/03swap/xsd-rules.n3

 

--
Jos De Roo, AGFA http://www.agfa.com/w3c/jdroo/

 
Jeremy Carroll <jjc@hpl.hp.com>
Sent by: semantic-web-request@w3.org
2007-08-02 20:36

To: Story Henry <henry.story@bblfish.net>
cc: Sandro Hawke <sandro@w3.org>, Lee Feigenbaum <lee@thefigtrees.net>, Richard Cyganiak <richard@cyganiak.de>, Garret Wilson <garret@globalmentor.com>, Tim Berners-Lee <timbl@w3.org>, Semantic Web <semantic-web@w3.org>
bcc: Jos De Roo/AMDUS/AGFA
Subject: Re: RDF's curious literals
 

At some level this thread is rather futile.
The RDF design includes a design for representing numbers, amongst other
things.
This is now fairly well deployed with interoperable implementations.

Garret doesn't like this aspect of the design.

Well, that's life.

All agreements between numerous people involve aspects that
some people dislike. It is particularly irksome when, for some reason, we
end up participating in an aspect of the world which other people agreed
on, and we are too late to the party to argue against something we don't
like.

I think it may be less futile to give some sort of design rationale.

There are two approaches:
- give an historical account of how we got to where we are
- give a more abstract account of the problem space, and see which
aspects of the current design are essentially inevitable.

I'll try the latter - the former is available in the mail archives of
the RDF Core WG.

=====

RDF is intended as a way of describing things.

Most of the things being described, and the means to describe them, are
identified by URIs. However, URIs are non-rigid designators, i.e. it is
not always clear what a URI is intended to represent. The RDF Semantics
is written with the weakest possible assumption that each URI represents
something, but we don't know what.

It is also helpful to have some aspects of the descriptions using rigid
designators, where what they represent is known in advance. In RDF these
things are called literals. Initially the only sort of literals were
strings. This was fairly limiting, and there was a desire to include
other datatypes, such as those defined by XML Schema.

Given that we wanted to have an open framework, which wasn't limited to
just the XML Schema datatypes, we decided that the author of an RDF
document could use whatever datatype they wanted. We did not, however,
define a means by which they could declare new datatypes; new datatypes
require private agreement. If there was a call to fix this, it
could be done.

To allow anyone to introduce their own datatypes we used the notion of a
datatype URI to identify the datatype being used. I think this is a highly
defensible design decision.

Since the point of having literals is to have things whose
interpretation is known, the datatype acts as the means by which that
interpretation is defined. Hence a datatype has a lexical-to-value mapping.
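
As a minimal sketch of that idea (in Python, which is not part of the
original discussion): a datatype URI selects a lexical-to-value mapping,
so two different lexical forms can denote the same value. The names
LEXICAL_TO_VALUE and literal_value are hypothetical, and Python's own
parsers only approximate the lexical spaces that XML Schema defines.

```python
from decimal import Decimal

XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical registry: datatype URI -> lexical-to-value mapping.
# Python's int() and Decimal() are stand-ins, not exact XSD parsers.
LEXICAL_TO_VALUE = {
    XSD + "integer": int,
    XSD + "decimal": Decimal,
    XSD + "string": str,
}


def literal_value(datatype_uri, lexical_form):
    """Return the value denoted by a (datatype URI, lexical form) pair."""
    try:
        mapping = LEXICAL_TO_VALUE[datatype_uri]
    except KeyError:
        # Datatypes outside the registry need prior (private) agreement.
        raise ValueError(f"unknown datatype: {datatype_uri}")
    return mapping(lexical_form)


# Different lexical forms, same value under the same datatype.
same_integer = (
    literal_value(XSD + "integer", "042") == literal_value(XSD + "integer", "42")
)
same_decimal = (
    literal_value(XSD + "decimal", "1.0") == literal_value(XSD + "decimal", "1.00")
)
```

Under xsd:string the forms "042" and "42" stay distinct, which is the
sense in which the datatype fixes the interpretation.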

To provide a useful set of datatypes, we use the XML Schema datatypes,
identified by the URIs given by the XML Schema WG.

As many people have pointed out, the abstract syntax is an abstract
syntax. It is not intended to limit the way that RDF is written down,
nor is it intended as the meaning of an RDF document. Thus in the
abstract syntax a typed literal is represented as a pair: the datatype
URI and a string. In RDF Semantics this is then mapped to the specific
value as given by the datatype. Having such predefined designators is a
fundamental requirement for being able to use known values in
descriptions of resources, which was one of the goals of the literal design.

Moreover, any design that allows arbitrary user-defined datatypes ends
up needing something like a URI to represent the datatype, and something
like the lexical form to represent the string representation of that
value: at least at the abstract syntax level. You are free to write that
pair however you like, including omitting the datatype URI and the
quotes around the string, as long as in the syntax you are using they
are superfluous, and then they can be (logically) put back into the
abstract syntax.
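
To illustrate "putting them back", here is a rough sketch assuming a
hypothetical concrete syntax in which a bare run of digits is shorthand
for an xsd:integer literal and a bare number with a decimal point is
shorthand for xsd:decimal. The function name to_abstract and the token
grammar are assumptions for illustration only; they are similar in
spirit to the numeric shorthands found in some RDF syntaxes but are not
taken from any specification.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical token grammar for a small concrete syntax.
FULL = re.compile(r'"([^"]*)"\^\^<([^>]*)>')   # "lexical"^^<datatype URI>
INTEGER = re.compile(r"[+-]?[0-9]+")           # bare integer shorthand
DECIMAL = re.compile(r"[+-]?[0-9]*\.[0-9]+")   # bare decimal shorthand


def to_abstract(token):
    """Map a concrete token to the abstract pair (datatype URI, lexical form)."""
    match = FULL.fullmatch(token)
    if match:
        return (match.group(2), match.group(1))
    if INTEGER.fullmatch(token):
        # Quotes and datatype URI were omitted; restore them logically.
        return (XSD + "integer", token)
    if DECIMAL.fullmatch(token):
        return (XSD + "decimal", token)
    raise ValueError(f"not a typed literal in this sketch: {token!r}")


short_form = to_abstract("42")
long_form = to_abstract('"42"^^<http://www.w3.org/2001/XMLSchema#integer>')
```

Both spellings yield the same abstract pair, so the shorthand changes
only how the literal is written, not what is in the graph.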

====

There were other design options we considered, but they all included the
notions of a datatype URI, a lexical form, a lexical-to-value mapping,
and a value space.

Garret's proposed design also seems to include these - except that the
datatype URI is used as a URI prefix, and the lexical form is used as a
suffix. This seems to require analysis of the internals of a URI in
order to identify what it means, and I prefer the designs where these
components are separated.
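
A rough sketch of what that analysis might involve, under assumptions of
my own: the prefix http://example.org/int/ and the function
split_embedded are hypothetical and are not taken from Garret's proposal.
The point is only that a consumer must already know the set of datatype
prefixes and must take the URI apart before it can find the lexical form.

```python
def split_embedded(uri, known_datatype_prefixes):
    """Recover (datatype prefix, lexical form) from a URI, if possible.

    Uses the longest known prefix that matches; returns None when the
    URI matches no known datatype prefix and so stays an opaque name.
    """
    matches = [p for p in known_datatype_prefixes if uri.startswith(p)]
    if not matches:
        return None
    prefix = max(matches, key=len)
    return (prefix, uri[len(prefix):])


# Hypothetical prefix standing in for a datatype.
PREFIXES = ["http://example.org/int/"]

embedded = split_embedded("http://example.org/int/42", PREFIXES)
opaque = split_embedded("http://example.org/people/42", PREFIXES)
```

With the pair kept separate, no such prefix table or string surgery is
needed: the datatype URI and the lexical form are simply the two
components of the literal.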

Jeremy

--
Hewlett-Packard Limited
Registered Office: Cain Road, Bracknell, Berks RG12 1HN
Registered No: 690597 England