Re: RDF's curious literals from Garret Wilson on 2007-08-01 (semantic-web@w3.org from August 2007)

From: Garret Wilson <garret@globalmentor.com>
Date: Wed, 01 Aug 2007 08:51:45 -0700
To: Story Henry <henry.story@bblfish.net>
CC: Richard Cyganiak <richard@cyganiak.de>, Tim Berners-Lee <timbl@w3.org>, Semantic Web <semantic-web@w3.org>
Message-ID: <46B0AC11.2030503@globalmentor.com>

Story Henry wrote:
>
> So Richard is completely right here:
>
>  <http://dbpedia.org/wiki/George_W._Bush> refers to a real person
>
> whereas
>
> "http://dbpedia.org/wiki/George_W._Bush" refers to the string (in RDF 
> of course).
>
> That's a big difference.

I'm not disputing the statement above. But in the context of this 
discussion, it is missing the point altogether.

<http://dbpedia.org/wiki/George_W._Bush> refers to a real person. You 
say so above.

In RDF I could refer to the same person using 
"http://dbpedia.org/wiki/George_W._Bush"^^eg:uspresident, could I not? 
I'm not saying that anyone has created the eg:uspresident datatype, but 
they could. And I could use the representation 
"http://dbpedia.org/wiki/George_W._Bush"^^eg:uspresident to refer to the 
exact same person as does the URI 
<http://dbpedia.org/wiki/George_W._Bush>. In RDF, this is possible, right?
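
To make the two forms concrete, here is a minimal Python sketch of how a processor's in-memory model might hold them. It is not any real RDF library: the class names URIRef and TypedLiteral, the helper is_literal, and the expansion of eg:uspresident to http://example.org/uspresident are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class URIRef:
    """A node identified by a URI reference (hypothetical class)."""
    uri: str


@dataclass(frozen=True)
class TypedLiteral:
    """A node made of a lexical form plus a datatype URI (hypothetical class)."""
    lexical_form: str
    datatype: str


# Assumed expansion of eg:uspresident; no such datatype has been defined.
EG_USPRESIDENT = "http://example.org/uspresident"

# Two ways of writing down "the same person" in a serialization.
by_uri = URIRef("http://dbpedia.org/wiki/George_W._Bush")
by_literal = TypedLiteral("http://dbpedia.org/wiki/George_W._Bush", EG_USPRESIDENT)


def is_literal(node) -> bool:
    """True when the model holds the node as a literal rather than a URI node."""
    return isinstance(node, TypedLiteral)
```

In this sketch the two nodes are different kinds of thing in the model (is_literal(by_literal) is true, is_literal(by_uri) is false), even if the hypothetical datatype maps the lexical form to the very person the URI names.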

So the only question I have is this: if they both refer to the same 
person, why does RDF make one of them an instance of some class called 
rdfs:Literal in the resulting RDF model? This is an anomaly. An 
inconsistency. It is an inconvenience that brings no value to the RDF model.

> How the machine stores that in the DB is up to it. For ints of course 
> it saves space to use the usual int length for your machine.

Your statement completely ignores the RDF data model, the most important 
part of the entire equation!

The layers look something like this:

[long-term storage format (e.g. DB)] [logical data model] [serialization 
format (e.g. RDF/XML, N3)]

I can use several types of files (RDF/XML files, text files with N3) to 
store RDF assertions. My processor can read that and then store it in 
some database for running queries. In N3, an integer might be 
represented by 123, and in the database it might be stored as an 
eight-bit value. That's all fine, and I agree.

But what is the model of the data being represented? What internal graph 
or series of statements does my N3 processor turn my 123 and "123" into? 
That's what I'm interested in, and that's what this whole discussion is 
about. In the model, George W. Bush should have a URI identifier whether 
I've used <http://dbpedia.org/wiki/George_W._Bush> or 
"http://dbpedia.org/wiki/George_W._Bush"^^eg:uspresident to identify him 
in my N3. The same goes for the value 123. There should be no thing 
called an rdfs:Literal.
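
As a rough illustration of the question about 123 and "123", the Python sketch below maps each N3 token to a model term. It assumes the usual N3 shorthand in which a bare 123 abbreviates "123"^^xsd:integer and a quoted "123" is a plain literal with no datatype. The Literal tuple and the n3_token_to_term function are hypothetical names, and the function handles only these two token shapes.

```python
from typing import NamedTuple, Optional

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


class Literal(NamedTuple):
    """A literal in the model: lexical form plus optional datatype URI."""
    lexical_form: str
    datatype: Optional[str]


def n3_token_to_term(token: str) -> Literal:
    """Map a bare integer or a double-quoted string token to a model term."""
    if len(token) >= 2 and token.startswith('"') and token.endswith('"'):
        # Quoted string: a plain literal with no datatype.
        return Literal(token[1:-1], None)
    if token.lstrip("+-").isdigit():
        # Bare integer: shorthand for a literal typed as xsd:integer.
        return Literal(token, XSD_INTEGER)
    raise ValueError(f"unsupported token in this sketch: {token!r}")
```

Either way the processor ends up with a literal in the model, not a URI-identified node; the two tokens differ only in whether a datatype is attached.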

The storage format in a database is beside the point. The syntax used in 
N3 or SPARQL is beside the point. I'm talking about the RDF data model.

Garret

Received on Wednesday, 1 August 2007 15:51:49 UTC