
Re: RDF's curious literals

From: Story Henry <henry.story@bblfish.net>
Date: Wed, 1 Aug 2007 18:39:44 +0200
Message-Id: <195C2C6A-68C9-41A3-80D5-5CC2F35FD2EE@bblfish.net>
Cc: Jeremy Carroll <jjc@hpl.hp.com>, Tim Berners-Lee <timbl@w3.org>, Semantic Web <semantic-web@w3.org>
To: Garret Wilson <garret@globalmentor.com>

I think it is worth carefully reading Tim Berners-Lee's previous
response. But here is my answer to the question of what literals are
useful for (though I may be wrong).

  1. You ask why not "George Bush"^^xxx:president, and so why is not
everything a Literal?

    There are in fact limitations on what can be a Literal. Thanks
for this discussion, because I did not quite understand why before.

  2. What is the use of "123"^^xsd:integer over <http://numbers.eg/123>?

     I think Tim Berners-Lee answers that very well. But here are
some extra thoughts.

     Because of the Open World assumption we accept that things can
have a number of names. So it is difficult to tell whether two things
are or are not different. For example, does

      <http://presidents.com/Bush/George> refer to the same thing as
<http://presidents.com/Bush/George/W>? Well, you can't tell. You
may assume they are different until told otherwise:

      <http://presidents.com/Bush/George/W> = <http://presidents.com/Bush/George> .

      Now for numbers we know that 123 is different from 1234, which is
different from 124, and so on. You can see how this could make for an
absolutely humongous number of statements of differences in your
database. Furthermore we *know* they are different without even
looking at the world. There is no possible world in which they are
the same. 123 is always different from 124. That can save a lot of
calculations and storage. No need for an infinite number of statements
like

      <http://numbers.eg/123> owl:differentFrom <http://numbers.eg/124> .

     The same goes for "hello" and "bye": they are different strings
just by looking at them. If we use URLs for each string, it becomes a
lot more difficult to tell them apart. So you can see that it is
useful to have a distinction here.

    Literals should be the kinds of things for which one can be
certain, just by looking at their name, what they refer to.
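
To make the point about numbers concrete, here is a rough Python sketch. It is not from the thread: the function names are made up for illustration, and the numbers.eg URIs are the same hypothetical ones used above. It counts the pairwise owl:differentFrom statements needed for a handful of URI-named integers, and contrasts that with typed literals, where comparing the values settles the question without storing anything.

```python
from itertools import combinations

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def different_from_statements(uris):
    """Every pairwise owl:differentFrom triple needed to tell the URIs apart."""
    return [(a, "owl:differentFrom", b) for a, b in combinations(uris, 2)]


def literals_differ(lit_a, lit_b):
    """Two (lexical form, datatype) integer literals differ iff their values differ."""
    (lex_a, dt_a), (lex_b, dt_b) = lit_a, lit_b
    if dt_a != XSD_INTEGER or dt_b != XSD_INTEGER:
        raise ValueError("this sketch only handles xsd:integer")
    return int(lex_a) != int(lex_b)


# Hypothetical URI names for three integers.
uris = [f"http://numbers.eg/{n}" for n in (123, 124, 1234)]
statements = different_from_statements(uris)
print(len(statements))  # grows as n*(n-1)/2 with the number of URIs

# With literals no statements are stored; the names themselves decide.
print(literals_differ(("123", XSD_INTEGER), ("124", XSD_INTEGER)))
print(literals_differ(("123", XSD_INTEGER), ("0123", XSD_INTEGER)))
```

The last call is there to show that the comparison is on the value, not on the characters: "123" and "0123" are two spellings of the same integer.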

    3. Saving space.

      "123"^^xsd:int does in fact save space in a db over
<http://numbers.eg/123> because you can use a pointer to the URL
abbreviated by xsd:int, and just have a different string each time.

      Same with Unicode characters. "a"^^xsd:char would save a lot of
space in the DB. Of course everyone just simplifies that down to a
few bytes anyway...
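
As a rough illustration of the pointer idea, here is a toy Python sketch. The LiteralStore class and its methods are invented for this example and do not describe how any real triple store works. It keeps each datatype URI once and lets every literal hold a small index plus its lexical form, then compares the characters stored with spelling out a full numbers.eg URI per number.

```python
XSD_INT = "http://www.w3.org/2001/XMLSchema#int"


class LiteralStore:
    """Toy store: each datatype URI is kept once; literals hold an index into it."""

    def __init__(self):
        self.datatypes = []      # each datatype URI stored a single time
        self.datatype_ids = {}   # datatype URI -> position in self.datatypes
        self.literals = []       # (datatype id, lexical form) pairs

    def add(self, lexical, datatype):
        if datatype not in self.datatype_ids:
            self.datatype_ids[datatype] = len(self.datatypes)
            self.datatypes.append(datatype)
        self.literals.append((self.datatype_ids[datatype], lexical))

    def chars_stored(self):
        """Characters held in strings, ignoring the small integer ids."""
        datatype_chars = sum(len(d) for d in self.datatypes)
        lexical_chars = sum(len(lex) for _, lex in self.literals)
        return datatype_chars + lexical_chars


numbers = [str(n) for n in range(1000)]
store = LiteralStore()
for lexical in numbers:
    store.add(lexical, XSD_INT)

# The alternative: one full (hypothetical) URI string per number.
uri_chars = sum(len("http://numbers.eg/" + lexical) for lexical in numbers)
print(store.chars_stored(), uri_chars)
```

A store could of course also factor out the common URI prefix, so take this as an illustration of the pointer idea rather than a measurement.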

Thanks for helping me think about this.


On 1 Aug 2007, at 18:10, Garret Wilson wrote:

> Story Henry wrote:
>> On 1 Aug 2007, at 17:18, Garret Wilson wrote:
>>> So now that I've agreed with your points, let me say that your  
>>> points beg certain questions, which perhaps you can answer:
>>>   1. Where in all of this is there a need for the rdfs:Literal  
>>> class?
>> You can make up any class of objects you want. rdfs:Literal is  
>> useful.
> Wait---can't I already make up any class of objects I want? I  
> thought that was what rdf:type was for.
> I guess I was looking for more of an answer than "rdfs:Literal is  
> useful"---you're just affirming the question. Let me state the  
> question again: assuming that your N3 processor and SPARQL engine
> automatically turn 123 and "123" into URI references
> <http://www.example.org/integers/123> and
> <http://www.example.org/string/123>, can you give me any real-life
> example in which the existence of an rdfs:Literal class would bring
> functionality you don't already have?
>>>   2. Where in all of this is there a need for rdf:datatype?
>> same as above, it's useful.
> Please, Henry, I'm being serious. Affirming the question isn't a  
> serious answer. Can you give an example of where rdf:datatype can  
> do something that I can't already do using URIs as identifiers and  
> the rdf:type property I use for all non-literal resources?
>>>   3. Where in all of this is there a need for a literal to be
>>>      identified *in the model* as something other than a URI?
>> None. You could do all with URLs but people would balk at you, and  
>> it would make writing things down really
>> difficult.
> Please, Henry, you apparently didn't even read my message! I'm  
> advocating writing 123 and "123"! (Hell, I balk at writing  
> "123"^^xsd:integer and "123"^^xsd:string!) Nobody is asking you to  
> write out a long URI.
> RDF doesn't store "123" in the model for typed literals---it stores  
> "123"^^xsd:integer or "123"^^xsd:string. If "123"^^xsd:integer and  
> <http://example.org/integers/123> are equivalent, and  
> "123"^^xsd:string and <http://example.org/strings/123> are  
> equivalent, why would you prefer a non-URI sequence of characters  
> over a URI for identification?
> This is not about writing out something short versus writing out  
> something long. Something long gets stored in the model anyway. Why  
> do we have to use a non-URI for literals? Why is "123"^^xsd:integer  
> better than <http://example.org/integers/123> ? That is the question.
>> In fact you could write an ontology out to do this if you wished.
>> <http://unicode.org/char/a> = "a" .
>> and so on for every character in the unicode.
>> Then you could write out a new model theory, and prove that your  
>> URI only model theory was a simplification of the current one, but  
>> that they were equivalent.
> Henry, this is jumping from apples to oranges, and it's unfair.  
> Nowhere in our discussion were we referring to individual Unicode  
> characters, either as subjects or object values.
> But sure, let's assume that you want to say that my name  
> eg:startsWith the letter "G"? Even you would say that this turns  
> into N3 something like this:
> <uriForMyName> eg:startsWith "G"^^unicode:char.
> That's what you would recommend, right? So how is "G"^^unicode:char  
> more ridiculous than <http://unicode.org/char/G> ?
> You were trying to make my argument look absurd by talking about  
> something I was never talking about: referring to Unicode  
> characters. When we *do* talk about referring to Unicode characters  
> individually, the RDF literal method looks just as ridiculous as  
> anything else.
> Garret
Received on Wednesday, 1 August 2007 16:55:00 UTC
