- From: Garret Wilson <garret@globalmentor.com>
- Date: Tue, 31 Jul 2007 14:44:06 -0700
- To: Tim Berners-Lee <timbl@w3.org>
- CC: Semantic Web <semantic-web@w3.org>
Tim,
Thanks for the reply on RDFON. I accept that the RDFON proposal was
ignorant of the latest N3 syntax, and although I still prefer something
RDFON-like, your points were valid and there's no merit in my trying to
advance RDFON as "better" at this stage (or perhaps at any stage). And
just as I have only recently been made aware that RDF (despite RDF/XML)
allows literals in lists, your note that RDF allows literals to have
properties (also in spite of RDF/XML limitations) also came as a
surprise to me. RDF/XML is surely one of the worst things to happen to RDF.
The other is literals. Before replying to your comments below, let me
just step back and make a few observations and ask a few questions
concerning literals in RDF. (By the way, when I say "you", please
understand that I'm speaking to a hypothetical responder or the general
RDF user, not necessarily you (Tim), or anyone else.)
1. How is a literal any different than a resource? The RDFS definition
at http://www.w3.org/TR/rdf-schema/#ch_literal is at first circular
("the class of literal values") and then nonexistent for some literals
("This specification does not define the class of plain literals."). The
RDF Concepts explanation at
http://www.w3.org/TR/rdf-concepts/#section-Literals is more helpful: A
literal is a resource identified by a lexical representation, which
representation may be "more convenient or intuitive" to use instead of a
URI.
So at the end of the day, a literal is simply a resource that is easy to
refer to using a string of characters. That's all well and good, but why
should that affect my model? Why is a resource an instance of another
class (rdfs:Literal) just because I like to identify it by a lexical
representation?
Take for example the resource identified by the URI
<http://example.org/presidents/GeorgeWBush>. This resource may have an
rdf:type of foaf:Person. (I could assert all sorts of other RDF
statements about this resource, but will decline to do so at this time.)
Is this resource an instance of the class rdfs:Literal? No? Why not?
But wait---if I decide that it's easier to represent this resource using
a string, I could create the resource "George W. Bush"^^foaf:Person.
Suddenly George W. Bush (the person, without the quotes, just as 123 is
the resource represented by "123"^^xsd:integer) is an instance of
rdfs:Literal. Why? Why did my model change? How did the world I was
modeling change just because I decided to represent George W. Bush using
a string?
So let me go back to my original question: How is a literal different
from a resource? My answer is that there should be *no* difference. The
only difference is a syntactical matter of identification---but that
should *not* give rise to a new class of resources. There should be no
such thing as an rdfs:Literal. Everything should be a resource, however
we decide to identify them. (Think of how absurd it would be to have an
rdfs:Anonymous class, for all resources that are identified neither by a
URI nor a lexical representation!)
2. How is rdf:type different from rdf:datatype? This is where RDF's odd
treatment of literals starts to get stranger. If I describe the resource
identified by URI <http://example.org/presidents/GeorgeWBush>, I can
give this resource an rdf:type of foaf:Person. But if I describe this
same resource using a lexical representation, I give it an rdf:datatype
of foaf:Person (yielding "George W. Bush"^^foaf:Person). Why? It's the
same resource---I just found it "more convenient" to identify it with a
string.
The same thing goes for the resource 123, identified by the string "123"
with data type xsd:integer. This resource should have rdf:type
xsd:integer. Why does it have a separate rdf:datatype? One answer could
be that "rdf:datatype is to specify the transformation between the
lexical representation and the actual resource." Fine, but that has two
problems: the rdf:datatype sticks around in the actual model, when its
use is merely syntactic; and I still don't get an rdf:type, which the
number 123 surely has (just like George W. Bush surely has an rdf:type
of foaf:Person, even if I refer to him using a lexical representation).
There should be no rdf:datatype. Its usage is partly syntactic; the
other part is made redundant by rdf:type.
3. If you want to refer to a resource using a lexical representation,
RDF should create a URI scheme for lexical representations---then we
could simply refer to all literals by URIs and be done with it. One
method would be to use the form <rdfliteral:literal;datatype>, such as
<rdfliteral:123;xsd:integer>. I frankly don't care what the format of
this URI is, but the mapping is straightforward. A URI is a glorified
string---there's no reason to use *other* strings to identify resources.
"But I want to simply use a string, not a URI, in my serialization of
choice," you say. Fine, but that's a serialization issue. If you want to
use "123"^^xsd:integer in N3 and have your parser automatically generate
a node with URI <rdfliteral:123;xsd:integer>, then so be it. But there's
no reason to have a different type of resource created just because you
like to use string shortcuts, and there's no need to query these beasts
differently just because you like writing "123"^^xsd:integer instead of
<rdfliteral:123;xsd:integer>.
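To make that mapping concrete, here is a minimal Python sketch of what a parser could do with a typed literal. The function names (literal_to_uri, uri_to_literal) are hypothetical, and percent-encoding the lexical form is my own assumption so that a ";" inside the string cannot be mistaken for the separator; as I said, the exact URI format is not the point.

```python
from typing import Tuple
from urllib.parse import quote, unquote

# Hypothetical scheme name, taken from the <rdfliteral:...> form used in this message.
SCHEME = "rdfliteral"


def literal_to_uri(lexical: str, datatype: str = "xsd:string") -> str:
    """Turn a lexical form plus a datatype name into an rdfliteral: URI string."""
    # Assumption of this sketch: percent-encode the lexical form so that
    # reserved characters such as ';' cannot collide with the separator.
    return f"{SCHEME}:{quote(lexical, safe='')};{datatype}"


def uri_to_literal(uri: str) -> Tuple[str, str]:
    """Recover (lexical form, datatype name) from an rdfliteral: URI string."""
    scheme, _, rest = uri.partition(":")
    if scheme != SCHEME:
        raise ValueError(f"not an {SCHEME}: URI: {uri!r}")
    encoded, _, datatype = rest.partition(";")
    return unquote(encoded), datatype


# "123"^^xsd:integer and a plain "abc" both become ordinary URI-identified nodes.
print(literal_to_uri("123", "xsd:integer"))
print(literal_to_uri("abc"))
```

The round trip is lossless under these assumptions, so a serializer could still print the "123"^^xsd:integer shortcut while the model itself only ever sees URIs.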
I'm all for syntactical shortcuts. In fact, I would make it even easier
for you: I think if you write "abc", the processor should automatically
change this to <rdfliteral:abc;xsd:string> for you. But that shouldn't
change the model or make some sort of odd literal class. They are all
resources goshdarnit! All!
So let me add a few quick responses below to clear up a last few things:
Tim Berners-Lee wrote:
> I agree that thinking of an integer as a Resource is fine, in that 123
> is a Thing, like everyThing else.
It's more than just "thinking of an integer as a Resource". An integer
is a resource, no? How is George W. Bush more of a resource than the
number 3?
>
> That does not mean we should symbols and literal values in the language.
I think you left out a "not" or something, but let me restate this:
"That does not mean we should not use symbols and literal values in your
serialization language of choice. But it shouldn't change the RDF model."
> I think it is fine to have 123 (note no quotes) as literal in n3,
> which it is.
I think it is fine to have 123 as a resource. It shouldn't be a literal.
So I can represent it as "123" in English, or "١٢٣" in Urdu. Big deal. I
can represent George W. Bush as "George W. Bush" in English. Nothing
about these true statements changes the type of resources we're dealing
with.
> I think it fine to say that that sequence of characters in the
> language identifies the number 123, which is a member of the class
> of Integers, much as a URI identifies another resource.
Right. That's a statement of syntactic transformation. Let's keep it
down in the serialization, not in the model.
> I think in fact also its fine to make URIs and say they also represent
> the number 123, e.g.
I agree with that statement on its face.
> I don't, however, think it works to have rdf:about as a single
> property (or even XML attribute) relating
> 123 to the string "123".
Here's where I was misunderstood. I made all those eg:IntRepresentation
examples in another message to illustrate that the lexical
representation is distinct from the resource itself. I'm violently
agreeing here: I don't want to relate 123 with "123" at all, except for
using "123" in your serialization of choice to somehow get to the
resource 123 if you like.
> For example, suppose we want to model octal numbers and decimal numbers.
> I much prefer to concentrate on the number 123 as an Integer, and have
> separate properties decimal and octal
> relating it to different strings, than to imagine separate classes of
> Decimal Integer and Octal Integer.
Completely, completely agreed.
>
>> And (finally) going back to
>> RDFON, we see that eg:datatype("value") is really just
>> instantiating an eg:datatype class with a lexical identifier
>> instead of a URI identifier.
>
> If you look at that as an object initialization function, then that
> maps to a binary predicate which is my model above. I prefer very much
> to have a datatype-specific one such as dt:decimal.
I don't quite understand what you're saying here. In RDFON, I would have
xsd:integer("123") map to:
<rdfliteral:123;xsd:integer> rdf:type xsd:integer
>
> One more note on datatypes. In practice the term in the RDF abstract
> language which N3 writes as 123 and NTriples writes as
> 123^^xsd:integer I model as [ xsd:integer 2] or 2^xsd:decimal, in
> practice is stored in RDF stores typically as some object like
>
> {termType: 'literal', value: "123", dt_URI: "http://...integer", lang:
> null }
>
> This is a term in the language. It isn't the resource 123.
But 123 was the resource I was trying to identify. Why doesn't it map to
the triple I show above ( <rdfliteral:123;xsd:integer> rdf:type
xsd:integer )? Isn't that simpler? Doesn't that reflect what I
indicated? Doesn't it identify a resource that I can give properties to
and put into lists---even in RDF/XML?
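As a rough illustration of the difference, here is a short Python sketch. LiteralTerm is a hypothetical stand-in for the kind of store object quoted above, and as_resource_triple is a name I made up for the mapping I am arguing for; it uses the rdfliteral: URI form and the lowercase xsd:integer name, and it ignores language tags and escaping entirely.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class LiteralTerm:
    """Hypothetical stand-in for the literal term object a store might keep."""
    value: str                  # lexical form, e.g. "123"
    datatype: str               # datatype name, e.g. "xsd:integer"
    lang: Optional[str] = None  # language tag, unused in this sketch


def as_resource_triple(term: LiteralTerm) -> Tuple[str, str, str]:
    """Map a literal term to (subject, predicate, object) about a URI-named resource."""
    # The lexical form and datatype are folded into the subject URI,
    # and the datatype is restated as an ordinary rdf:type.
    subject = f"<rdfliteral:{term.value};{term.datatype}>"
    return (subject, "rdf:type", term.datatype)


term = LiteralTerm(value="123", datatype="xsd:integer")
print(" ".join(as_resource_triple(term)))
```

Once the node is an ordinary URI-identified resource like this, nothing special is needed to give it properties or place it in a list.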
Death to literals, rdfs:Literal, and rdf:datatype. Long live resources.
Garret
Received on Tuesday, 31 July 2007 21:44:12 UTC