- From: Garret Wilson <garret@globalmentor.com>
- Date: Tue, 31 Jul 2007 16:49:05 -0700
- To: Story Henry <henry.story@bblfish.net>
- CC: Tim Berners-Lee <timbl@w3.org>, Semantic Web <semantic-web@w3.org>
Story Henry wrote:
>
> I think the point of Tim's post was that really he thinks that the
> only things that are literals are strings.
>
> [] xsd:integer "123" .
>
> can be written in shorthand as
>
> "123"^xsd:integer .
>
> (see the N3 tutorial)
>
> Because everybody in the xml world was going crazy about xsd datatypes
> they wanted this more complex notion of literals. To satisfy them the
> ^^ notation was added.
>
> 123 = "123"^^xsd:integer,
> "123"^xsd:integer
> 123 .
>
> is a nice shortcut in N3.

Ah, I see---and sorry for just getting up to speed on N3. So that's why you kept using xsd:integer the way you did, and I kept objecting to it. I wanted to use xsd:integer as the type of the resource (and I still do). You wanted to say, "if we consider xsd:integer to mean something like Integer.toString() in Java, then we can refer to the resource 123 by saying 'the resource for which Integer.toString() yields "123"'." But regardless of this notation, I still stick to all my original points.

>
> You may have a good point there. There is a difference between a
> string and a URI though that is worth keeping in mind.

For the purposes of this discussion, there is only one difference between a URI string and a non-URI string: a URI string has an internal structure that allows sections of it to be apportioned to and managed by a third party (in this case, ICANN and the IETF) so that you don't have name clashes. Note that you can still have name clashes when people decide to run parallel DNSs---it just doesn't happen that often. And you still have semantic clashes when non-normalized forms of UTF-8 encoded Unicode are used. But overall, URIs help prevent string clashes.

And that's the only difference. If we decided that only people with a last name starting with the letter B could manage strings starting with the letter B, we would have an analogous situation (although there would be more clashes). The string "George W. Bush" when identifying the US president can clash with the same string if someone decides to name their pet pig "George W. Bush". But if I prepend "http://example.org/us/presidents/" to the string, I reduce clashes and you're happy. All I've done, though, is prepend something to the string to reduce the environments in which "George W. Bush" can refer to different things. It's still a string---it just conforms to the URI syntax and has extra preceding characters.

So "313" is a string, and I can use it to identify the value 313. But I can also use it to identify the characters on the license plate of Donald Duck's car. So to prevent the confusion, I could prepend characters to make longer strings: "(this is a string dammit)313" and "(this is a number dammit)313". And we don't have clashes anymore. But that's certainly a nonstandard syntax. I could use a standard string format called URI (hey, RDF already uses URIs for all other resources, so that comes in handy!) and use perhaps <rdfliteral:313;xsd:string> and <rdfliteral:313;xsd:integer>. But I'm still using strings---just strings with structure.

So let me state this another way: to say that non-literal resources use URIs as identifiers and literal resources use strings as identifiers is a false dichotomy. RDF uses strings for all its identifiers. It's just that for non-literals, these strings conform to a format called URI so as to reduce clashes. There's no reason why we can't have the strings identifying literals conform to this same format as well by pre/postpending the appropriate information---perhaps an "rdfliteral" prefix and a ";datatype" postfix. Then we use URI-conforming strings for everything.

> Still it is convenient to have literals, you have to admit. Because
> when you see one, you know how to deal with it immediately. And we are
> engineers, so we do like to have some conveniences.

I have no idea what this statement means. I think it is convenient to have integers. I think it is convenient to have strings. I think it is convenient to have boolean values. When I see any of them, I know how to deal with them automatically. But what does this have to do with literals?

You may be saying, "when I see the strings '123', '\"123\"', and 'true', I know instantly that these are an integer, a string, and a boolean value." (Now you're back to JSON.) And that's fine---have your processor automatically turn "123" into an integer, "\"123\"" into a string, and "true" into a boolean. But that doesn't mean that I should get a special thing called a literal in my data model. I should get three resources in my data model, perhaps identified by the URIs <rdfliteral:123;xsd:integer>, <rdfliteral:123;xsd:string>, and <rdfliteral:true;xsd:boolean>.

If you want to have a serialization format and/or a library that gives you special shortcuts for working with integers, strings, booleans, or even US presidents, that's great. But none of those shortcuts should affect the model.

>
>> But wait---if I decide that it's easier to represent this resource
>> using a string, I could create the resource "George W.
>> Bush"^^foaf:Person.
>
> Please have a closer look at N3, or else we will keep repeating the
> same points.

Point well taken regarding the meaning of ^^ in N3...

...but please note that my point here is that there is no need for rdf:datatype or some odd ^^ indirection (or some special xsd:integer property that some anonymous subject has) if literals are just resources. In the example above, I'm saying that I want an rdf:type of foaf:Person for George W. Bush, and that identifying him by a string shouldn't change how he appears in the graph.

> Also to refer to people via a String is not helpful, because they can
> have different names. Since the
> following is necessarily false
>
> "George W. Bush" = "George Bush" .

But that's just one particular domain (which I could rectify by using an rdf:datatype with a controlled lexical vocabulary for US presidents), and it's missing my point. Let's talk about planets in our solar system. I can identify two planets as <eg:Planet rdf:about="http://example.com/planets/mars"/> and <eg:Planet rdf:about="http://example.com/planets/uranus"/>, or as "Mars"^^eg:planet and "Uranus"^^eg:planet. Both forms identify the same two planets, so why is Uranus (hee hee) a normal resource in one and a literal in the other? Why does the first form have an rdf:type and the second form have some sort of odd indirect-reflexive rdf:datatype? This shouldn't be the case.

> That is ok. A person can have two name relations different strings.
> URIs are Universal Names.

Every ordered pair (lexical form, rdf:datatype URI) is a Universal Name, too, just as much as a URI is.

Let me state the whole case differently: URIs don't clash because they have some sort of domain specifier (hey, it's even called domain---fancy that!) prepended to the string (e.g. "http://example.org/" prepended to "string"). RDF typed literals don't clash because they have a *separate* string (an rdf:datatype URI) that is a domain specifier for the string. So there is no difference between URI domain+string and rdf:datatype+lexical form. So why not combine the lexical form and the datatype, resulting in a URI, and bring literals back into the resource fold?

Garret
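
P.S. To make the closing question concrete, here is a minimal Python sketch of folding a (lexical form, datatype) pair into a single URI-like string and getting the pair back out. Everything in it is an assumption made for illustration: the "rdfliteral" scheme is not registered anywhere, the function names are hypothetical, the datatype is passed as a prefixed name such as xsd:integer rather than a full URI, and percent-encoding the lexical form is just one way to keep the ";" separator unambiguous.

```python
from urllib.parse import quote, unquote

SCHEME = "rdfliteral"  # hypothetical scheme name, taken from the examples above


def literal_to_uri(lexical_form: str, datatype: str) -> str:
    """Fold a (lexical form, datatype) pair into one URI-like string."""
    # Percent-encode every reserved character in the lexical form,
    # including ';', so the first raw ';' always starts the datatype.
    return f"{SCHEME}:{quote(lexical_form, safe='')};{datatype}"


def uri_to_literal(uri: str) -> tuple[str, str]:
    """Recover the (lexical form, datatype) pair from such a string."""
    prefix = SCHEME + ":"
    if not uri.startswith(prefix):
        raise ValueError(f"not an {SCHEME} identifier: {uri!r}")
    encoded, sep, datatype = uri[len(prefix):].partition(";")
    if not sep:
        raise ValueError(f"missing ';datatype' part: {uri!r}")
    return unquote(encoded), datatype


if __name__ == "__main__":
    # The same lexical form under two datatypes yields two distinct identifiers.
    print(literal_to_uri("313", "xsd:string"))
    print(literal_to_uri("313", "xsd:integer"))
    print(uri_to_literal(literal_to_uri("George W. Bush", "foaf:Person")))
```

The point of the sketch is only that the pair and the combined string carry the same information: the mapping is reversible, and the datatype suffix plays the same clash-avoiding role that a domain prefix plays in an http URI.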
Received on Tuesday, 31 July 2007 23:49:18 UTC