- From: Tim Berners-Lee <timbl@w3.org>
- Date: Tue, 31 Jul 2007 13:46:57 -0400
- To: Garret Wilson <garret@globalmentor.com>
- Cc: Semantic Web <semantic-web@w3.org>
- Message-Id: <64C173BF-202B-401D-B85B-EBE441FACD50@w3.org>
On 2007-07-26, at 19:53, Garret Wilson wrote:
>
> [..]
> Although I've used RDF since the early days, I seem to have skipped
> a lot of the advancements made in serialization as my focus was
> distracted over the past few years.

A shame :)

> Sure, it appears that the difference between RDFON and N3 are
> negligible from a technical point of view, so I don't want to argue
> that one is better or worse than the other. I would like to point
> out a few merits of RDFON to think about.

The cost, though, of diverting the efforts of programmers like
yourself away from supporting the common formats is high. I'm not
saying that one should never fork off a new format, or there would
not have been an N3. But now the need has been met, and we need a
common language not only for code but for people on IRC and in
tutorials and in presentations. One can pretty much expect technical
sem web folks to grok a slide in N3 nowadays.

> * RDF is built upon an assumption of propositional logic and
>   category theory that many programmers aren't used to. Sure, it's
>   the same philosophy underneath (going back to Wittgenstein,
>   Russel, Frege, and even Aristotle), but many programmers think in
>   different terms. N3 thinks, "I'm making an assertion regarding a
>   particular subject, predicate, and object, so I therefore list the
>   three parts of the proposition being asserted." 95% (I would
>   guess) of today's programmers would think, "I'm assigning a value
>   to the property of an object."

No, I think the model for this information is data, not program. And
in fact the JSON code is not program; it is a JS object. JSON's
{ x: "3", y: 5} is not assignment but a data structure. N3 very much
matches this. It omits the ":", yes, which actually saves a lot of
time. It also has the comma, which is very naturally English-like and
also saves a lot of code.
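To make the comparison with the JSON object concrete, here is a minimal sketch (not part of any N3 toolkit) that writes a plain Python dictionary as one N3 statement, using ";" between properties and "," between repeated objects. The function names, the ":thing" subject and the default ":" prefix are assumptions made for illustration, and only strings and numbers are handled.

```python
import json


def n3_term(value):
    """Render a Python value as an N3 literal: strings quoted, numbers bare."""
    if isinstance(value, str):
        return json.dumps(value)  # double-quoted string with escapes
    return str(value)             # ints and floats are written without quotes


def to_n3(subject, props):
    """Render one subject and its property/value pairs as a single statement.

    A list value becomes a comma-separated object list; properties are
    separated by semicolons. Names get a hypothetical default ':' prefix.
    """
    parts = []
    for name, value in props.items():
        values = value if isinstance(value, list) else [value]
        objects = ", ".join(n3_term(v) for v in values)
        parts.append(f":{name} {objects}")
    return f"{subject} " + "; ".join(parts) + " ."


record = {"x": "3", "y": 5, "nick": ["tim", "timbl"]}
print(to_n3(":thing", record))
```

The dictionary and the N3 statement carry the same data; nothing in either is an assignment being executed.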
>   So while N3 has "eg:childCount 2;",
>   a procedural programmer sees something wrong here---he or she will
>   breath a sigh of release when he/she sees that extra delimiter
>   character separating the property from its value:
>   "eg.childCount:2". This is a teeny, tiny issue, but I think the
>   difference in mindset between "setting object properties" versus
>   "asserting propositions" is a significant one, even though they
>   both mean the same thing.

Well, I'd like you to make a point of using N3 for a month and see
whether you still feel that way. By the way, in the original
language, there were those delimiters, but they were optional and
people never used them. You could write Alice -- child -> Bob and
Alice has child Bob, but now we say Alice child Bob.

> * Along the same lines, when a common programmer looks at
>   "<urn:uuid:ca00db92-0f7f-434b-b254-8a6afcf57907> a
>   <http://xmlns.com/foaf/0.1/Person>" he/she won't know what the
>   heck is going on. When he/she looks at
>   "foaf.Person(<urn:uuid:ca00db92-0f7f-434b-b254-8a6afcf57907>)", on
>   the other hand, he/she will think, "Ah, an instance of the class
>   foaf.Person is being instantiated, and is being initialized with
>   the URI <urn:uuid:ca00db92-0f7f-434b-b254-8a6afcf57907>, which
>   must be its ID." Of course, that's not *exactly* what's going on,
>   but at its core it's saying the same thing---and we've made it
>   easier for the common programmer to grasp what's going on.
>   (Frankly, there are a lot more common programmers out there than
>   there are propositional logicians.)

I don't think programmers need to see object creation. And object
creation is in fact a misleading metaphor, suggesting that the number
of slots is fixed in advance. So while it is useful to help people
along with a metaphor, it also piles up issues for the future, making
it more difficult, I think, for people to unhook from the closed world
assumption and realize that any document can say anything about
anything.
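The point about slots not being fixed in advance can be sketched in a few lines. This is an illustration only, under the assumption that a graph is modelled as a Python set of (subject, predicate, object) tuples with no blank nodes; the names doc_a, doc_b, properties_of, :alice and :bob are made up for the example.

```python
# Two independent "documents", each a set of (subject, predicate, object)
# triples written as plain strings. Neither document owns :alice.
doc_a = {
    (":alice", "a", "foaf:Person"),
    (":alice", "foaf:name", '"Alice"'),
}
doc_b = {
    (":alice", ":child", ":bob"),
    (":alice", "foaf:name", '"Alice"'),  # restating a known fact is harmless
}


def properties_of(graph, subject):
    """Collect every property asserted about a subject, with all its values."""
    result = {}
    for s, p, o in graph:
        if s == subject:
            result.setdefault(p, set()).add(o)
    return result


# With no blank nodes involved, merging two graphs is just set union.
merged = doc_a | doc_b

print(sorted(properties_of(doc_a, ":alice")))   # properties known from doc_a alone
print(sorted(properties_of(merged, ":alice")))  # doc_b has added a new property
```

Nothing in doc_a declared a ":child" slot for :alice, yet after the merge she has one. A constructor-style reading of the data would have suggested that the set of properties was closed when the object was "created".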
> * So let's talk about literals, now. No one take offense, but I
>   personally have always thought that "value"^^eg:datatype is just
>   plain ugly. And it shows that RDF never quite knew what to do with
>   literals, especially not typed literals. If everything is a
>   resource and everything can be described by a three-part
>   proposition, what is eg:datatype? Is it a property of the literal
>   "value"?

I think a datatype -- or a unit -- is best modeled as a property
relating the data value to the bare number.

    :bed :length [ si:meters 2].

or using path notation

    :bed :length 2^si:meters.

This has all kinds of nice properties, like

    si:hz owl:inverseOf si:seconds.

So when the RDF folks wanted an NTriple notation for a datatype built
in to a value, I suggested ^^ as analogous to ^.

>   But literals can't have values.

That's a bug IMHO in RDF/XML. I understand it isn't true in SPARQL.
It isn't true in full N3.

>   And surely the literal
>   "value" cannot have multiple conflicting datatypes. And how is an
>   xsd:string different from a plain literal? In RDF 2.0, I would
>   like to move literals away from their strange quasi-resource
>   status so that *everything* is a resource, For Real. An integer
>   such as 123 is a resource, and it might be *defined" by the
>   sequence of characters "123",

I agree that thinking of an integer as a Resource is fine, in that
123 is a Thing, like everyThing else. That does not mean we should
symbols and literal values in the language. I think it is fine to
have 123 (note no quotes) as a literal in N3, which it is. I think it
fine to say that that sequence of characters in the language
identifies the number 123, which is a member of the class of
Integers, much as a URI identifies another resource. I think in fact
it's also fine to make URIs and say they also represent a number,
e.g. ex:bicycleWheels = 2.
(= is owl:sameAs in N3)

I don't, however, think it works to have rdf:about as a single
property (or even XML attribute) relating 123 to the string "123".
For example, suppose we want to model octal numbers and decimal
numbers. I much prefer to concentrate on the number 123 as an
Integer, and have separate properties decimal and octal relating it
to different strings, than to imagine separate classes of Decimal
Integer and Octal Integer.

>   but that's different than saying
>   that "123" is a resource with a data type of xsd:integer (which is
>   not true). I'd like to see something like <xsd:integer
>   rdf:literalAbout="123"/>, where "123" plays an analogous role to
>   rdf:about. This is the same thing has saying <rdf:Description
>   rdf:type="xsd:integer" rdf:literalAbout="123"/>. Suddenly we get
>   typed literals (distinct from strings!) that can take properties
>   and appear in lists with no problem.

The fact that literals can't appear in lists is clearly a bug. So I
think we should have an RDF 1.2 to fix that ASAP.

>   And (finally) going back to
>   RDFON, we see that eg:datatype("value") is really just
>   instantiating an eg:datatype class with a lexical identifier
>   instead of a URI identifier.

If you look at that as an object initialization function, then that
maps to a binary predicate, which is my model above. I prefer very
much to have a datatype-specific one such as dt:decimal.

One more note on datatypes. The term in the RDF abstract language
which N3 writes as 123 and NTriples writes as 123^^xsd:integer (and
which I model as [ xsd:integer 2] or 2^xsd:decimal) is in practice
typically stored in RDF stores as some object like

    {termType: 'literal', value: "123", dt_URI: "http://...integer", lang: null }

This is a term in the language. It isn't the resource 123. Many
literal terms can identify the same Integer. 0123 is one, for
example, and 00123 another, not to mention the decimal 123.0 and the
octal etc etc.
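The distinction between a literal term and the number it identifies can be sketched as follows. This is a rough illustration, not how any particular RDF store is implemented. The Literal class mirrors the fields of the object above with Python-style names; the PARSERS table, the interpret function and the octal datatype URI under example.org are assumptions invented for the example.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional

XSD = "http://www.w3.org/2001/XMLSchema#"
OCTAL = "http://example.org/dt#octal"  # hypothetical datatype, for illustration


@dataclass(frozen=True)
class Literal:
    """A term in the language: a lexical string plus a datatype URI."""
    value: str                  # the lexical form, e.g. "0123"
    dt_uri: str                 # which interpretation relates string to value
    lang: Optional[str] = None
    term_type: str = "literal"


# Each datatype is treated as a relation from a string to the thing it denotes.
PARSERS = {
    XSD + "integer": int,
    XSD + "decimal": Decimal,
    OCTAL: lambda s: int(s, 8),
}


def interpret(term):
    """Map a literal term to the value it identifies."""
    return PARSERS[term.dt_uri](term.value)


terms = [
    Literal("123", XSD + "integer"),
    Literal("0123", XSD + "integer"),
    Literal("00123", XSD + "integer"),
    Literal("123.0", XSD + "decimal"),
    Literal("173", OCTAL),
]

# Five distinct terms, one number.
print(len(set(terms)), all(interpret(t) == 123 for t in terms))
```

Asking for "the datatype" of the number 123 has no single answer here, since each of the five terms reaches it through a different string and a different interpretation.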
It is tempting in built-in functions to add a primitive for accessing
the datatype from the term. But then you can't access the datatype of
an Integer. At the raw RDF graph level, you could write

    { ?x rdf:datatype xsd:Integer } => { ?x rdfs:label "Should be an Integer." }.

but once a system has any notion of the trivial inferences around
datatypes, then that becomes an inappropriate question: ?x might have
been calculated as the sum of two numbers, and many terms could
represent it. So when a datatype is a relationship between a typed
value and an untyped string, then a typed value does not "have" a
unique datatype.

((You could say

    { ?x xsd:Integer ?y } => { ?x rdfs:label "Should be an Integer." }.

meaning "If there is a ?y such that ?y is the representation of ?x as
an integer, then label ?x as an integer" ))

>   It makes sense to common programmers
>   using RDF 1.x, and it points to where I want to go with RDF 2.0,
>   in which everything is a resource, For Real, and a typed literal
>   is not just a kludge on top of a kludge.
> * Common programmers expect a comma in a list such as
>   ( "Smith" "Van Buren" ). Perhaps propositional logicians do not.

I agree.

((Well, anyone who has used S-expressions in any form has come to
accept Lots of Irritating Silly Parens, as some put it, and would be
driven crazy by a bunch of commas. :)

Note the comma is used in N3 for repeated objects, where I find it
very natural. So one idea was to use commas to express lack of
ordering, i.e. make a set a comma-separated list, and an ordered set
space-separated. This made the amount of look-ahead in the language
too great. ))

These syntactic decisions are about so many compromises about using
different communities' metaphors in the language design, and
subjective notions of cleanliness.
Some of the design decisions made in the N3 design are noted, with
the alternatives and some arguments for and against, in
http://www.w3.org/DesignIssues/N3Alternatives

The idea of properties linking a string and the string interpreted,
for language and datatypes, is described in the 1998 article (maybe
dated in parts) DesignIssues/InterpretationProperties.html -- see
specifically DesignIssues/InterpretationProperties.html#Interpreta1
which was, I think, added in 2001.

There may have been propositional logicians involved in the design,
but a core value was expressing data as in data formats,
tab-separated text, etc.

Tim
Received on Tuesday, 31 July 2007 17:47:20 UTC