- From: Kingsley Idehen <kidehen@openlinksw.com>
- Date: Thu, 13 Jun 2013 13:01:56 -0400
- To: public-rdf-comments@w3.org
- Message-ID: <51B9FB04.9020400@openlinksw.com>
On 6/13/13 11:16 AM, Gregg Reynolds wrote:
> ... from a Concerned Citizen. For what it's worth, I'm quite familiar
> with RDF but have not been following the various relevant WGs for some
> time and only just got around to reading the JSON-LD draft, mainly
> because I happened to notice the recent discussion about whether RDF
> should or should not be mentioned etc., so I'll regale you with my
> impressions in hopes they might be useful.
>
> [P.S. It turns out I have a specific idea for satisfying both pro-
> and anti-RDF camps, see below.]
>
> First impression: where's the RDF? I was expecting to see something
> in the non-normative sections explaining or demonstrating how JSON-LD
> maps to RDF or vice-versa. Instead all I find is what amounts to a
> couple of footnotes. Which would have left me perplexed - what is
> this beast? - had I not seen the discussion about RDF phobia etc.
>
> Example 1:
> {
>   "name": "Manu Sporny",
>   "homepage": "http://manu.sporny.org/",
>   "image": "http://manu.sporny.org/images/manu.png"
> }
> "It's obvious to humans that the data is about a person whose name is
> 'Manu Sporny'..."
>
> This is plainly a false claim. I see a set of three ordered pairs,
> and I see no reason whatsoever to think that such a set is "about"
> anything. If I'm told that it is about something and am asked to
> guess what, there's a pretty good chance that "a person named 'Manu
> Sporny'" is the last thing that would come to mind. It seems much
> more likely that I (in my "Everyman" hat) would say it's about the
> homepage or the image of said Manu. On the other hand, knowing about
> RDF as I do, I see why the claim was made. Which strongly suggests
> that RDF is after all central to JSON-LD.
>
> And this is the crux of the matter: it's all about aboutness. More on
> this below.
>
> I also expected to see some kind of translation from JSON-LD
> expressions to triples and found it annoying that this was not the
> case, since it left me continually wondering if I was
> misunderstanding. After all, if it's supposed to "work" for RDF, but
> it pointedly excludes talk of RDF, well, maybe it's supposed to be
> something else - what? In other words, omission of RDF-talk is not
> just an expression of accommodation to the RDF-phobes, it's an
> expression of (mild) hostility to RDF-philes. At least that's how I
> take it.
>
> Another thing that jumped out at me: @type. Is that rdf:type? Sure
> seems like it ought to be but I can't really tell without spending
> time and energy analyzing. Seems to me the spec ought to save me the
> trouble by explicitly describing how the JSON-LD stuff relates to the
> RDF stuff.
>
> There are a number of typos, grammatical errors etc. that I'll list in
> a separate message.
>
> More generally, in light of the LD v. RDF struggle: I get the distinct
> impression that in trying to satisfy the RDF-phobes, the WG has thrown
> the RDF-philes under the bus.
>
> Even more generally regarding LD, RDF, etc.: in my view there is some
> deep confusion in the land about "Linked Data". I notice in a number
> of places (in the discussions on the list and minutes of teleconfs)
> that people make claims to the effect that linked data - er, Linked
> Data - is just HTTP IRIs that are dereferenceable. An indirect example
> from one of the messages:
>
> "IMHO, RDF != Linked Data. Nothing in RDF requires IRIs to be
> dereferenceable ..."
>
> The clear implication being that dereferenceability is what demarcates LD.
>
> But then we also have claims like the following (from
> http://json-ld.org/minutes/2011-07-04/#topic-3): "Linked Data is used
> to represent a directed graph, and within the context of Linked Data,
> the graph can be represented as connections between different nodes,
> nodes are subjects and objects, links are properties.
> Nodes may have identifiers that are URIs allowing them to be
> externally addressed."
>
> Note: no mention of dereferenceability as a criterion of demarcation.
>
> I think one problem is a clash or at least lack of clarity regarding
> the relation of formalisms to pragmatics. You don't need the web to
> describe graph structures. You do need the web to have dereferencing.
>
> A related problem is that lots of people seem to take "Linked Data" to
> refer to a kind of data - the dereferenceable kind - that can be
> "defined" as such. The four items in TBL's original design note on LD
> are then taken as definitional of what LD is. This is a mistake. First
> of all, TBL's note is explicit: those four items are "expectations of
> behavior", or as I would put it, descriptions of normative practices.
> Second, and more critically, dereferenceability CANNOT be used to
> define a kind of data in isolation. It is not a property of data,
> it's a variety of data use. It's probably better construed as a
> system property (although that's not entirely right). If it were a
> property of data, then LD would cease to be LD as soon as the server
> is taken offline, or the client loses network connectivity. You would
> also be able to tell if a datum is LD by looking at it (rather than
> using it). Treating dereferenceability as definitional of LD confuses
> matters of fact with norms of practice. It also tends to lead to
> quasi-metaphysical debates involving claims of the form "but LD is/is
> not xyz", or "but RDF is/is not LD" (or vice versa). But it's not
> about metaphysics, it's about pragmatics: what you do with the data,
> how you treat it.
>
> Just to be clear: if you write a Java program that violates the
> syntactic rules of the language, you have not written a bad Java
> program, you have written something that is not a Java program.
> But if you publish (or claim to publish) LD without providing for
> dereferencing of the IRIs (for example), you have published bad LD,
> not something other than LD. Or perhaps it would be more accurate to
> say you have made an unwarranted claim. That a program is not Java is
> provable - it won't compile - so the truth of the claim is decidable
> and categorical - yes or no. That some LD is bad isn't really provable
> in that sense, since the web changes - the claim can be contested but
> not decided by proof. Plus lots of data will mix dereferenceable and
> non-dereferenceable IRIs, and HTTP and other schemes.
>
> From this perspective, the first paragraph of the intro should be
> rewritten. First, Linked Data is not a technique, it is a set of
> normative practices. "Technique" implies (in my opinion) procedure,
> algorithm, or law-like rules that necessarily lead to correctness,
> which is not what LD practices are (you can't guarantee
> dereferenceability, for example.) Second, mentions of Linked Data
> "properties" should be removed, or replaced by mention of practices,
> norms or the like.
>
> Now you might just say "so what?" Is there any real harm in treating
> LD as a definite kind of data rather than norms for using data? Maybe
> not, in the grand scheme of things, but in addition to the advantage
> of clarity there's another reason to adopt something like the vocab
> I've suggested for talking about LD (and RDF). Which I can sum up in
> two principles:
>
>   The Web is about aboutness.
>   Aboutness on the web is purely pragmatic - a matter of norms
>   governing how we use/treat things, not what they intrinsically
>   (objectively, naturally, etc.) are.
>
> The third of the four "properties" listed in the intro (which draws on
> TBL's note) is "the name IRIs
> <http://json-ld.org/spec/latest/json-ld/#dfn-iri>, when dereferenced,
> provide more information about the thing".
> My impression is that most people take "dereferenced" to be the key
> term in that clause. But that's wrong; the key term is "about". And I
> suspect that a lack of clarity about what "about" is about is the
> source of much of the confusion that has always accompanied semantic
> web talk in its many forms. There are at least three varieties of
> aboutness involved. (Ok, I know this is starting to sound very arcane
> and philosophical but bear with me - in the end it is very simple,
> clear, and easily explainable by example.)
>
> * Denotational aboutness. We use IRIs to name (refer to, denote)
>   things. This is a purely pragmatic matter; IRIs do not in and of
>   themselves name anything. Only insofar as we treat them as names
>   do they function as names. (Note that the English meaning of
>   "about" may cause confusion here - we don't normally say that e.g.
>   "The name 'Napoleon' is about Napoleon". So here "aboutness" just
>   means directed to something.)
> * Implicit claim aboutness. Given
>   <a href="http://.../Napoleon.html">Napoleon</a>, the practical norm
>   is that the HTML document named by the URI should be about
>   Napoleon, at least in general; implicitly, this syntax expresses a
>   claim that the HTML page is about Napoleon. The critical point
>   here is that this is implicit; the formal requirement is only that
>   the browser should arrange for the URI to be dereferenced when
>   "Napoleon" is clicked. Nothing in the syntax is defined as a
>   claim. That the content should be about Napoleon is a matter of
>   social convention (norms).
> * Explicit claim aboutness. We want to be able to say something
>   more than simply "this webpage is about Napoleon"; for example, we
>   want to be able to express the claim that Napoleon's wife was
>   Josephine. There is no way to do this implicitly. You could
>   design an XML language that includes a "Napoleon" tag with a
>   "wife" attribute, but we want generality.
>   RDF provides one solution to this problem - it explicitly (more or
>   less) stipulates that a triple is to be taken as a claim about its
>   first term referent.
>
> (I just made this up so the language can no doubt be significantly
> improved but I think it gets the point across.)
>
> (Incidentally, this approach suggests a way of presenting RDF that may
> be an improvement on the S-P-O vocabulary. E.g. in RDF a claim is
> expressed as a topic plus a comment about the topic. The comment
> consists of a qualifier and a complement. Yielding
> Topic-Qualifier-Complement treated as Topic-Comment, instead of S-P-O.
> etc.)
>
> Now we're in a position to see the problem with LD "definitions".
> They don't say what kind of aboutness is involved where dereferencing
> occurs. If it were only a matter of dereferencing IRIs to yield data
> about something then the HTML web is by definition a Linked Data web.
> But it seems to me that the criterion of demarcation should be
> whether or not we can make explicit, qualified claims. (By
> "qualified" I mean that the middle term of a triple serves to qualify
> the relation between the topic and complement, e.g. in "Franklin
> invented bifocals", "invented" tells us what kind of relation obtains
> between Franklin and bifocals.)
>
> Both RDF and JSON-LD are species of the genus of making explicit
> claims about things. It isn't clear to me if LD is too.
>
> Ok, so the potential payoff with respect to JSON-LD is that this
> vocabulary of claims and aboutness would allow us to explicitly
> address the core of what RDF is about without talking explicitly about
> RDF. So for Example 1 from the spec, one could introduce the concept
> of expressing a claim about something, show the JSON-LD expression,
> and explicate it in terms of topic (the person, Manu Sporny) and
> comment (his homepage is at http://...). This could be done using any
> number of regimented quasi-formal schemes, including pseudo-English.
> (Note by the way that in many languages it is the norm to talk in
> just this way: instead of "Manu Sporny's homepage is http://..." one
> says something like "Manu Sporny, his homepage is http:...")
>
> Having said all that, I can live with the spec as it is; the WG need
> not spend time formulating any kind of official response to this. I
> just wanted to provide some feedback (and I confess I think the stuff
> about pragmatics, aboutness, and claims is kind of an interesting
> approach so I wonder if anybody else does too.) JSON-LD will sink or
> swim on its technical merits; either way, relatively few people will
> read the spec (anybody read the SQL spec lately?). If it takes off,
> we'll see lots of blog posts and some books explaining it. So the
> non-normative sections just need to be "good enough".
>
> Thanks for all the hard work,
>
> Gregg Reynolds
>

Gregg,

I would like to qualify a few things about statements such as "RDF != Linked Data" made in connection with dereferencing.

Web-like Structured Data Representation:

TimBL's meme outlined a principled approach to structured data representation that results in a Data Web, or Web-like structured data, courtesy of HTTP URIs. This approach lets Web-like structured data scale to the expanse of the scale-free Web.

In his original meme [1] he indicated that this principled approach enables one to look up what an HTTP URI denotes, while also indicating to publishers that useful information should be accessible from the look-up location. In the revised meme [2] he added "using standards (RDF, SPARQL)". This introduced the problem of using words that mean different things to different audiences, which I break down as follows (un-pejoratively):

1. RDFers -- RDF and Linked Data (this thing with appreciable momentum) are now inextricably linked, we'll show them now!
2. RDF-Refluxers -- Linked Data is just a re-branding of RDF, they think we are stupid!
3. Independents -- WTF!
   (pardon my French, but I want to signal as effectively as possible in this response).

In reality, bearing in mind my proximity to TimBL re. these matters of Linked Data, I genuinely believe he meant (but of course I do not speak for him):

Use standards such as RDF and SPARQL as an effective (productive) way to provide really useful information when HTTP URIs (as outlined in this note) are looked up.

In practice, when you get round to implementing Linked Data (as a publisher), all you have to do is redirect user agent URI look-up requests to a SPARQL protocol URL, which leaves the heavy lifting to a SPARQL processor (which may or may not be a full-blown DBMS engine).

As for this whole JSON-LD and RDF affair, there is one subtle detail that makes matters more challenging: JSON-LD is seeking to be published as a deliverable from the RDF group. Given that route, many of the RDF-related push-backs become much more understandable.

Anyway, here are some links for additional context re. my comments on Linked Data as Web-like structured data:

1. http://www.w3.org/DesignIssues/diagrams/history/proposal-fig1.gif -- illustration from original Web design document (this is clearly depicting a Data Web woven together via the typed relations illustrated as connections/connectors)
2. http://www.w3.org/2005/Talks/1110-iswc-tbl/#(4) -- that's Linked Data 101
3. http://www.w3.org/2005/Talks/1110-iswc-tbl/#(7) -- URIs + HTTP (this is what makes data Web-like)
4. http://dig.csail.mit.edu/2007/Talks/0511-tab-tbl/#(10) -- Linked Data (again, clearly defined and distinct from RDF).

Inserting Logic into Linked Data (i.e., taking it from our "mind's eye" to a Data Web accessible to humans and machines) is where RDF kicks in, with aplomb. Unfortunately, due to issues associated with OWL misconceptions, compounded by RDF/XML dominating OWL examples, many RDFers are reluctant to utter the words "inference" or "reasoning", since they assume those are the issues that scare folks.
A current example of that is easily discernible from some of the longer threads on the W3C's LDP list.

Pat Hayes gave a very nice presentation on Blogic [1] that puts this issue of Logic and Data Webs in scope.

Links:

1. http://slidesha.re/18CtxGK -- Blogic
2. http://videolectures.net/iswc09_hayes_blogic/ -- What's in a Link?

-- 

Regards,

Kingsley Idehen
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
Attachments
- application/pkcs7-signature attachment: S/MIME Cryptographic Signature
Received on Thursday, 13 June 2013 17:02:21 UTC