Re: Important Question re. WebID Verifiers & Linked Data from Henry Story on 2011-12-22 (public-xg-webid@w3.org from December 2011)

From: Henry Story <henry.story@bblfish.net>
Date: Thu, 22 Dec 2011 09:48:37 +0100
To: Kingsley Idehen <kidehen@openlinksw.com>
Cc: Patrick Logan <patrickdlogan@gmail.com>, WebID XG <public-xg-webid@w3.org>
Message-Id: <C495A461-1678-4630-8C63-145CB9D45D4D@bblfish.net>
On 22 Dec 2011, at 04:42, Kingsley Idehen wrote:

> On 12/21/11 3:45 PM, Patrick Logan wrote:
>> 
>> See below...
>> 
>> On Wed, Dec 21, 2011 at 8:27 AM, Henry Story <henry.story@bblfish.net> wrote:
>> 
>> On 21 Dec 2011, at 14:58, Kingsley Idehen wrote:
>> >
>> > Please understand that RDF != Linked Data. It's just one of the options for creating and publishing Linked Data.
>> 
>> I think it would be very nice to have a formal spec on what Linked Data is. We do have a few for
>> RDF.
>> 
>> Yes, please. I understand the RDF-related specifications. And I understand the general notion of "linked data". I am someone following the WebID effort, and who is contemplating the costs and benefits of supporting it in my products at some point.
>> 
>> Unless I see a WebID specification (or a more general "world-wide linked data specification") for how to support linked data beyond RDF, how can I estimate the costs and benefits of supporting linked data beyond RDF?
>> 
>> Please understand that "publishing and consuming any and all possible interpretations of 'linked data'" is probably impossible. 
>> 
>> So what are the specific requirements?
>> 
>> -Patrick
>> 
> Patrick,
> 
> Here are the fundamental requirements:
> 
> 1. a Data Item (Object or Entity) is uniquely identified by a URI based Name
> 2. use de-referencable URIs as Names that resolve to *descriptor* documents (resources) that describe the URI's referent 
> 3. descriptor documents (resources) should *consist of* or *bear* structured data in the form of eav/spo triple (3-tuple) statements that collectively form a directed graph pictorial that coalesces around the description subject's URI.
> 
> You can make eav/spo statements using a variety of syntaxes. 
> You can serialize eav/spo bearing resources across the wire using a variety of data serialization formats. 
> You can leverage HTTP as a low cost and effective mechanism for:
> 
> 1. de-referencable URI based Names
> 2. negotiation of data serialization formats between servers and clients (user agents).
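To make the second point concrete, here is a minimal sketch of server-side format negotiation driven by the HTTP Accept header. The list of offered media types and the function names are assumptions made up for illustration; nothing here comes from the WebID spec. It also simplifies real HTTP negotiation by looking only at q-values and ignoring specificity rules.

```python
# Hypothetical serializations a server could offer for one descriptor document.
OFFERED = ["text/turtle", "application/rdf+xml", "text/html"]


def parse_accept(header):
    """Parse an HTTP Accept header into (media_type, q) pairs."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # HTTP default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        prefs.append((media_type, q))
    return prefs


def negotiate(accept_header, offered=OFFERED):
    """Return the offered media type with the highest client q-value, or None."""
    best, best_q = None, 0.0
    for media_type, q in parse_accept(accept_header):
        for candidate in offered:
            matches = (
                media_type == candidate
                or media_type == "*/*"
                or (media_type.endswith("/*")
                    and candidate.startswith(media_type[:-1]))
            )
            # Ties keep the earlier match, so server order breaks ties.
            if matches and q > best_q:
                best, best_q = candidate, q
    return best
```

A client sending `Accept: application/rdf+xml, text/turtle;q=0.8` would get RDF/XML from this sketch, while a client that only accepts a format the server does not offer gets `None`, which a real server would turn into a 406 response or a default format.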
> 
> The WebID spec can require or suggest a number of common formats for eav/spo triple transmission as the basis for effective bootstrap. 
> The WebID spec should never (overtly or covertly) be a tool for fighting or prolonging syntax or format wars re. eav/spo triples. 
> The WebID spec should never leak the abstraction inherent in URIs by mandating http: scheme URIs.
> The WebID spec should never compromise the fidelity of Linked Data by favoring a particular style of de-referencable URI.
> The WebID spec is a spec. It shouldn't attempt to teach software engineering. 
> 
> All of the above is possible without adversely affecting WebID. In short, all of this will make WebID attractive to much broader developer profiles that extend beyond the RDF based Semantic Web community. 
> 
> What's the difference between RDF and EAV? 
> 
> RDF explicitly (via spec) requires the use of URIs in the S, P, and (optionally) O slots of triple statements. It also handles typed literals and language tags. Basically, it caters for locale issues and i18n. 

In fact RDF semantics does not do this. It is only the RDF serialisations that do, and not for bad reasons. RDF semantics allows numbers in predicate positions, though these don't have much use there. There is even an example of this in the spec: http://www.w3.org/TR/rdf-mt/

You can map every EAV statement to SPO, and in fact infer each from the other. SPO is much cleaner theoretically than EAV as it is purely relational, and everything can be mapped down to relations in the end.
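A minimal sketch of that mapping, assuming a generic EAV table and two made-up namespaces used to mint URIs. The rows, the namespaces and the function names are all hypothetical. The inverse function also assumes that no literal value happens to start with the entity namespace.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: object


# Hypothetical EAV rows: (entity, attribute, value).
eav_rows = [
    ("henry", "name", "Henry Story"),
    ("henry", "knows", "kingsley"),
]

# Assumed namespaces for minting URIs; illustrative only.
ENTITY_NS = "http://example.org/people/"
ATTR_NS = "http://example.org/vocab#"


def eav_to_spo(rows, entity_attrs=("knows",)):
    """Map EAV rows to SPO triples by minting URIs for entities and attributes."""
    triples = []
    for entity, attribute, value in rows:
        # Values of entity-valued attributes become URIs; others stay literals.
        obj = ENTITY_NS + value if attribute in entity_attrs else value
        triples.append(Triple(ENTITY_NS + entity, ATTR_NS + attribute, obj))
    return triples


def spo_to_eav(triples):
    """Inverse mapping: strip the namespaces to recover the EAV rows."""
    rows = []
    for s, p, o in triples:
        is_entity = isinstance(o, str) and o.startswith(ENTITY_NS)
        value = o[len(ENTITY_NS):] if is_entity else o
        rows.append((s[len(ENTITY_NS):], p[len(ATTR_NS):], value))
    return rows
```

Calling `spo_to_eav(eav_to_spo(eav_rows))` gives back the original rows, which is the sense in which nothing is lost by speaking SPO instead of EAV.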

So really all that Kingsley is doing is changing the language, because he thinks the language was the problem with RDF. In fact I think quite a number of other issues were the problem, such as the lack of decent tooling and developers not even understanding REST at the time. XML had just come out and promised to do everything, which shows just how syntactically people were thinking at the time. 

In the meantime Java and .NET came out, and with them the idea that multiple languages could be used to program the same machine, and that the machine itself could be virtual. So the notion of semantics is now very widespread in the procedural space.

So in my view this is the wrong time to shift. It is as if you had asked people about object-oriented programming in 1992. There would have been huge arguments about its complexity (C++) or its academic nature (Eiffel, ...); then Java came out, and 5 years later you could not program without doing OO. 

OWL is just OO for relations btw.

> 
> EAV isn't as specific as RDF with regards to the items above. At the same time EAV is well known, and doesn't carry the political baggage of RDF. 

I am not sure RDF carries any political baggage. All Linked Data is done in RDF, and it is growing very fast. People are coming to understand that RDF is about semantics and relations, and I get very little negative feedback about RDF. Even Facebook is publishing Turtle now! 

> 
> What's the difference between RDF and Linked Data?
> There is nothing in the RDF spec (even as I write) that specifies the behavior of URIs used in SPO triples. For instance RDF doesn't explicitly distinguish between a URI that serves as a Name and a URI that serves as a Resource Locator (Address). 

And that is very good. RDF is about semantics and logic. Linked Data is a pattern for publishing this data. Linked Data still has to be fully specified by the W3C, but it is obviously exactly how Tim Berners-Lee intended RDF to be used. The RDF logicians he employed did not necessarily understand this, because they had a different background.


> Linked Data is very specific about URI behavior. It expects URI based Names and Addresses to be distinct. This is why it's always problematic to imply (overtly or covertly) that RDF == Linked Data. The *truth* of the matter is that RDF is one approach to constructing directed graph based descriptor resources (comprised of eav/spo triples) that result in Linked Data at various scales (local area network or wide area network e.g., the InterWeb).
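To illustrate the Name/Address distinction, here is a small sketch that assumes the hash-URI style, which is only one of several ways to keep the two apart (a 303 redirect is another). The example URI and the function name are hypothetical.

```python
from urllib.parse import urldefrag


def name_and_address(uri):
    """Split a hash URI into the document address and the fragment.

    The full URI is the Name of the thing described; the part before
    the '#' is the Address of the descriptor document a client fetches.
    """
    address, fragment = urldefrag(uri)
    return address, fragment


# Hypothetical WebID-style URI naming a person.
webid = "https://example.org/people/alice/card#me"
address, fragment = name_and_address(webid)
# A client would dereference `address` and then look for statements
# about `webid` inside the document it gets back.
```

Here the Name is the whole of `webid`, while only `address` ever goes over the wire, since HTTP clients drop the fragment before making a request.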

RDF is mathematically defined, so it covers all ways of doing relational stuff. You can of course re-invent it, but it would then take no time to map that back to RDF, so your time would be wasted.

I will therefore continue speaking in terms of RDF and Linked Data. Those are clearly understood and extremely well specified, and there are now 12 years of accumulated knowledge in that space. We also have the tools to do things there. Plus we are at the W3C here and we can use its vocabulary. NASA uses RDF, governments use RDF, companies use it, social networks use it, all in more and more Linked Data ways.

Perhaps some crowds need to be talked to in terms of EAV, but we are not going to write a spec in Latin to make the Vatican happy either.

Henry

> 
> Regards,
> 
> Kingsley Idehen	      
> Founder & CEO 
> OpenLink Software     
> Company Web: http://www.openlinksw.com
> Personal Weblog: http://www.openlinksw.com/blog/~kidehen
> Twitter/Identi.ca handle: @kidehen
> Google+ Profile: https://plus.google.com/112399767740508618350/about
> LinkedIn Profile: http://www.linkedin.com/in/kidehen
> 
> 
> 
> 

Social Web Architect
http://bblfish.net/
Received on Thursday, 22 December 2011 08:49:21 UTC