Interaction and REST focus from Kjetil Kjernsmo on 2012-11-08 (public-ldp@w3.org from November 2012)

From: Kjetil Kjernsmo <kjetil@kjernsmo.net>
Date: Thu, 08 Nov 2012 20:56:56 +0100
To: public-ldp@w3.org
Message-ID: <2291156.t6iTF20v8M@owl>
Hi all!

First, I should congratulate you on such an excellent first public working 
draft of the Linked Data Platform! However, in doing so, you violated the 
first principle of FPWDs of my former boss chaals: He said that FPWDs 
should be really bad, to bring out all the fury of the people just sitting 
idly by and thus get all the arguments to the surface before putting too 
much effort into it. And while I appreciate your work, I shall have to come 
out in all my fury, which is too bad, since you already wrote a very 
comprehensive and thorough spec. ;-) 

I think we, as a community, have become somewhat too narrow-minded and 
constrained to HTTP due to the Linked Data principles. That is not to say 
that HTTP isn't the most suitable protocol and that HTTP URIs aren't the 
best URIs, they are, but it steals focus from what we should be thinking 
about first. 

I think the charter has a clear mandate for a non-HTTP focus, even though it 
also has a clear misunderstanding of HTTP in REST. This misunderstanding is 
so common that Roy Fielding has commented on it in a blog post:
http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven

"A REST API should not be dependent on any single communication protocol," 
which is broken in the very first requirement: "4.1.1 LDPR servers must at 
least be HTTP/1.1 conformant servers".

So, the problem here is that people generally agree that REST is important, 
but they aren't sure why. Now, let me shamelessly plug my paper from the 
ESWC LAPIS2012 workshop: "The necessity of hypermedia RDF and an approach 
to achieve it", with paper and slides (you may prefer the latter) 
respectively at 
http://folk.uio.no/kjekje/2012/hypermedia-rdf.pdf
http://folk.uio.no/kjekje/2012/lapis2012.xhtml

In there, I hint at a different focus: See, the thing isn't just that we 
need to use HTTP verbs to read or write stuff, that's the trivial part. 
What we need to think about is what kinds of interactions are made possible by 
the protocol. Now, with this spec, if I have some resource, I still have to 
look up your out-of-band specification to figure out how I can edit it, 
though with HTTP, I can figure out that I can edit by looking at the Allow 
header by taking the extra effort of doing a HEAD first.
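
The HEAD-then-Allow step can be sketched as follows. This is only a minimal 
illustration, not anything the LDP draft prescribes: the function names are 
made up for the example, and the set of methods treated as "write" methods 
is an assumption.

```python
from http.client import HTTPConnection

# Assumption for this sketch: these are the methods we count as "editing".
WRITE_METHODS = {"PUT", "POST", "PATCH", "DELETE"}


def parse_allow(header_value):
    """Split an Allow header value such as 'GET, HEAD, PUT' into a set."""
    if not header_value:
        return set()
    return {m.strip() for m in header_value.split(",") if m.strip()}


def editable_methods(host, path, port=80):
    """Send a HEAD request and return the write methods the server advertises."""
    conn = HTTPConnection(host, port, timeout=10)
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        allowed = parse_allow(response.getheader("Allow"))
    finally:
        conn.close()
    return allowed & WRITE_METHODS
```

Note that this costs an extra round trip and still tells the client only 
which verbs exist, not what they mean for this resource.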

In my talk, I said that REST implies "don't make developers look up the 
spec"; let me reformulate that a bit: "Systems that require developers to 
frequently look up specifications will over time lose to systems that do 
not", or something like that. REST is there because it is far easier to 
just "View Source" and then you have what you need to do the Right 
Thing[tm]. Moreover, by having what you need right there, up front, in the 
message, you can automate things that you cannot automate if you need a 
human looking up the spec. That's partly why RDF is self-describing, right?

So, that's what we need to focus on: The message, not the protocol!

In my paper, I propose not only ways to express what write operations are 
allowed, but also some links to read-only resources. However, there is much 
more to be explored there. In the original Linked Data Design Issue, timbl 
introduces the concept of a "Browsable Graph":
http://www.w3.org/DesignIssues/LinkedData
where part of the definition is "Returning all statements where the node is 
a subject or object; and Describing all blank nodes attached to the node by 
one arc. " (go check it if the context isn't clear)
Note the object part, because I think we have a potential for a lot of 
confusion and incompatibilities there, since the current spec says:
"4.4.5 A LDPR client must preserve all triples retrieved using HTTP GET 
that it doesn’t change whether it understands the predicates or not, when 
its intent is to perform an update using HTTP PUT. "

So, following timbl's advice, we would PUT back triples that have the 
Request-URI as object, not subject. OMG, the horror! ;-) 
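
To make the problem concrete, here is a small sketch with triples as plain 
Python tuples. The example.org URIs and the function names are invented for 
the illustration, and the blank-node part of the "browsable graph" 
definition is left out.

```python
ALICE = "http://example.org/alice"
BOB = "http://example.org/bob"
CAROL = "http://example.org/carol"
KNOWS = "http://xmlns.com/foaf/0.1/knows"
NAME = "http://xmlns.com/foaf/0.1/name"

# Hypothetical server-side data as (subject, predicate, object) tuples.
GRAPH = [
    (ALICE, KNOWS, BOB),
    (BOB, KNOWS, ALICE),
    (BOB, NAME, "Bob"),
    (CAROL, KNOWS, BOB),
]


def browsable_description(graph, node):
    """Return every triple in which `node` is the subject or the object."""
    return [t for t in graph if t[0] == node or t[2] == node]


def object_only(triples, node):
    """Return the triples in which `node` appears as object but not subject."""
    return [t for t in triples if t[2] == node and t[0] != node]


# What a GET on ALICE would return under the "browsable graph" rule ...
description = browsable_description(GRAPH, ALICE)
# ... and the part a client obeying 4.4.5 would PUT back even though ALICE
# is not the subject of those triples.
surprising = object_only(description, ALICE)
```

Here `surprising` holds the triple that says Bob knows Alice: a statement 
about Bob that ends up in a PUT to Alice's URI.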

It gets worse though, because in the near future, the server might have much 
more data than is reasonable to communicate in a single message, and paging 
might be the wrong solution, since it doesn't say anything about what is 
relevant by any measure... 

I think that timbl's "browsable graphs" will also be obsolete in many 
cases.

You can't just return an RDF molecule, or a concise bounded graph, or an 
spo = Request-URI ?p ?o. Many of the reasons are already listed in the next 
section "Limitations on browseable data" in the Design Issue. Instead, 
Linked Data Information Architects (who is the first to have such a 
title? :-) ) would have to think carefully about the interactions they 
would enable by the data. And then, we're not just talking about replacing 
a resource or merging more triples into it. Perhaps there are a whole lot 
of sensor readings for a certain resource, for example; the architects may 
want to provide some digest and link to more readings, carefully thinking 
about what that digest should be, just as you don't put every relevant link 
in a hypertext document; that's up to the author. In some cases, the 
first thing an agent might want to do is to put sensor data that pertains 
to a certain resource, and lots of it. Making this interaction efficient 
will be the chief concern of the Linked Data Information Architect. Without 
this, Linked Data servers will be just like the web pages of 1996 that 
contained mostly links to other pages, but very little new information of 
value... 

However, this WG cannot foresee these developments, and it is a 1.0 spec 
after all. What I expect of 1.0 is a basic set of interactions and some 
constraints on what triples can be expected to be returned and what MUST be 
accepted and what SHOULD be accepted and what MUST NOT. 

To understand this last point, let's go back to the SPARQL Graph Store.
I helped conceive that spec back in the day, thinking we would one day 
understand what REST could do for us. Then, we got distracted by HTTP, 
indirect identification and all that kind of stuff, and it ended up in 
something non-RESTful. The main difference, as I see it, is this:

With the Graph Store spec, the Request-URI is just a name for a bunch of 
triples. It could be just about any bunch of triples, the Request-URI is 
not even likely to be subject or object of any of the triples, but I guess 
it could be.

On the Linked Data Platform, this is not the case, at the very least you'd 
expect that the Request-URI has to be the subject of some of the triples in 
the result, and the platform may legitimately reject or ignore triples sent 
to it that don't have the Request-URI as the subject, or that fail some other 
criterion. This makes the Linked Data Platform much more complex than just a 
Graph Store, but other than that, in particular considering the 
interactions you can do with it, it is the same beast.

The interactions you are allowed to do must be immediately clear when you 
have a representation of a resource; that's the most efficient way to do 
things, and to do that, it must be part of the message somehow.

I believe this is where most of the effort should be spent, because it is 
not going to be easy. Once that's done, adding HTTP is going to be easy. It 
would be particularly nice if that could be done in-band too, i.e. in terms 
of a vocabulary. In my talk, I proposed doing stuff like this on every 
resource that can be POSTed to:
<> hm:canBe hm:mergedInto .

Then, the object of that can be defined in a vocabulary like:
hm:mergedInto rdfs:comment "Perform an RDF merge of payload into resource"@en ;
              hm:httpMethod "POST" .
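
A client could act on these triples roughly as sketched below, again with 
triples as plain tuples. The namespace URI bound to hm: and the resource URI 
are placeholders invented for the example, since the message does not give 
them, and the function name is hypothetical.

```python
HM = "http://example.org/hypermedia#"  # placeholder namespace, an assumption
RDFS_COMMENT = "http://www.w3.org/2000/01/rdf-schema#comment"
RESOURCE = "http://example.org/resource/1"  # stands in for <>

# Triples found in the representation of the resource itself.
REPRESENTATION = [
    (RESOURCE, HM + "canBe", HM + "mergedInto"),
]

# Triples from the vocabulary that describes the interaction.
VOCABULARY = [
    (HM + "mergedInto", RDFS_COMMENT,
     "Perform an RDF merge of payload into resource"),
    (HM + "mergedInto", HM + "httpMethod", "POST"),
]


def advertised_methods(resource, representation, vocabulary):
    """Map each interaction the resource advertises to its HTTP method."""
    interactions = [
        o for s, p, o in representation
        if s == resource and p == HM + "canBe"
    ]
    methods = {}
    for interaction in interactions:
        for s, p, o in vocabulary:
            if s == interaction and p == HM + "httpMethod":
                methods[interaction] = o
    return methods


methods = advertised_methods(RESOURCE, REPRESENTATION, VOCABULARY)
```

The point of the sketch is that the client learns both that a merge is 
possible and which verb performs it from the data alone, without anyone 
reading a specification.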

Finally, I have only suggested a vocabulary, but you could turn this into 
an ontology. With HTTP verbs, all you're going to get is a literal. :-)

BTW, I'm landing in Boston for ISWC on Friday, and I'm all for a f2f chat 
and hack session on Saturday if anybody wants to meet up. My simple linked 
data platform is the Perl module RDF::LinkedData, which is on CPAN, and 
also in Debian and Ubuntu as librdf-linkeddata-perl. I have some unreleased 
code on github for the read-write support ideas, at least many of the tests 
are there.

Cheers,

Kjetil
-- 
Kjetil Kjernsmo
PhD Research Fellow, University of Oslo, Norway
Semantic Web / SPARQL Query Federation
kjetil@kjernsmo.net           http://www.kjetil.kjernsmo.net/
Received on Thursday, 8 November 2012 19:57:29 UTC