Re: decentralized use cases from Kevin Page on 2012-08-06 (public-ldp-wg@w3.org from August 2012)

From: Kevin Page <kevin.page@oerc.ox.ac.uk>
Date: Mon, 06 Aug 2012 19:58:08 +0100
To: Erik.Wilde@emc.com
Cc: public-ldp-wg@w3.org
Message-ID: <1344279488.1708.176.camel@pootle>

Hi,

On Thu, 2012-07-12 at 04:33 -0400, Erik.Wilde@emc.com wrote:
> coming from the REST background, i am still struggling a bit to understand
> an important aspect of building a platform, and so far my questions
> regarding this have not created a lot of feedback.

I've been catching up on the WG and have noticed this being raised --
but not, afaict, resolved -- in a number of threads. So this is mostly
to say: yes, I think there is a distinction here; yes, I am interested
in this and the issues it entails, e.g. non-RDF resources; and I would
be happy if this were on-topic for the WG (doesn't mean it is!).

I wonder if there is divergence of expectation of what a client-server
LDP interaction is doing.

On one side, I see a service providing RDF data that is browsed,
retrieved, combined, etc., where the resources are exposed and organised
according to how the data itself is modelled. Here an interaction with
the service is primarily about directly modifying that graph. I think
this covers traditional Linked Data, the SPARQL graph store protocol
(and perhaps even the member submission?). Are we saying that all
RESTful application interactions can be boiled down to adding, removing,
or changing bits of a graph?

From experience in a couple of projects, we haven't found this
sufficient when designing a service to support RESTful client-server
interaction -- particularly for writing data (no guarantee we haven't
just been wrongheaded, of course ;)  ). You need additional resources to
support the application that's being performed over/on/with the data
(the RESTful application state changes, if you like) and also a
mechanism to codify that interaction through e.g. link relations or
media types -- and here I think there is scope for the WG to provide
clarity and best practice (if it's on topic).

I think it is this "control" over the API Erik refers to -- the control
beyond generic graph manipulation of the complete underlying dataset.
Not to say access control isn't also important, of course. 

(and fwiw, in the use cases I've experienced this inevitably involves
resources that aren't RDF-derived, although you'll likely have reached
them via an RDF resource. RDF is not the right encoding or
representation for everything. These resources may sensibly be out of
scope for an LDP spec, but I think there are many cases where they will
be provided by the same server, so consideration of this interaction
seems important.)

Now if we consider a description of the service (what's sometimes been
referred to as the surface here) to *be* (part of?) the data exposed by
the LDP, does the distinction become blurred? Is there one combined graph
of both whereby the "internal" data could be kept private using access
controls while the surface resources are public? Can the same mechanisms
be used both to interact with service surface resources and for direct
data manipulation, i.e. is this a false distinction? Are BPRs and BPCs
sufficient for this and can we build RESTful application interactions
with them in combination with other resources?

Apologies for getting to this late; the start of the WG caught me on the
hop schedule-wise. I'm about to go on leave for a few weeks but would be
keen to see discussion on this aspect continue.

So a vote for "yes this is important".

cheers,

Kevin.



>  therefore i am still
> curious and interested to get some answers or maybe just comments or
> opinions. building a platform means it is something that others can build
> on. REST has two central aspects how it makes sure that no bad things can
> happen, and that things scale well:
> 
> - service interactions transfer state, not raw data. while state is of
> course data, it is different, because in most scenarios it means the
> transfer of intent, and not of raw facts. if i order something, i transfer
> my order intent, and if that order is accepted by the service, it will be
> transformed into whatever that service uses to internally represent a
> factual order (and how that service does that is none of my business). the
> service might give me access to my pending/completed orders, but again
> that's state info, and better be cleaned of information about other
> customers' orders, as well as internal info that i am not supposed to see
> (such as an internal ratings how much a service values me as a customer).
> so services decouple the service surface from the implementation
> internals, and that's crucial in many scenarios. coupling between service
> clients and providers happens on the service surface level, not at the
> level of how a service manages data.
> 
> - state transfer is based on expectations about what is being transferred,
> and often this means there are schemas and there is validation. schemas
> are the first line of defense, they are a declarative way of formalizing
> what a service expects to happen in the context of service interactions.
> in many cases, schemas are half-formal and half-informal, and often they
> also are contextual (a loose schema is accepted when i submit something, i
> can expect a strict schema when i get a response because the service has
> performed some cleanup). as a side-note, schematron has a nice way of
> handling this through the notion of "phases", something that other schema
> languages could learn from... without schemas, REST services become almost
> impossible to define, because they are needed to encode the
> "preconditions" and "postconditions" in all service interactions. if those
> conditions aren't met, there are standardized ways (such as HTTP status
> codes) to signal that disagreement happened on that interaction level.
> 
> these two mechanisms are the building blocks of REST in terms of isolating
> a platform's surface from its internals, and from the feedback so far it
> seems that the majority of the WG is in favor of going the route that andy
> described so well as a "shared, schemaless database." what i am struggling
> with is to reconcile the notion of that database approach, and a platform,
> which for me is something that implicitly means decentralized settings and
> trust relationships which do not necessarily imply complete and unlimited
> access to a service's database. i'd really appreciate if somebody could
> attempt to explain to me what i am missing. is it just that complete trust
> is implicitly assumed and REST's protective measures are not needed?
> 
> the concrete reason i am asking is that we are very interested in using
> linked data as a platform, but we also cannot go the route of a shared
> database model. we must have control over what a linked data platform
> exposes and what it does not expose, and we must have control over who can
> add what and when to a linked data platform. these are the questions we
> need to solve, and i am still trying to figure out how to best solve those
> questions in the context of this WG.
> 
> thanks and cheers,
> 
> dret.
> 
>
Received on Monday, 6 August 2012 18:58:32 UTC