snapshots, was Re: Stable locking from Sandro Hawke on 2014-04-07 (public-ldp-wg@w3.org from April 2014)

From: Sandro Hawke <sandro@w3.org>
Date: Mon, 07 Apr 2014 14:54:54 -0400
To: ashok.malhotra@oracle.com, "public-ldp-wg@w3.org" <public-ldp-wg@w3.org>
Message-ID: <5342F47E.40502@w3.org>

On 04/07/2014 11:29 AM, Ashok Malhotra wrote:
> Some of our developers have also been struggling with the paging issue.
> In the discussion one potential solution has emerged which may be 
> worth considering.
>
> When a client accesses a collection and starts to page thru it, the 
> server makes a copy
> of the collection (snapshot).  It then serves that client from that 
> snapshot.  The snapshot
> is deleted when the clients commits or aborts.

Yes, I called this "static" or "snapshot" paging.

But this is something one wants with or without paging.  It's a general 
thing that one wants a web server to do.   And it can be quite expensive 
to provide.  So I suggest(ed) we keep it orthogonal.

That is, some servers provide the feature where they allow clients to 
snapshot a resource, and use that snapshot for some time.   This would 
be useful for paging, certainly.

This is applicable outside of LDP.  For example, 
http://www.w3.org/TR/ldp/ has had a snapshot at 
http://www.w3.org/TR/2014/WD-ldp-20140311/ for some time.  For a while 
before that, the snapshot was http://www.w3.org/TR/2013/WD-ldp-20130730/ .

One simple design for this would be a Link rel=snapshot header.  The 
server could include that link on any resource for which it's willing to 
provide a snapshot, and the link would point to that snapshot.  The 
actual text of the link would probably include the timestamp or the 
etag.  Or a version number.
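
To make the Link rel=snapshot idea concrete, here is a minimal client-side 
sketch.  It assumes the server sends a standard Link header and that 
"snapshot" is used as the relation name as proposed above; it is not a 
registered relation, and the function name find_snapshot_link is 
hypothetical.  The parsing is deliberately simple and does not cover every 
corner of the Link header grammar.

```python
import re


def find_snapshot_link(link_header):
    """Return the target URL of the first rel="snapshot" link, or None.

    "snapshot" is the relation name proposed in this thread, not a
    registered link relation.
    """
    # Split on commas that are followed by the "<" of the next link-value.
    for part in re.split(r",\s*(?=<)", link_header):
        match = re.match(r"\s*<([^>]*)>(.*)", part)
        if not match:
            continue
        url, params = match.groups()
        rel = re.search(r';\s*rel\s*=\s*"?([^";]*)"?', params, re.IGNORECASE)
        # A rel parameter may hold several space-separated relation types.
        if rel and "snapshot" in rel.group(1).lower().split():
            return url
    return None


# Example value a server might send for http://www.w3.org/TR/ldp/
header = (
    '<http://www.w3.org/TR/ldp/>; rel="latest", '
    '<http://www.w3.org/TR/2014/WD-ldp-20140311/>; rel="snapshot"'
)
print(find_snapshot_link(header))
```

A client that finds such a link could then page through the snapshot URL 
instead of the live resource.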

This design has the advantage of being very simple, but it has some 
weaknesses.  For resources that will have many versions that are never 
snapshotted, it's more expensive for the server than necessary, because 
the server needs to do some work on every request, not just the ones 
where the client wants a snapshot.  That could be addressed with a 
Prefer: make-snapshot request from the client, I suppose.
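
A rough server-side sketch of that opt-in behaviour follows.  It assumes 
the bare preference token "make-snapshot" from this message and echoes it 
back with the Preference-Applied header defined alongside Prefer.  The 
names wants_snapshot, snapshot_headers and the make_snapshot callback are 
hypothetical, and only the bare token (no parameters) is handled.

```python
def wants_snapshot(prefer_header):
    """True if the Prefer header value lists the make-snapshot token."""
    tokens = [
        item.split(";")[0].split("=")[0].strip().lower()
        for item in prefer_header.split(",")
    ]
    return "make-snapshot" in tokens


def snapshot_headers(request_headers, resource_url, make_snapshot):
    """Build extra response headers, creating a snapshot only on request.

    make_snapshot is a hypothetical callback that stores a copy of the
    resource and returns the URL of that copy.
    """
    headers = {}
    if wants_snapshot(request_headers.get("Prefer", "")):
        # The expensive work happens only when the client asked for it.
        snapshot_url = make_snapshot(resource_url)
        headers["Link"] = f'<{snapshot_url}>; rel="snapshot"'
        headers["Preference-Applied"] = "make-snapshot"
    return headers


def demo_make_snapshot(url):
    # Stand-in for real storage: just derive a made-up snapshot URL.
    return url + "?snapshot=1"


print(snapshot_headers({}, "http://example.org/c", demo_make_snapshot))
print(snapshot_headers({"Prefer": "make-snapshot"},
                       "http://example.org/c", demo_make_snapshot))
```

Requests without the preference get no Link header and cost the server 
nothing extra.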

Another weakness is it doesn't provide a way to get a consistent 
snapshot of multiple resources, such as a container and its contained 
resources.   The simplest protocol I can think of for that is to allow 
adding a timestamp to the Prefer: make-snapshot.     So the client would 
say

    Prefer: make-snapshot date="Mon, 07 Apr 2014 14:48:24 -0400"

But that requires the server to be able to reconstruct the previous 
states of resources in the recent past.     For my applications that 
might be a reasonable thing to require of the server, but it's still a 
lot.     Probably better to let a multi-resource snapshot wait until 
there's a clearer use case, where we can see how much server 
implementors would be willing to do.
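
To illustrate what "reconstruct the previous states" would ask of a 
server, here is a small sketch of a per-resource history that can answer 
"what did this look like at time T?".  The class VersionedResource and its 
methods are hypothetical, the history is kept in memory, and updates are 
assumed to be recorded in chronological order.  A real server would also 
need to bound how far back it keeps history.

```python
import bisect
from email.utils import parsedate_to_datetime


class VersionedResource:
    """Keeps every recorded state so that past states can be looked up."""

    def __init__(self):
        self._times = []   # timestamps, in ascending order
        self._states = []  # state recorded at the matching timestamp

    def record(self, when, state):
        # Assumes calls arrive in chronological order.
        self._times.append(when)
        self._states.append(state)

    def state_at(self, when):
        """Return the state in effect at `when`, or None if none existed."""
        index = bisect.bisect_right(self._times, when)
        return self._states[index - 1] if index else None


container = VersionedResource()
container.record(parsedate_to_datetime("Mon, 07 Apr 2014 14:00:00 -0400"),
                 ["member1"])
container.record(parsedate_to_datetime("Mon, 07 Apr 2014 14:50:00 -0400"),
                 ["member1", "member2"])

# The date from the example Prefer header above.
snapshot_time = parsedate_to_datetime("Mon, 07 Apr 2014 14:48:24 -0400")
print(container.state_at(snapshot_time))
```

Answering every resource in a container from the same timestamp is what 
would give the client a consistent multi-resource view.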

     -- Sandro

Received on Monday, 7 April 2014 18:55:05 UTC