Re: snapshots, was Re: Stable locking from ashok malhotra on 2014-04-07 (public-ldp-wg@w3.org from April 2014)

From: ashok malhotra <ashok.malhotra@oracle.com>
Date: Mon, 07 Apr 2014 18:16:18 -0400
To: public-ldp-wg@w3.org
Message-ID: <534323B2.5090807@oracle.com>

Sandro, see inline.

On 4/7/2014 4:25 PM, Sandro Hawke wrote:
> On April 7, 2014 3:45:19 PM EDT, ashok malhotra <ashok.malhotra@oracle.com> wrote:
>> Hi Sandro:
>> Yes, snapshotting can be useful for purposes other than paging.
>> Some databases offer a "flashback" capability that allows you to
>> go back to a previous version of a table to recover from errors, etc.
>>
>> But for paging I had a somewhat different design in mind.
>> Assume that we design a Prefer header that may have the values
>> "snapshot" or "rolling", where rolling gives you the current state of
>> the collection even as it changes underneath you.
>>
> Note that paging is about graphs, not containers.    Even if it's restricted to containers, it's still only about the graph that is the state of the container, and has nothing to do with whether items in the container change state.
Yes.  Snapshots of graphs.
>
>> When a request comes in with prefer=snapshot, the server makes a copy
>> of the collection and its members.  It then serves those pages.  When
>> the client moves away from the collection the snapshot is deleted.
>> Thus, the snapshot is created on demand only for a specific paging
>> request.
>>
> So you're suggesting we have snapshot paging and rolling paging, but no snapshots unless you're doing paging?  That seems likely to be annoying: sometimes you'd want a snapshot and be frustrated that you can't get one because the resource isn't big enough to page.  I don't have an immediate use case for snapshots outside of paging, though.
The design I am suggesting creates snapshots per client, per paging episode.
I don't disagree that general-purpose snapshots that serve multiple clients would be
useful but I think that's a much more complex situation to design for.
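
To make the per-client, per-episode idea concrete, here is a minimal in-memory sketch in Python. It is only an illustration under assumptions: the class `PagedCollection`, its methods `begin`, `page`, and `end`, and the `prefer` argument are hypothetical names invented for this example and are not part of any LDP draft or existing library.

```python
import copy
import itertools


class PagedCollection:
    """Toy in-memory collection supporting 'snapshot' and 'rolling' paging."""

    def __init__(self, members, page_size=2):
        self.members = list(members)   # live state, may change at any time
        self.page_size = page_size
        self._episodes = {}            # episode id -> frozen copy, or None for rolling
        self._ids = itertools.count(1)

    def begin(self, prefer="rolling"):
        """Start a paging episode for one client and return its id."""
        if prefer not in ("snapshot", "rolling"):
            raise ValueError("prefer must be 'snapshot' or 'rolling'")
        episode = next(self._ids)
        # Snapshot mode copies the collection once, on demand, for this episode only.
        frozen = copy.deepcopy(self.members) if prefer == "snapshot" else None
        self._episodes[episode] = frozen
        return episode

    def page(self, episode, number):
        """Return page `number` (0-based) as seen by this episode."""
        frozen = self._episodes[episode]
        source = self.members if frozen is None else frozen
        start = number * self.page_size
        return source[start:start + self.page_size]

    def end(self, episode):
        """Discard the episode and any snapshot made for it."""
        del self._episodes[episode]


coll = PagedCollection(["a", "b", "c", "d"])
snap = coll.begin(prefer="snapshot")
roll = coll.begin(prefer="rolling")
coll.members.insert(0, "new")      # the collection changes mid-paging
print(coll.page(snap, 0))          # ['a', 'b']
print(coll.page(roll, 0))          # ['new', 'a']
coll.end(snap)
```

The snapshot episode keeps serving the state it copied at `begin`, while the rolling episode sees the insertion. Because each episode owns its copy, nothing has to be coordinated between clients, which is the simplification this design is after.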
>
> Also, how can the server tell that the client has "moved away"?  That's not really a notion in HTTP.  I fear the best we can do is have timeouts.  I suppose if the snapshots were per-client, the client could delete them when done.  But I think we want the server to allow a snapshot to be used by multiple clients, so delete isn't right.
We need transactions :-)
Perhaps delete the snapshot when the client terminates his connection.
Once the snapshot has been created, storage is cheap!
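
Since HTTP gives the server no direct signal that a client is done, one way to combine the options mentioned here (explicit cleanup plus a timeout as a fallback) can be sketched as follows. This is a rough Python sketch under assumptions: `SnapshotStore`, its methods, and the `ttl` and `clock` parameters are hypothetical names chosen for illustration, not an agreed design.

```python
import time


class SnapshotStore:
    """Holds per-client snapshots and drops those idle longer than `ttl` seconds."""

    def __init__(self, ttl=300.0, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock             # injectable so expiry can be exercised in tests
        self._snapshots = {}           # client id -> (frozen data, last access time)

    def create(self, client_id, data):
        """Freeze a copy of `data` for this client."""
        self._snapshots[client_id] = (list(data), self.clock())

    def get(self, client_id):
        """Return the client's snapshot and refresh its idle timer, or None if gone."""
        self.expire()
        entry = self._snapshots.get(client_id)
        if entry is None:
            return None
        data, _ = entry
        self._snapshots[client_id] = (data, self.clock())
        return data

    def release(self, client_id):
        """Explicit cleanup, e.g. when the client says it is done or disconnects."""
        self._snapshots.pop(client_id, None)

    def expire(self):
        """Drop every snapshot that has been idle for more than `ttl` seconds."""
        now = self.clock()
        stale = [c for c, (_, seen) in self._snapshots.items() if now - seen > self.ttl]
        for client_id in stale:
            del self._snapshots[client_id]


now = [0.0]                        # fake clock so the example is deterministic
store = SnapshotStore(ttl=60.0, clock=lambda: now[0])
store.create("client-1", ["a", "b", "c"])
now[0] = 30.0
print(store.get("client-1"))       # ['a', 'b', 'c']
now[0] = 100.0                     # 70 seconds idle since the last access
print(store.get("client-1"))       # None
```

Each access resets the idle timer, so a client that keeps paging keeps its snapshot, and an abandoned snapshot is reclaimed on the next call after the timeout.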
>
>     - Sandro
>
>> Ashok
>>
>> On 4/7/2014 2:54 PM, Sandro Hawke wrote:
>>> On 04/07/2014 11:29 AM, Ashok Malhotra wrote:
>>>> Some of our developers have also been struggling with the paging
>>>> issue.
>>>> In the discussion one potential solution has emerged which may be
>>>> worth considering.
>>>> When a client accesses a collection and starts to page through it,
>>>> the server makes a copy of the collection (snapshot).  It then serves
>>>> that client from that snapshot.  The snapshot is deleted when the
>>>> client commits or aborts.
>>> Yes, I called this "static" or "snapshot" paging.
>>>
>>> But this is something one wants with or without paging.  It's a
>>> general thing that one wants a web server to do.  And it can be quite
>>> expensive to provide.  So I suggest(ed) we keep it orthogonal.
>>>
>>> That is, some servers provide the feature where they allow clients to
>>> snapshot a resource, and use that snapshot for some time.  This would
>>> be useful for paging, certainly.
>>>
>>> This is applicable outside of LDP.  http://www.w3.org/TR/ldp/ has a
>>> snapshot that's been http://www.w3.org/TR/2014/WD-ldp-20140311/ for
>>> some time.  For a while before that, the snapshot was
>>> http://www.w3.org/TR/2013/WD-ldp-20130730/ .
>>>
>>> One simple design for this would be a Link rel=snapshot header.  The
>>> server could include that link on any resource for which it's willing
>>> to provide a snapshot, and the link would point to that snapshot.
>>> The actual text of the link would probably include the timestamp or
>>> the etag.  Or a version number.
>>>
>>> This design has the advantage of being very simple, but it has some
>>> weaknesses.  For resources that will have many versions that are
>>> never snapshot'd, it's more expensive for the server than necessary,
>>> because the server needs to do some work on every request, not just
>>> ones where the client wants to snapshot.  That could be addressed
>>> with a Prefer: make-snapshot request from the client, I suppose.
>>>
>>> Another weakness is it doesn't provide a way to get a consistent
>>> snapshot of multiple resources, such as a container and its contained
>>> resources.  The simplest protocol I can think of for that is to allow
>>> adding a timestamp to the Prefer: make-snapshot.  So the client
>>> would say
>>>      Prefer: make-snapshot date="Mon, 07 Apr 2014 14:48:24 -0400"
>>>
>>> But that requires the server to be able to reconstruct the previous
>>> states of resources in the recent past.  For my applications that
>>> might be a reasonable thing to require of the server, but it's still
>>> a lot.  Probably better to let a multi-resource snapshot wait until
>>> there's a clearer use case, where we can see how much server
>>> implementors would be willing to do.
>>>
>>>      -- Sandro
>
>
Received on Monday, 7 April 2014 22:16:51 UTC