Re: snapshots, was Re: Stable locking

On 04/07/2014 06:16 PM, ashok malhotra wrote:
> Sandro, see inline
>
> On 4/7/2014 4:25 PM, Sandro Hawke wrote:
>> On April 7, 2014 3:45:19 PM EDT, ashok malhotra 
>> <ashok.malhotra@oracle.com> wrote:
>>> Hi Sandro:
>>> Yes, snapshotting can be useful for purposes other than paging.
>>> Some databases offer a "flashback" capability that allows you to
>>> go back to a previous version of a table to recover from errors, etc.
>>>
>>> But for paging I had a somewhat different design in mind.
>>> Assume that we design a prefer header that may have values "snapshot"
>>> or "rolling" where rolling gives you the current state of the
>>> collection
>>> even as it changes from underneath you.
>>>
>> Note that paging is about graphs, not containers.    Even if it's 
>> restricted to containers, it's still only about the graph that is the 
>> state of the container, and has nothing to do with whether items in 
>> the container change state.
> Yes.  Snapshots of graphs.
>>
>>> When a request comes in with prefer=snapshot, the server makes a copy
>>> of the collection and its members.  It then serves those pages. When
>>> the client moves away from the collection the snapshot is deleted.
>>> Thus, the snapshot is created on demand only for a specific paging
>>> request.
>>>
>> So you're suggesting we have snapshot paging and rolling paging, but 
>> no snapshots unless you're doing paging?   That seems likely to be 
>> annoying, like sometimes you'd want a snapshot, and be frustrated 
>> that you can't get one since the resource isn't big enough to page. I 
>> don't have an immediate use case for snapshots outside of paging, 
>> though.
> The design I am suggesting creates snapshots per client, per paging 
> episode.
> I don't disagree that general-purpose snapshots that serve multiple 
> clients would be
> useful but I think that's a much more complex situation to design for.

I don't really see how it's any more complicated.     The server can 
just include a Link: rel=snapshot header, either all the time, or when 
the client says Prefer: make-snapshot.
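
To make that concrete, here is a rough sketch of the server side in
Python.  It is only an illustration under assumptions: the "snapshot"
link relation and the "make-snapshot" preference are the proposals from
this thread, not registered values, and the function name, the
in-memory SNAPSHOTS dict, and the "?snapshot=" URL layout are all made
up for the example.

```python
import hashlib

# Hypothetical in-memory store of frozen copies, keyed by snapshot URL.
SNAPSHOTS = {}


def snapshot_headers(url, body, request_headers, always=False):
    """Return extra response headers for a GET of `url` whose state is `body`.

    A Link rel=snapshot header is produced either on every response
    (always=True) or only when the client sent Prefer: make-snapshot.
    """
    prefer = request_headers.get("Prefer", "")
    # Compare preference tokens only, ignoring any parameters after ';'.
    tokens = [p.split(";")[0].strip() for p in prefer.split(",")]
    if not (always or "make-snapshot" in tokens):
        return {}

    # Derive the snapshot URL from the content, so identical states
    # share one snapshot no matter which client asked for it.
    tag = hashlib.sha256(body).hexdigest()[:12]
    snap_url = f"{url}?snapshot={tag}"
    SNAPSHOTS.setdefault(snap_url, body)
    return {"Link": f'<{snap_url}>; rel="snapshot"'}
```

Because the snapshot URL depends only on the resource state, two
clients asking at the same moment get the same link, which is the
multi-client case; a per-client design would put something
client-specific in the URL instead.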

>>
>> Also, how can the server tell that the client has "moved away"?    
>> That's not really a notion in http....    I fear the best we can do 
>> is have timeouts.   I suppose if the snapshots were per-client, the 
>> client could delete them when done.   But I think we want the server 
>> to allow a snapshot to be used by multiple clients, so delete isn't 
>> right.
> We need transactions :-)

:-)    I think we're better without them, but that's probably a 
discussion for later.

> Perhaps delete the snapshot when the client terminates his connection.

I don't think webapps generally have any control over the connection.
Conceptually, every HTTP operation uses a new connection.  With
HTTP/1.1 I believe the default is keep-alive, so connections are
re-used, but I don't think one is supposed to rely on that, and I bet
there are circumstances where relying on keep-alive for paging would be
a big problem.  But maybe.... It's an interesting idea.

If you're willing to pay the price of having different snapshots per 
client, then we could say clients SHOULD DELETE the snapshot when done 
with it.  And of course the server can remove it when it decides it's 
been around too long.
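
A minimal sketch of that cleanup policy, again in Python and again only
under assumptions: the class name, its methods, and the max-age
parameter are hypothetical, and nothing here is specified by LDP.  The
client's DELETE maps to delete(), and the server drops anything older
than its own limit whether or not the client ever cleaned up.

```python
import time


class SnapshotStore:
    """Per-client snapshots with client DELETE plus a server-side timeout."""

    def __init__(self, max_age_seconds, clock=time.monotonic):
        self.max_age = max_age_seconds
        self.clock = clock          # injectable so expiry can be tested
        self._items = {}            # snapshot URL -> (created_at, body)

    def put(self, url, body):
        self._items[url] = (self.clock(), body)

    def get(self, url):
        """Return the snapshot body, or None if deleted or expired."""
        self.expire()
        entry = self._items.get(url)
        return None if entry is None else entry[1]

    def delete(self, url):
        """Handle a client DELETE; True if the snapshot existed."""
        return self._items.pop(url, None) is not None

    def expire(self):
        """Drop every snapshot older than max_age (the server's timeout)."""
        cutoff = self.clock() - self.max_age
        stale = [u for u, (created, _) in self._items.items() if created < cutoff]
        for url in stale:
            del self._items[url]
```

The timeout is what actually bounds memory use; the DELETE is just a
courtesy that lets a well-behaved client free the space sooner.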

>
> Once the snapshot has been created, storage is cheap!

Not in RAM....

         -- Sandro
>>
>>     - Sandro
>>
>>> Ashok
>>>
>>> 4/7/2014 2:54 PM, Sandro Hawke wrote:
>>>> On 04/07/2014 11:29 AM, Ashok Malhotra wrote:
>>>>> Some of our developers have also been struggling with the paging
>>> issue.
>>>>> In the discussion one potential solution has emerged which may be
>>> worth considering.
>>>>> When a client accesses a collection and starts to page thru it, the
>>> server makes a copy
>>>>> of the collection (snapshot).  It then serves that client from that
>>> snapshot.  The snapshot
>>>>> is deleted when the client commits or aborts.
>>>> Yes, I called this "static" or "snapshot" paging.
>>>>
>>>> But this is something one wants with or without paging. It's a
>>> general thing that one wants a web server to do.   And it can be quite
>>> expensive to provide.  So I suggest(ed) we keep it orthogonal.
>>>> That is, some servers provide the feature where they allow clients to
>>> snapshot a resource, and use that snapshot for some time. This would be
>>> useful for paging, certainly.
>>>> This is applicable outside of LDP. http://www.w3.org/TR/ldp/ has a
>>> snapshot that's been http://www.w3.org/TR/2014/WD-ldp-20140311/ for
>>> some time.  For a while before that, the snapshot was
>>> http://www.w3.org/TR/2013/WD-ldp-20130730/ .
>>>> One simple design for this would be a Link rel=snapshot header. The
>>> server could include that link on any resource for which it's willing
>>> to provide a snapshot, and the link would point to that snapshot.
>>> The actual text of the link would probably include the timestamp or the
>>> etag.   Or a version number.
>>>> This design has the advantage of being very simple, but it has some
>>> weaknesses.    For resources that will have many versions that are
>>> never snapshot'd, it's more expensive for the server than necessary,
>>> because the server needs to do some work on every request, not just ones
>>> where the client wants to snapshot.   That could be addressed with a
>>> Prefer: make-snapshot request from the client, I suppose.
>>>> Another weakness is it doesn't provide a way to get a consistent
>>> snapshot of multiple resources, such as a container and its contained
>>> resources.   The simplest protocol I can think of for that is to allow
>>> adding a timestamp to the Prefer: make-snapshot.     So the client
>>> would say
>>>>      Prefer: make-snapshot date="Mon, 07 Apr 2014 14:48:24 -0400"
>>>>
>>>> But that requires the server to be able to reconstruct the previous
>>> states of resources in the recent past.     For my applications that
>>> might be a reasonable thing to require of the server, but it's still a
>>> lot.     Probably better to let a multi-resource snapshot wait until
>>> there's a clearer use case, where we can see how much server
>>> implementors would be willing to do.
>>>>
>>>>      -- Sandro
>>
>>
>
>
>

Received on Monday, 7 April 2014 22:44:42 UTC