Re: lossless paging using HATEOAS from Sandro Hawke on 2014-02-19 (public-ldp-wg@w3.org from February 2014)

From: Sandro Hawke <sandro@w3.org>
Date: Wed, 19 Feb 2014 16:09:54 -0500
To: Steve Speicher <sspeiche@gmail.com>
CC: Linked Data Platform WG <public-ldp-wg@w3.org>
Message-ID: <53051DA2.5070500@w3.org>
On 02/19/2014 02:15 PM, Steve Speicher wrote:
> On Wed, Feb 19, 2014 at 12:12 PM, Sandro Hawke <sandro@w3.org> wrote:
>
>     Here's an implementation technique servers can use to do lossless
>     paging very cheaply (give or take whatever the underlying paging
>     technology is).
>
>     In the NEXT and PREV links, include the boundary value. That way
>     the server doesn't need to remember anything; the state is all in
>     the URL.  HATEOAS to the rescue.   (I'm sure I'm not the first to
>     think of this....)
>
>
> We've used this model for some time, encoding the boundaries into the 
> URLs of the pages themselves.  Usually based on some time-based 
> property of the resources, such as when they were created, though others 
> have used other properties from the data that the servers assign to 
> the resources.
>

So, I'm thinking we can say that IF servers provide paging, they MUST 
provide page URLs which, if traversed, will result in the complete 
contents being seen.   When triples are added to or deleted from a paged 
resource during paging, the server MAY update the pages, but it MUST 
provide URLs such that, even when the paged resource is changed during 
traversal, clients will see only triples that were present at some point 
during traversal, and never see any triples that were not present at some 
point 
during traversal.
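
To make that guarantee concrete, here is a minimal Python sketch of the
boundary-in-the-URL idea, using the names from the quoted message below.
It is only an illustration under assumptions: the function names, the
in-memory sorted list, the page size of 5, and the two "... Example"
members are all made up for the sketch, and plain string order stands in
for whatever sort the server really uses.

```python
import bisect

PAGE_SIZE = 5  # assumed page size, matching the quoted example

NAMES = sorted([
    "Arnaud Le Hors", "Ashok Malhotra", "Cody Burleson",
    "Eric Prud'hommeaux", "Henry Story", "John Arwe", "Miguel Aragón",
    "Nandana Mihindukulasooriya", "Roger Menday", "Sandro Hawke",
    "Steve Speicher",
])


def page_after(members, boundary=None, size=PAGE_SIZE):
    """Return (page, next_boundary) for the sorted list `members`.

    `boundary` is the last value the client has already seen (None for
    the first page). It is compared by value, so it still works if that
    member has since been deleted.
    """
    start = 0 if boundary is None else bisect.bisect_right(members, boundary)
    page = members[start:start + size]
    more = start + size < len(members)
    return page, (page[-1] if page and more else None)


def traverse(members, mutate=None):
    """Walk every page; optionally change the container after page one."""
    seen, boundary, first = [], None, True
    while True:
        page, boundary = page_after(members, boundary)
        seen.extend(page)
        if first and mutate is not None:
            mutate(members)  # simulate a change in the middle of traversal
        first = False
        if boundary is None:
            return seen


def change_during_paging(members):
    """Hypothetical concurrent edit: one delete and two inserts."""
    members.remove("Henry Story")            # the boundary value itself
    bisect.insort(members, "Kevin Example")  # lands after the boundary
    bisect.insort(members, "Aaron Example")  # lands before the boundary


seen = traverse(list(NAMES), change_during_paging)
```

In this run the client never sees a member twice, sees every member that
was present throughout, and sees nothing that was never in the container.
The member added behind the boundary is simply missed, which the wording
above allows.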

A server MAY terminate a traversal, no longer providing pages at some URLs 
it has given out.  If it does this, it MUST thereafter answer those URLs 
with 410 GONE.
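
One place this would come up is the boundary-code cache from the quoted
message below ("bv1", "bv2", ...). A rough Python sketch of how such a
cache might answer expired codes follows. The class name, the TTL, the
injectable clock, and the choice of 404 for codes that were never issued
are my assumptions for illustration, not anything the spec says.

```python
import itertools
import time

TTL_SECONDS = 300  # hypothetical lifetime of a boundary code


class BoundaryCache:
    """Maps short opaque codes ("bv1", "bv2", ...) to boundary values."""

    def __init__(self, ttl=TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._entries = {}  # code -> (boundary value, expiry time)
        self._counter = itertools.count(1)
        self._last = 0      # highest code number handed out so far

    def issue(self, boundary):
        """Store a boundary value and return the code to put in the URL."""
        self._last = next(self._counter)
        code = f"bv{self._last}"
        self._entries[code] = (boundary, self._clock() + self._ttl)
        return code

    def _was_issued(self, code):
        number = code[2:]
        return (code.startswith("bv") and number.isdecimal()
                and 1 <= int(number) <= self._last)

    def resolve(self, code):
        """Return (status, boundary) for a page_after_code value."""
        entry = self._entries.get(code)
        if entry is not None and self._clock() < entry[1]:
            return 200, entry[0]
        self._entries.pop(code, None)  # drop the expired entry, if any
        # 410 only for URLs we really gave out; 404 otherwise (assumption)
        return (410 if self._was_issued(code) else 404), None
```

Keeping only the highest issued number lets the server keep answering
410 for old codes without remembering each one.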

      -- Sandro



> -  Steve Speicher
>
>
>     As an example for a directory with these names (people at the last
>     meeting), in alphabetic order:
>
>             Arnaud Le Hors
>             Ashok Malhotra
>             Cody Burleson
>             Eric Prud'hommeaux
>             Henry Story
>             John Arwe
>             Miguel Aragón
>             Nandana Mihindukulasooriya
>             Roger Menday
>             Sandro Hawke
>             Steve Speicher
>
>     The container offers these headers:
>
>            Link: <.../my_container?page_first> rel=FIRST
>            Link: <.../my_container?page_last> rel=LAST
>
>     If you GET that rel=FIRST one, you'll get this:
>
>            Link: <.../my_container?page_after=Henry%20Story> rel=NEXT
>
>            member & container triples for Arnaud Le Hors
>            member & container triples for Ashok Malhotra
>            member & container triples for Cody Burleson
>            member & container triples for Eric Prud'hommeaux
>            member & container triples for Henry Story
>
>     If you GET that rel=NEXT one, you'll get this:
>
>            Link: <.../my_container?page_after=Sandro%20Hawke> rel=NEXT
>            Link: <.../my_container?page_before=John%20Arwe> rel=PREV
>
>            member & container triples for John Arwe
>            member & container triples for Miguel Aragón
>            member & container triples for Nandana Mihindukulasooriya
>            member & container triples for Roger Menday
>            member & container triples for Sandro Hawke
>
>     If you GET that rel=NEXT one, you'll get this:
>
>            Link: <.../my_container?page_before=Steve%20Speicher> rel=PREV
>
>            member & container triples for Steve Speicher
>
>     Meanwhile, if you decided to traverse backwards from the container's
>     rel=LAST one, you'd get this:
>
>            Link: <.../my_container?page_before=Miguel%20Arag%C3%B3n> rel=PREV
>
>            member & container triples for Miguel Aragón
>            member & container triples for Nandana Mihindukulasooriya
>            member & container triples for Roger Menday
>            member & container triples for Sandro Hawke
>            member & container triples for Steve Speicher
>
>     If you GET that rel=PREV one, you'd get:
>
>            Link: <.../my_container?page_after=John%20Arwe> rel=NEXT
>            Link: <.../my_container?page_before=Ashok%20Malhotra> rel=PREV
>
>            member & container triples for Ashok Malhotra
>            member & container triples for Cody Burleson
>            member & container triples for Eric Prud'hommeaux
>            member & container triples for Henry Story
>            member & container triples for John Arwe
>
>     If you GET that rel=PREV one, you'd get:
>
>            Link: <.../my_container?page_after=Arnaud%20Le%20Hors> rel=NEXT
>
>            member & container triples for Arnaud Le Hors
>
>     See?  No state on the server, and lossless paging, with fully
>     controlled page size.  Servers which are already saving paging
>     state for some reason can do it some other way.
>
>     A few details:
>
>     The value in the page_after and page_before fields would be the
>     value of the sort field, whatever it is.
>
>     If the composite of the sort fields is not guaranteed to be
>     unique, or there is no sort field, the server should add another
>     sort field, some kind of internal rowid, to make it unique.
>
>     If there are multiple sort fields, the server could mush them
>     together, or use multiple page_before/page_after parameters in the
>     URL.
>
>     If the data is particularly sensitive, the server might want to
>     encrypt the fields.
>
>     As a middle-ground solution, with a little state, the server
>     could hide the values and make the URLs a lot shorter by
>     maintaining a cache of boundary values, like
>        { "bv1": "Henry Story",
>          "bv2": "John Arwe",
>          ... }
>
>     then instead of
>            Link: <.../my_container?page_after=Henry%20Story> rel=NEXT
>     it would do
>            Link: <.../my_container?page_after_code=bv1> rel=NEXT
>
>     In this case, the client would need to be prepared for 410 GONE in
>     case the bv1 value had expired.
>
>        -- Sandro
>
>
>
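
For anyone who wants to try the walkthrough quoted above, here is a small
Python sketch that produces the same Link headers in both directions. It
is a sketch under assumptions: get_page and its parameters are names I
made up, the members are an in-memory sorted list with unique values, and
the ".../my_container" base is left elided exactly as in the message.

```python
import bisect
from urllib.parse import quote

BASE = ".../my_container"  # elided in the original message; kept as-is
PAGE_SIZE = 5              # page size used in the quoted example

NAMES = sorted([
    "Arnaud Le Hors", "Ashok Malhotra", "Cody Burleson",
    "Eric Prud'hommeaux", "Henry Story", "John Arwe", "Miguel Aragón",
    "Nandana Mihindukulasooriya", "Roger Menday", "Sandro Hawke",
    "Steve Speicher",
])


def _links(members, lo, hi):
    """Build the Link headers for the page members[lo:hi]."""
    links = []
    if lo == hi:
        return links  # empty page: nothing to use as a boundary
    if hi < len(members):
        links.append(f"<{BASE}?page_after={quote(members[hi - 1])}> rel=NEXT")
    if lo > 0:
        links.append(f"<{BASE}?page_before={quote(members[lo])}> rel=PREV")
    return links


def get_page(members, after=None, before=None, last=False):
    """Return (link_headers, page); `members` must be sorted and unique.

    With no arguments this is the rel=FIRST page; last=True is rel=LAST.
    """
    n = len(members)
    if after is not None:
        lo = bisect.bisect_right(members, after)
        hi = min(lo + PAGE_SIZE, n)
    elif before is not None or last:
        hi = n if before is None else bisect.bisect_left(members, before)
        lo = max(hi - PAGE_SIZE, 0)
    else:
        lo, hi = 0, min(PAGE_SIZE, n)
    return _links(members, lo, hi), members[lo:hi]


if __name__ == "__main__":
    for kwargs in ({}, {"after": "Henry Story"}, {"last": True},
                   {"before": "Miguel Aragón"}):
        headers, page = get_page(NAMES, **kwargs)
        print(kwargs, headers, page, sep="\n  ")
```

If the sort values are not unique, the boundary would need the extra
rowid-style field mentioned in the details above before this comparison
is safe.
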
Received on Wednesday, 19 February 2014 21:10:02 UTC