[LDP Paging] Comparison to other techniques of pagination from Austin William Wright on 2014-09-10 (public-ldp-comments@w3.org from September 2014)

From: Austin William Wright <aaa@bzfx.net>
Date: Wed, 10 Sep 2014 16:38:00 -0700
To: public-ldp-comments@w3.org
Cc: kjetil@kjernsmo.net
Message-ID: <CANkuk-V9Sh-B8_-X5_+4z_x0Nj3TgND-peL=5U54Tg8_q2xZZw@mail.gmail.com>

While reading through the Linked Data Paging TR at <
http://www.w3.org/TR/2014/WD-ldp-paging-20140909/>, a number of questions
about its intended purpose came to mind.

As I understand it, the problem is working with very large sets of data,
and in particular the issue of data changing in the middle of a multi-part
download when the data is split up into distinct resources.

The first thing that strikes me is that it /seems/ generic solutions for
paginating data have already been invented:

* next/prev and particularly the next-archive/prev-archive Link relations
* Using the Range: HTTP request header, perhaps with a new "pages" or
"items" unit
* Content-Location header to indicate that a different resource was returned
* Various technologies for signaling that content has updated, e.g.
append-only journals, RFC 5989
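
For comparison, here is a minimal sketch in Python of how a generic client
might walk a paginated collection using only the first technique above (Link
headers with rel="next"). The names parse_link_header, iter_pages, and the
fetch callable are hypothetical and not taken from any specification, and the
header parsing is deliberately simplified (it does not handle commas inside
quoted parameter values).

```python
import re


def parse_link_header(value):
    """Return {rel: uri} for a Link header value.

    Simplified sketch: not a complete Web Linking parser.
    """
    links = {}
    # Each link-value looks like: <uri>; param=value; param="value"
    for match in re.finditer(r'<([^>]*)>\s*((?:;\s*[^,;]+)*)', value):
        uri, params = match.group(1), match.group(2)
        rel = re.search(r'rel\s*=\s*"?([^";]+)"?', params)
        if rel:
            # A rel parameter may carry several space-separated relation types.
            for name in rel.group(1).split():
                links[name] = uri
    return links


def iter_pages(first_uri, fetch):
    """Yield each page body, following rel="next" until none is advertised.

    `fetch` is an assumed callable: uri -> (body, link_header_or_None).
    """
    uri, seen = first_uri, set()
    while uri and uri not in seen:  # guard against accidental link cycles
        seen.add(uri)
        body, link = fetch(uri)
        yield body
        uri = parse_link_header(link or "").get("next")


if __name__ == "__main__":
    # Stand-in for real HTTP requests, so the sketch runs without a network.
    fake_server = {
        "http://example.com/items?page=1":
            ("page 1", '<http://example.com/items?page=2>; rel="next"'),
        "http://example.com/items?page=2":
            ("page 2", '<http://example.com/items?page=1>; rel="prev"'),
    }
    for body in iter_pages("http://example.com/items?page=1",
                           fake_server.__getitem__):
        print(body)
```

Nothing in this loop knows what kind of resource is being paged, which is
the sense in which the existing link relations are generic.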

How does this solution compare to existing pagination solutions, and what
problems does this TR uniquely solve? What are some examples of when it
would be inappropriate to use LDP Paging, but appropriate to use some other
pagination technique instead?

I suppose this issue has come up before, so could the rationale/comparison
be added (informatively) to the introduction?



RESTful clients:

If clients have to be "paging aware", would that break their RESTful
nature? Specifically, do clients need to make assumptions about the data
that they're working with to support these features? I suppose not, if the
functionality is both generic and optional, but nonetheless, I find it hard
to see how a generic user agent like a Web browser would be able to make
use of these features, unless it has some external/pre-programmed notion of
what the resource it gets back is going to be.



Domain of problem being solved:

Very little in the TR seems specific to the LDP use case. Could it be made
more generic for use in other applications? Have we considered pulling in
any other audiences, working groups, or community groups for their feedback?

I work on a Web service that makes heavy use of pagination with prev/next
links, both consuming and producing. It seems like this TR could be useful,
if it were described in a more technology-neutral way (e.g. number of list
items, instead of specific kinds of RDF statements or number of
triples/statements).



Content Negotiation:

I currently use Content-Location with a 200 status code to indicate to
clients "This is not (necessarily) the URI that you requested, but here's a
negotiated variant". I use a very loose definition of negotiated variant to
mean that the request URI might be a non-information resource (like a
person). This thus sidesteps the issue of httpRange-14 because queries
about NIRs are never directly answered.

I do this for any dimension that content can vary over, like Content-Type,
locale, CSS/template theme, features (XForms vs. HTML forms, inline SVG,
...), and pagination. If a resource that can be paginated is requested,
default values for "offset" and "limit" are assumed, and I will return
e.g.:

Content-Location: http://example.com/?html;offset=0;limit=20
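
As a rough sketch of that behaviour in Python (the function name
negotiate_page and the DEFAULTS table are hypothetical, and the defaults of
offset=0 and limit=20 are simply the values from the example above), the
server fills in any missing paging parameters and reports the resulting
variant in Content-Location on a 200 response, with no redirect:

```python
# Assumed defaults, chosen to match the example header above.
DEFAULTS = {"offset": 0, "limit": 20}


def negotiate_page(base_uri, params=None, media="html"):
    """Return (status, headers) for a request to a pageable resource.

    Missing paging parameters fall back to DEFAULTS; the URI of the variant
    actually served is reported in Content-Location.
    """
    chosen = dict(DEFAULTS)
    chosen.update(params or {})
    # Semicolon-separated query, in the style of the example URI.
    query = ";".join([media] + [f"{key}={chosen[key]}"
                                for key in ("offset", "limit")])
    return 200, {"Content-Location": f"{base_uri}?{query}"}


if __name__ == "__main__":
    print(negotiate_page("http://example.com/"))
    print(negotiate_page("http://example.com/", {"offset": 40}))
```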

The discussion of the proposed 2NN (Contents of Related) status code seems
to assume that this is not the case, and that a 200 response with
Content-Location instead makes an Information Resource available at
multiple URIs. Is this true, or can we live without the 2NN status code and
use only Content-Location? Perhaps I should ask www-tag?

In any event, I would consider it necessary to create a solution that works
with the widest variety of user agents, without a 3xx redirect.



Thanks,

Austin Wright.

Received on Wednesday, 10 September 2014 23:38:28 UTC