RE: Pagination (ISSUE-42) from Andrew Hacking on 2015-02-13 (public-hydra@w3.org from February 2015)

From: Andrew Hacking <ahacking@gmail.com>
Date: Fri, 13 Feb 2015 11:51:32 +1000
To: Markus Lanthaler <markus.lanthaler@gmx.net>
Cc: public-hydra@w3.org
Message-ID: <CAMAVcL9C=1ODCLFpaLtEicdZ6z6NcUTntfH7hvntLtRUZxF7Jw@mail.gmail.com>

I am happy for 'page' to be dropped, however...

When exchanging partial collections I see the members as distinct resources
referenced from the "collection" resource.

When asking for an entire collection it is a distinct resource from
whatever range of resources it may include in its membership. It can use
whatever linking scheme is appropriate; serial next/prev/first/last pages
and/or templates allowing random access to "page" resources, or (my
preferred) offset/limit random access template.

So to me "collections" should really just be about metadata for the
membership and include links to the members or "pages" of the collection,
either explicitly via a serial paging scheme, or via a template with ranged
values for random access.

What I mean by the above is that collections should not have any related
resource content in them as their primary content.
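
A minimal sketch of that idea, under stated assumptions: the key names
("totalItems", "first", "template") and the /people URLs are made up for
illustration and are not taken from the Hydra vocabulary, and the expand
helper handles only a single trailing {?a,b} expression, a tiny subset of
URI templates.

```python
from urllib.parse import urlencode

# Hypothetical collection resource: metadata plus links only, no member content.
collection = {
    "@id": "/people",
    "totalItems": 12000,                     # assumed metadata field
    "first": "/people?offset=0&limit=50",    # serial entry point
    "template": "/people{?offset,limit}",    # assumed offset/limit template
}


def expand(template: str, **values) -> str:
    """Expand one trailing '{?a,b}' query expression with the given values."""
    base, _, expr = template.partition("{?")
    names = expr.rstrip("}").split(",") if expr else []
    query = urlencode([(n, values[n]) for n in names if n in values])
    return f"{base}?{query}" if query else base


# Jump straight to a range without walking next/prev links.
url = expand(collection["template"], offset=5000, limit=50)
print(url)
```

The client never needs the member content from the collection itself; it
only needs the template and the total to address any range directly.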

Orthogonal to the above, related resources can be included in a collection
payload via a yet to be defined "side loading" mechanism. Such a mechanism
could be used by any resource endpoint to return related resources for
efficiency/latency reasons and to avoid 1+N query explosion. For example, get
a collection of people and then get their related address and organisation
resources.
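
To make the people/address example concrete, here is a rough sketch. The
"members" and "included" keys, the URLs and the sample values are all
hypothetical, since the side loading mechanism is not yet defined; fake_fetch
stands in for an HTTP GET and only records the extra round trips.

```python
# Hypothetical payload: the collection links to its members, and an assumed
# "included" map side-loads representations of members and related resources.
payload = {
    "@id": "/people?offset=0&limit=2",
    "members": ["/people/1", "/people/2"],
    "included": {
        "/people/1": {"name": "Ann", "address": "/addresses/7"},
        "/people/2": {"name": "Bob", "address": "/addresses/9"},
        "/addresses/7": {"city": "Brisbane"},
    },
}


def resolve(iri, included, fetch):
    """Return a side-loaded representation if present, else dereference the link."""
    if iri in included:
        return included[iri]
    return fetch(iri)


requests_made = []


def fake_fetch(iri):
    # Stand-in for an HTTP GET; records each extra round trip.
    requests_made.append(iri)
    return {"city": "unknown"}


for member in payload["members"]:
    person = resolve(member, payload["included"], fake_fetch)
    address = resolve(person["address"], payload["included"], fake_fetch)

# Only the one address that was not side-loaded costs an extra request.
print(requests_made)
```

Without the "included" map the same loop would issue one request per person
plus one per address, which is the 1+N pattern the mechanism is meant to
avoid.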

A "page" or "range" is also a collection resource and just represents a
subset of the full collection. Just like the full collection to which it
belongs it would represent a set of members and only contain links (or link
templates) to related resources. For efficiency, it can also include
representations of its members through the "side loading" construct.
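
As a sketch of how a range could share the shape of the full collection: the
make_range helper, the "partOf" link and the other key names below are
assumptions for illustration only, not proposed vocabulary.

```python
def make_range(collection_iri, member_iris, offset, limit, representations=None):
    """Build a range resource: same shape as the full collection, covering a subset."""
    window = member_iris[offset:offset + limit]
    resource = {
        "@id": f"{collection_iri}?offset={offset}&limit={limit}",
        "partOf": collection_iri,           # assumed link back to the full collection
        "totalItems": len(member_iris),     # size of the full collection
        "members": window,                  # links only
    }
    if representations is not None:
        # Optional side loading of the members that fall inside this range.
        resource["included"] = {
            iri: representations[iri] for iri in window if iri in representations
        }
    return resource


all_members = [f"/people/{n}" for n in range(1, 101)]
page = make_range("/people", all_members, offset=20, limit=3)
print(page["members"])
```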

So with the above scheme it should be obvious that the ability to include
related resources in any resource needs to be a first class hydra concern.
From that primitive, collections, pages and ranges just fall out.

Providing a mechanism for including related resources and avoiding 1+N
queries is a real world problem that must be solved in all web APIs I have
worked on anyway.

Without a viable approach to side loading, developers will ignore such
things as hydra or work on other less considered but perhaps more practical
approaches just to get the job done. That's basically where I am at.

To take a developer's perspective, as much as I would like to, I can't say
that I can consider adopting hydra as it stands because collections and
side loading are not solved problems and I'm not confident based on what I
am seeing proposed thus far that it can be viable. There is also a huge rdf
universe that taints/comes along for the ride as cognitive baggage that is
quite frankly too big a bootstrap process for onboarding developers and
getting productivity.

Given the current state, the lowest risk approach is to do what is easiest
with the tools at hand and send proprietary json 'REST like' payloads, i.e.
do as almost everyone else is doing currently and ignore the semantic web
and high level interoperability. If the service becomes viable and
valuable, people will adapt their code to work with the proprietary api,
get locked in and further increase its value vs getting bogged down in
semantic purism for decades and getting constrained to ill fitting api
constructs and having to argue for basic use cases to be supported. I am
playing devil's advocate here of course, but seriously where is the payoff
in hydra/standards when you don't get the basics needed for a modern
web/mobile app?

Random access is required for hydra to be viable for my application APIs,
which are not special requirements, just basic requirements for modern
web/mobile apps. This is continuing to be overlooked in the proposals as if
we are back in the 1990s with server rendered pages and page navigation
links.

If hydra is going to expressly define semantics for "paged collections"
(which I don't think it needs to do as per my proposal above) then it needs
to also expressly address random access use cases. Any step into this
area needs to do justice to the real-world use cases, which to date have not
been addressed at all.

TL;DR

* support side loading of related resources in any resource as a first
class construct;
* "collections" and "pages" are just resources that link to member
resources;
* all resources including "collection", "page"/"range" may use the side
loading primitive to include related member resources for efficiency AND to
address 1+N query explosion;
* if paging is to become a first class concern in hydra then it must
address random access to those pages (i.e. using templates).

Regards,

Andrew
On 13 Feb 2015 08:42, "Markus Lanthaler" <markus.lanthaler@gmx.net> wrote:

> It seems, there's quite some pushback to separate paginated collections
> into collections and pages. My feeling is that even getting rid of the
> PagedCollection type seems to be the preferred approach at the moment. This
> would mean that there's always a sequence of collections but if there's
> just one, there's obviously no need to link to others.
>
> Let me try to gauge the preferences by making an alternative design
> proposal:
>   - remove the type PagedCollection
>   - rename itemsPerPage to numberItems
>   - drop the "Page" suffix from firstPage, nextPage, previousPage, lastPage
>
> I'm a bit worried about dropping the "Page" suffix, but wanted to bring it
> up nevertheless.
>
> With this new design, a paginated collection would look like this:
>
>   {
>     "@id": "http://api.example.com/an-issue/comments?whatever=3",
>     "@type": "Collection",
>     "first": "/an-issue/comments",
>     "previous": "/an-issue/comments?whatever=2",
>     "next": "/an-issue/comments?whatever=4",
>     "last": "/an-issue/comments?whatever=498",
>     "member": [ ...]
>   }
>
> What we lose by this is the ability to distinguish whether a single "page"
> or the complete collection was referenced. It would always be the complete
> collection. A client is thus expected to (potentially) retrieve the entire
> collection to find the information it was looking for. AFAICT, this
> shouldn't be a problem in practice as our collection design doesn't allow
> any inferences; all the relationships have to be defined explicitly.
>
> Ruben, I would be especially interested in hearing your opinion since you
> created ISSUE-42 [1] and mentioned that by implementing the LDF server it
> occurred to you that the current design is confusing. Do you see any
> practical issues with such a design?
>
>
> --
> Markus Lanthaler
> @markuslanthaler
>
>
>
>
Received on Friday, 13 February 2015 01:52:04 UTC