Re: Pagination (ISSUE-42) from Dietrich Schulten on 2015-02-14 (public-hydra@w3.org from February 2015)

From: Dietrich Schulten <ds@escalon.de>
Date: Sat, 14 Feb 2015 08:55:31 +0100
To: public-hydra@w3.org
Message-ID: <54DEFF73.2000504@escalon.de>
Hi Andrew,

On 13.02.2015 at 02:51, Andrew Hacking wrote:
> 
> What I mean by the above is that collections should not have any related
> resource content in them as their primary content.

See proposal below - would that cut it?

> 
> Orthogonal to the above, related resources can be included in a
> collection payload via a yet to be defined "side loading" mechanism.
> Such a mechanism could be used by any resource endpoint to return
> related resources due to efficiency/latency and 1+N query explosion
> considerations. Example, get a collection of people and then get their
> related address and organisation resources.
> 

To illustrate some problems with the current design, which uses
hydra:Collection as a container: currently, a person with addresses and
organization resources would look like this, which I fear looks plain
silly to a JSON user.

{
  "@context": {
    "knownBy": {
      "@reverse": "foaf:knows",
      "@type": "@id"
    },
    "houses": {
      "@reverse": "ex:address",
      "@type": "@id"
    },
    "hasAsMember": {
      "@reverse": "ex:organization",
      "@type": "@id"
    }
  },
  "@id": "/alice",
  "foaf:name": "Alice",
  "collection": [
    {
      "@id": "/alice/friends",
      "@type": "Collection",
      "manages": {
        "property": "foaf:knows",
        "subject": "/alice"
      },
      "hydra:member": [
        {
          "@id": "/bob",
          "knownBy": "/alice"
        },
        {
          "@id": "/zelda",
          "knownBy": "/alice"
        }
      ]
    },
    {
      "@id": "/alice/addresses",
      "@type": "Collection",
      "manages": {
        "property": "ex:address",
        "subject": "/alice"
      },
      "hydra:member": [
        {
          "@id": "/home-address",
          "houses": "/alice"
        }
      ]
    },
    {
      "@id": "/alice/organizations",
      "@type": "Collection",
      "manages": {
        "property": "ex:organization",
        "subject": "/alice"
      },
      "hydra:member": [
        {
          "@id": "/alice/organizations/amnesty",
          "hasAsMember": "/alice"
        }
      ]
    }
  ]
}

The attributes knows, address and organization are all buried inside
collection objects, and all members must point back to /alice, otherwise
the model does not assert that alice has values for knows, address and
organization at all.
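
To make the cost concrete, here is a rough Python sketch of what a plain
JSON client would have to do to recover Alice's values from the
container-style payload above. The function name values_from_collections
and the REVERSE_TERMS table are hypothetical; the table simply restates
the @reverse terms of the example context by hand instead of running a
JSON-LD processor, and the sample data is trimmed to two collections.

```python
# Hypothetical hand-written restatement of the @reverse terms in the context.
REVERSE_TERMS = {
    "knownBy": "foaf:knows",
    "houses": "ex:address",
    "hasAsMember": "ex:organization",
}

def values_from_collections(resource):
    """Collect property values of `resource` that are only asserted
    through back-pointing members of its embedded collections."""
    subject = resource["@id"]
    found = {}
    for coll in resource.get("collection", []):
        for member in coll.get("hydra:member", []):
            for term, prop in REVERSE_TERMS.items():
                # A member only counts if it points back to the subject.
                if member.get(term) == subject:
                    found.setdefault(prop, []).append(member["@id"])
    return found

alice = {
    "@id": "/alice",
    "collection": [
        {"@id": "/alice/friends",
         "hydra:member": [{"@id": "/bob", "knownBy": "/alice"},
                          {"@id": "/zelda", "knownBy": "/alice"}]},
        {"@id": "/alice/addresses",
         "hydra:member": [{"@id": "/home-address", "houses": "/alice"}]},
    ],
}
print(values_from_collections(alice))
```

A JSON user expects to read alice["foaf:knows"] directly; with the
container design that one lookup turns into the nested loop above.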


> 
> Without a viable approach to side loading, developers will ignore such
> things as hydra or work on other less considered but perhaps more
> practical approaches just to get the job done. That's basically where I
> am at.

+1 for a design that is attractive to the non-RDF REST community while
still making sense to an RDF reasoning tool.

> 
> TL;DR
> 
> * support side loading of related resources in any resource as a first
> class construct;
> * "collections" and "pages" are just resources that link to member
> resources;
> * all resources including "collection", "page"/" range" may use the side
> loading primitive to include related member resources for efficiency AND
> to address 1+N query explosion
> * if paging is to become a first class concern in hydra then it must
> address random access to those pages (ie using templates).

What do you have in mind when you say side-loading? Something which
allows the server (or the client?) to choose whether it wants to embed
the collection items or require another request to get them if the
response would be too large?

Borrowing from the discussion in thread [1] I think it could be achieved
by not using hydra:Collection as a container, but as a descriptor.

// server embeds a collection
{
  "@id": "/alice",
  "foaf:name": "Alice",
  "foaf:knows": [
    {"@id":"/bob", "foaf:name": "Robert Rumbaugh"},
    {"@id":"/zelda", "foaf:name": "Zelda Zackney"}
  ],
  "collection": [
    {
      "@id": "/alice/friends",
      "@type": "Collection",
      "manages": {
        "property": "foaf:knows",
        "subject": "/alice"
      },
      "search" : ... an iritemplate,
      "operation" : ... supportedOperations on /alice/friends
    }
  ]
}

Note that the hydra:Collection resource carries no members here. Rather,
the people Alice knows can be found directly at foaf:knows.
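
As a rough illustration of how a client might consume the descriptor
style, here is a small Python sketch. The helper managing_collection is
a hypothetical name, and the dict is a trimmed copy of the example above
without the search and operation entries. The values come from plain
attribute access; the descriptor is only consulted to find out which
collection manages a property.

```python
def managing_collection(resource, prop):
    """Return the @id of the collection descriptor that manages `prop`
    for this resource, or None if no descriptor mentions it."""
    for coll in resource.get("collection", []):
        manages = coll.get("manages", {})
        if (manages.get("property") == prop
                and manages.get("subject") == resource["@id"]):
            return coll["@id"]
    return None

alice = {
    "@id": "/alice",
    "foaf:name": "Alice",
    "foaf:knows": [{"@id": "/bob"}, {"@id": "/zelda"}],
    "collection": [
        {"@id": "/alice/friends", "@type": "Collection",
         "manages": {"property": "foaf:knows", "subject": "/alice"}},
    ],
}

# The values themselves are ordinary attributes of the resource.
friends = [f["@id"] for f in alice["foaf:knows"]]
print(friends, managing_collection(alice, "foaf:knows"))
```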


// server points to external resource with offset/limit
{
  "@id": "/alice",
  // plain link to friends:
  "foaf:knows": { "@id": "/alice/friends" },
  // saying things about the management of /alice/friends:
  "collection": [
    {
      "@id": "/alice/friends",
      "@type": "Collection",
      "manages": {
        "property": "foaf:knows",
        "subject": "/alice"
      },
      "partial": {
        "@type": "IriTemplate",
        "template": "/alice/friends{?offset,limit}",
        "mapping": [
          {
            "@type": "IriTemplateMapping",
            "variable": "offset",
            "property": "hydra:offset"
          },
          {
            "@type": "IriTemplateMapping",
            "variable": "limit",
            "property": "hydra:limit"
          }
        ]
      }
    }
  ]
}

Here I introduce a new property, hydra:partial.
The target resource returned from /alice/friends is a JSON-LD set of
foaf:Person nodes, not a hydra:Collection.

[
  {"@id":"/bob",
   "@type": "http://xmlns.com/foaf/0.1/Person",
   "http://xmlns.com/foaf/0.1/name": "Robert Rumbaugh"
  },
  {"@id":"/zelda",
   "@type": "http://xmlns.com/foaf/0.1/Person",
   "http://xmlns.com/foaf/0.1/name": "Zelda Zackney"
  }
]

Since we cannot embed meta information about the entire collection into
the above JSON response (see [4] for the reason), we use Link headers
with the IANA rels next and prev for paging (and for anything else we
want to link to, such as create-form [2]). Using Link headers is nothing
new in Hydra; we already use one for the ApiDocumentation. There is also
a draft for Link-Template headers [3]; with that we could even have the
offset/limit template as a header.
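
For what it is worth, picking the paging links out of such a response
needs very little client code. The following Python sketch is a
deliberately simplified parser (the name parse_link_header is
hypothetical, it only looks at the rel parameter, and it does not cover
every corner case of the Link header syntax). The header value is made
up for illustration; the offset/limit numbers are assumptions, not
something a server is known to send.

```python
import re

def parse_link_header(value):
    """Map each rel value to its target URI (simplified: other link
    parameters are ignored)."""
    links = {}
    # Links are comma-separated; only split where a new <target> starts.
    for part in re.split(r",\s*(?=<)", value):
        m = re.match(r"\s*<([^>]*)>(.*)", part)
        if not m:
            continue
        target, params = m.groups()
        rel = re.search(r';\s*rel\s*=\s*"?([^";]+)"?', params)
        if rel:
            # A rel parameter may carry several space-separated relations.
            for name in rel.group(1).split():
                links[name] = target
    return links

# Made-up header value for a page of /alice/friends.
header = ('</alice/friends?offset=20&limit=10>; rel="next", '
          '</alice/friends?offset=0&limit=10>; rel="prev"')
links = parse_link_header(header)
print(links["next"])
```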

Inside the collection *items* we could use hyperlinks as usual. The only
restriction is that link relations which apply to the entire collection
must be either link headers of the collection response or they must be
described at the origin of the link (here: on the Alice resource).

The URI which identifies the entire collection is /alice/friends. The
URI which identifies a partial collection is
/alice/friends{?offset,limit}. Clients SHOULD use partial if it is
available.
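
To show how the two URIs relate, here is a minimal Python sketch that
expands the partial template. It handles only the {?a,b} form-style
query expression used above, not the other URI Template operators, and
the function name expand_query_template is hypothetical. Variables
without a value are left out, so expanding with no values yields the URI
of the entire collection.

```python
import re
from urllib.parse import quote

def expand_query_template(template, values):
    """Expand `{?a,b}` expressions; undefined variables are skipped."""
    def replace(match):
        names = match.group(1).split(",")
        pairs = [f"{n}={quote(str(values[n]), safe='')}"
                 for n in names if values.get(n) is not None]
        return "?" + "&".join(pairs) if pairs else ""
    return re.sub(r"\{\?([^}]+)\}", replace, template)

template = "/alice/friends{?offset,limit}"
print(expand_query_template(template, {"offset": 20, "limit": 10}))
print(expand_query_template(template, {}))
```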

Comments? Flames?

Best regards,
Dietrich

[1] http://lists.w3.org/Archives/Public/public-hydra/2015Feb/0052.html
[2] https://tools.ietf.org/html/rfc6861#section-3.1
[3] http://tools.ietf.org/html/draft-nottingham-link-template-01
[4] https://www.w3.org/community/hydra/wiki/Avoid_that_collections_%22break%22_relationships#Problem_description
> 
> Regards,
> 
> Andrew
> 
> On 13 Feb 2015 08:42, "Markus Lanthaler" <markus.lanthaler@gmx.net
> <mailto:markus.lanthaler@gmx.net>> wrote:
> 
>     It seems, there's quite some pushback to separate paginated
>     collections into collections and pages. My feeling is that even
>     getting rid of the PagedCollection type seems to be the preferred
>     approach at the moment. This would mean that there's always a
>     sequence of collections but if there's just one, there's obviously
>     no need to link to others.
> 
>     Let me try to gauge the preferences by making an alternative design
>     proposal:
>       - remove the type PagedCollection
>       - rename itemsPerPage to numberItems
>       - drop the "Page" suffix from firstPage, nextPage, previousPage,
>     lastPage
> 
>     I'm a bit worried about dropping the "Page" suffix, but wanted to
>     bring it up nevertheless.
> 
>     With this new design, a paginated collection would look like this:
> 
>       {
>         "@id": "http://api.example.com/an-issue/comments?whatever=3",
>         "@type": "Collection",
>         "first": "/an-issue/comments",
>         "previous": "/an-issue/comments?whatever=2",
>         "next": "/an-issue/comments?whatever=4",
>         "last": "/an-issue/comments?whatever=498",
>         "member": [ ...]
>       }
> 
>     What we lose by this is the ability to distinguish whether a single
>     "page" or the complete collection was referenced. It would always be
>     the complete collection. A client is thus expected to (potentially)
>     retrieve the entire collection to find the information it was
>     looking for. AFAICT, this shouldn't be a problem in practice as our
>     collection design doesn't allow any inferences; all the
>     relationships have to be defined explicitly.
> 
>     Ruben, I would be especially interested in hearing your opinion
>     since you created ISSUE-42 [1] and mentioned that by implementing
>     the LDF server it occurred to you that the current design is
>     confusing. Do you see any practical issues with such a design?
> 
> 
>     --
>     Markus Lanthaler
>     @markuslanthaler
> 
> 
>
Received on Saturday, 14 February 2015 07:56:07 UTC