RE: relate a Collection and a PagedCollection

On Wednesday, April 23, 2014 6:19 PM, Ruben Verborgh wrote:
> >> Note how "?page=3" is called a PagedCollection (but is a single page)
> >> or how "?page=4" is the "next page" of the PagedCollection (only pages
> >> have a next page), or how "?page=1" is the "firstPage" of "?page=3".
> >
> > Yeah, I know. A PagedCollection is more like an "abstract" concept.
> 
> But this contradicts what you said earlier:

If so, it's because I have trouble explaining it properly.


> >>> The reason why it is called PagedCollection and not CollectionPage is
> >>> that I see a PagedCollection as the sum of all the pages it consists of.
> 
> So is it:
> a) the sum of all pages
> b) a single page
> c) both
> 
> At the moment, it behaves like c), which is quite unfortunate.

Let me answer with a question: What is an rdf:List?
  a) the sum of all list nodes (the things with rdf:first/rdf:rest
     properties)
  b) a single list node
  c) both


> > The "problem" with renaming it to CollectionPage is that then we would
> > indeed have to relate each page to a Collection to be able to express
> > things like totalItems, firstPage, and lastPage.
> 
> That's where blank nodes come in handy:
>     :p5 hydra:pageOf [ hydra:firstPage :p1 ].
> meaning
>     page 5 is a page of something, here is its first page

I don't see how this improves things... quite the contrary, actually. I do
agree though that from a theoretical point of view this is more precise. On
the Web, however, theoretical purity rarely wins. Pragmatism and
simplicity typically do.

Also, if you use blank nodes for this, what's the advantage? You won't even
know anymore whether different pages belong to the same collection.


> Seems to make more sense than
>     :x hydra:firstPage :p1
> which does *not* mean
>     "here is the first page of x"
> but
>     "here is the first page of the thing that :x also is a page of"

I don't really see a problem with the latter.


> > Especially the separation of
> > firstPage/lastPage from nextPage/previousPage is something I do not like.
> 
> But it *is* like that in the real world.
> A book has a first page and a last page.
> A book doesn't have a next or a previous page.
> Page 5 of the book has a next page and a previous page.
> Page 5 of the book doesn't have a first page.

Surely correct, but if you look at how current systems are being built, the
distinction rarely matters. From search engine result pages to photo
galleries you almost always have next/previous page pointers and direct
links to the first and sometimes also last page.


> So you have the choice between modeling it as the real world:
> - the domain of nextPage and previousPage is Page
> - the domain of firstPage and lastPage is Collection
> where the definition of those relations is simple:
> - nextPage and previousPage get the relating pages of the current page

Those relating pages are still pages of the collection, it just happens that
they are adjacent to the page you are currently looking at. So the next page
is "the page that follows the current page in the list of ordered pages that
a PagedCollection consists of" and the first page is "the first page in the
list of ordered pages that the PagedCollection consists of".
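
To make those two definitions concrete, here is a rough Python sketch. The
page identifiers and function names are made up for illustration and are not
part of Hydra; the plain list simply stands in for "the list of ordered pages
that a PagedCollection consists of".

```python
# Hypothetical page identifiers, in collection order.
pages = ["?page=1", "?page=2", "?page=3", "?page=4"]

def first_page(pages):
    """The first page in the ordered list of pages."""
    return pages[0]

def last_page(pages):
    """The last page in the ordered list of pages."""
    return pages[-1]

def next_page(pages, current):
    """The page that follows `current`, or None on the last page."""
    i = pages.index(current)
    return pages[i + 1] if i + 1 < len(pages) else None

def previous_page(pages, current):
    """The page that precedes `current`, or None on the first page."""
    i = pages.index(current)
    return pages[i - 1] if i > 0 else None
```

All four functions are defined relative to the same ordered list; next and
previous just take the current page as an additional input.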


> - firstPage and lastPage get the relating pages of the current collection
> 
> Or modeling it in an abstract way
> - the domain of nextPage and previousPage is PagedCollection
> - the domain of firstPage and lastPage is PagedCollection
> where the definition of those relations is complex:
> - nextPage and previousPage get the relating pages of the current "page"
> - firstPage and lastPage get the relating pages of the current "collection"
> 
> And this last part emphasizes indeed
> that the current PagedCollection is a hybrid between a page and a collection.
> So truly option c) above. Which is confusing and unclear modeling IMHO.

I would like to hear other people's opinions on this. Do you think we need to
separate a PagedCollection into two things, namely a Collection (with links
to first/last page) and individual pages (with next/prev links) or is it
clear enough if we mingle all of those links into a PagedCollection as we
currently do?
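
To make the two options easier to compare, here is a rough Python sketch. The
class and attribute names are invented for illustration only (they merely
mirror firstPage, lastPage, nextPage, previousPage and totalItems), and
page_of is an assumed back-link, not an existing Hydra property.

```python
from dataclasses import dataclass
from typing import List, Optional

# Option 1: two separate concepts (class names are illustrative only).
@dataclass(eq=False)
class Collection:
    total_items: int
    first_page: Optional["Page"] = None
    last_page: Optional["Page"] = None

@dataclass(eq=False)
class Page:
    members: List[str]
    page_of: Collection  # assumed back-link from a page to its collection
    next_page: Optional["Page"] = None
    previous_page: Optional["Page"] = None

# Option 2: the hybrid, with all links on every page.
@dataclass(eq=False)
class PagedCollection:
    members: List[str]
    total_items: int
    first_page: Optional["PagedCollection"] = None
    last_page: Optional["PagedCollection"] = None
    next_page: Optional["PagedCollection"] = None
    previous_page: Optional["PagedCollection"] = None

# Option 1 data: two pages of a four-item collection.
books = Collection(total_items=4)
a1 = Page(["a", "b"], page_of=books)
a2 = Page(["c", "d"], page_of=books, previous_page=a1)
a1.next_page = a2
books.first_page, books.last_page = a1, a2

# Option 2 data: the same two pages, collection metadata repeated on each.
b1 = PagedCollection(["a", "b"], total_items=4)
b2 = PagedCollection(["c", "d"], total_items=4, previous_page=b1)
b1.next_page = b2
for page in (b1, b2):
    page.first_page, page.last_page = b1, b2

# Finding the first page when starting from page 2:
first_via_collection = a2.page_of.first_page  # one extra hop in option 1
first_via_hybrid = b2.first_page              # direct in option 2
```

Both variants carry the same information; option 1 needs one extra hop and a
separate identifier for the collection, option 2 repeats the collection-level
data on every page.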


> >>> Well, this certainly looks tidy but do you really need an "entry point"
> >>> which just links to the first and last page but doesn't give you any
> >>> members?
> >>
> >> No, not an entry point, but it should be possible to refer to the
> >> collection as a whole.
> >> So think identifier (URI), not URL per se.
> >
> > This is closely related to rdf:List. Also there you just refer to the
> > first node (of type rdf:List) which then points to the next node (which,
> > again, is of type rdf:List).
> 
> It's not related. With rdf:List, I can point to the list,
> and I can point to the individual.

Which individual? A list looks as follows:

  node1 a rdf:List
             rdf:first "Value of node 1"
             rdf:rest node2
  node2 a rdf:List
             rdf:first "Value of node 2"
             rdf:rest node3
  ...
  nodeN a rdf:List
             rdf:first "Value of node N"
             rdf:rest rdf:nil

So you point to the list by referencing node1... but you could also point to
node2 and the result would still be valid (but you wouldn't find node1
anymore). PagedCollections are like doubly linked lists which might also
include links to the head and the tail:

  node1 a hydra:PagedCollection
             hydra:member "Members of page 1"
             hydra:firstPage node1
             hydra:nextPage node2
             hydra:lastPage nodeN
  node2 a hydra:PagedCollection
             hydra:member "Members of page 2"
             hydra:firstPage node1
             hydra:previousPage node1
             hydra:nextPage node3
             hydra:lastPage nodeN
  ...
  nodeN a hydra:PagedCollection
             hydra:member "Members of page N"
             hydra:firstPage node1
             hydra:previousPage nodeN-1
             hydra:lastPage nodeN

So in my opinion they are strongly related.
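
As a rough illustration of that comparison, here is a small Python sketch. It
uses three nodes instead of N, plain dictionaries instead of RDF, and a
made-up helper called follow; none of this is a real RDF or Hydra API.

```python
# Singly linked, like the rdf:List above (three nodes instead of N).
rdf_list = {
    "node1": {"first": "Value of node 1", "rest": "node2"},
    "node2": {"first": "Value of node 2", "rest": "node3"},
    "node3": {"first": "Value of node 3", "rest": "nil"},
}

# Doubly linked, like the PagedCollection above (head/tail links omitted).
pages = {
    "node1": {"member": "Members of page 1", "nextPage": "node2"},
    "node2": {"member": "Members of page 2",
              "previousPage": "node1", "nextPage": "node3"},
    "node3": {"member": "Members of page 3", "previousPage": "node2"},
}

def follow(nodes, start, link, end=None):
    """Return the node names reached from `start` by repeatedly following `link`."""
    seen = []
    node = start
    while node is not None and node != end:
        seen.append(node)
        node = nodes[node].get(link)
    return seen

# Starting at node2 of the list, node1 can no longer be found.
forward_in_list = follow(rdf_list, "node2", "rest", end="nil")

# Starting at node2 of the pages, previousPage leads back to node1.
back_in_pages = follow(pages, "node2", "previousPage")
```

The traversal is identical in both cases; the pages simply carry an
additional backwards link.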

 
> With the hybrid PagedCollection that acts both as a page and a collection,
> I can point to neither.

Just as with lists, you can simply point to the first page. The difference
is that, as long as you provide either first+next or last+previous page
links on each page, you can always reconstruct the complete PagedCollection
no matter to which page you link.
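
A minimal Python sketch of that reconstruction, assuming three pages held in
a plain dictionary. The function names are hypothetical, and a real client
would of course dereference URLs instead of looking up dictionary keys.

```python
# Three pages, each carrying the links from the listing above.
pages = {
    "node1": {"firstPage": "node1", "nextPage": "node2", "lastPage": "node3"},
    "node2": {"firstPage": "node1", "previousPage": "node1",
              "nextPage": "node3", "lastPage": "node3"},
    "node3": {"firstPage": "node1", "previousPage": "node2",
              "lastPage": "node3"},
}

def all_pages_forward(pages, start):
    """Jump to firstPage, then follow nextPage until there is none."""
    node = pages[start]["firstPage"]
    result = []
    while node is not None:
        result.append(node)
        node = pages[node].get("nextPage")
    return result

def all_pages_backward(pages, start):
    """Jump to lastPage, then follow previousPage until there is none."""
    node = pages[start]["lastPage"]
    result = []
    while node is not None:
        result.append(node)
        node = pages[node].get("previousPage")
    result.reverse()  # restore collection order
    return result
```

Either function yields the same ordered list of pages regardless of the page
it is started from, which is the point being made above.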


> > So I would say if you want to refer to the collection,
> > just point to its first page
> 
> They are distinct concepts: the book is not the page.
> 
> "The item is on page 1"
> is different from
> "The item is in the book".
> 
> > (actually it doesn't really matter to which
> > page you link as long as the client is able to find all pages).
> 
> I think you're thinking too much about one practical application now.
> If the goal is just to find all pages of the collection, well yes, you
> can do that.
> (And you could even do it with much simpler constructs.)
> 
> But the question is:
> can Hydra correctly describe collections and their pages?
> At the moment, it can't.

Can't it describe these things or is it that it doesn't describe these
things in the way you want them to be described?


> > Since we are consistent with how both rdf:List and Paged Atom Feeds work
> > (RFC5005) I'm leaning towards keeping PagedCollections as they are.
> > Do you strongly disagree or could you live with that?
> 
> Strongly disagree.
> 
> First of all, we are not consistent with rdf:List,
> because rdf:List distinguishes between the list and its members.
> (Furthermore, rdf:List is structured as in Haskell with head/tail,
>  so that's different altogether.)

Could you elaborate? I don't see much difference in the snippet I provided
above.


> Second: messy modeling.
> Concrete concepts would be preferred over an abstract hybrid concept.

I certainly agree with this point. But it could also be solved differently.
The thing that would really change is the metadata about the "complete
collection". So an alternative would be to rename

  firstPage to firstPageOfTheCollectionThisPageBelongsTo
  lastPage to lastPageOfTheCollectionThisPageBelongsTo
  totalItems to totalItemsOfTheCollectionThisPageBelongsTo

The result would be the same. Wouldn't it?


> RFC5005 was not too picky about modeling BTW:
> > "first" - A URI that refers to the furthest preceding document in
> >       a series of documents.
> A series of documents. Which series?
> The series the current document belongs to?
> 
> So it actually expresses a relationship between a series
> and its first document? But you express it on the current document?
> Messy.

:-)



--
Markus Lanthaler
@markuslanthaler

Received on Thursday, 24 April 2014 13:46:31 UTC