Re: Why Range doesn't work for LDP "paging" (cf 2NN Contents-of-Related)

Hello Sandro, others,

On 2014/09/16 10:13, Sandro Hawke wrote:
> Earlier today the LDP Working Group discussed the matter of whether we
> could use range headers instead of separate page URIs.  Use of Range
> headers was suggested on this list recently.
>
> Our conclusion was still "no", for the following reasons.  Please let us
> know if you see a good solution to any/all of them:
>
> 1.  We don't know how the server would initiate use of Range.   With our
> current separate-page design, the server can do a 303 redirect to the
> first page if it determines the representation of the entire resource is
> too big.   The question here is what to do when the client didn't
> anticipate this possibility.  True, the 303 isn't a great solution
> either, since unprepared clients might not handle it well either.
> Perhaps one should give a 4xx or 5xx when the client asks for a giant
> resource without a range header...?   But there's no "representation too
> big" code defined.

Can't you still use a 303 if there's no indication that the client 
understands tuple ranges?
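
Just to make the fallback I have in mind concrete, here is a rough
Python sketch. The "triples=" range unit, the PAGE_LIMIT threshold and
the header handling are all made up for illustration; nothing here is
taken from the LDP spec or from a registered HTTP range unit:

    # Hypothetical server-side decision for a GET on a large LDP container.
    PAGE_LIMIT = 10_000  # made-up size above which the server won't send
                         # the whole representation in one response

    def respond(headers, total_triples, first_page_uri):
        """Return (status, extra_headers) for the request."""
        range_header = headers.get("Range", "")
        if total_triples <= PAGE_LIMIT:
            return 200, {}                        # small enough: send it all
        if range_header.startswith("triples="):
            # The client signalled it understands tuple ranges: honour it.
            return 206, {"Content-Range": range_header}
        # No such indication: fall back to the existing 303-to-first-page design.
        return 303, {"Location": first_page_uri}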

> 2.  We don't know how we could do safe changes.  With our current
> design, it's possible for the resource to change while paging is
> happening, and the client ends up with a representation whose inaccuracy
> is bounded by the extent of the change.  The data is thus still usually
> perfectly usable.  (If such a change is not acceptable, the client can
> of course detect the change using etags and restart.)   This bounded
> inaccuracy is a simple and practical concept with RDF (in a way it isn't
> with arbitrary byte strings). Just using Range, a deletion would often
> result in data unrelated to the change being dropped from what the
> client sees.

Why isn't this a problem in your solution, too? In order for it to work, 
doesn't the server essentially have to remember exactly how far each 
client has read? If there are several clients, one that started before 
the first change, one after the first change but before the second, and 
so on, how is the server going to keep track of how far each of them got?


> I suppose perhaps one could use some kind of tombstones
> to avoid this problem, not closing up the gaps left by deletions.  Basically, a
> client might ask for triples 0-9 and only get 3 triples because the
> others were deleted?  Does that make sense with Range?   Is it okay to
> not have the elements be contiguous?

It definitely wouldn't make sense for byte ranges, but I think it should 
be okay if you define tuple ranges to work that way.
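
For illustration, here is a small Python sketch of how such tombstoned
tuple ranges might behave; the data model (a plain list of slots) is of
course only a stand-in for a real store:

    # Deletions leave a hole instead of shifting later triples down, so a
    # range like "triples 0-9" keeps meaning the same slots after changes.
    class TombstonedContainer:
        def __init__(self, triples):
            self._slots = list(triples)      # each slot holds a triple or None

        def delete(self, index):
            self._slots[index] = None        # tombstone: the slot stays, empty

        def range(self, start, end):
            """Return the surviving triples in slots start..end (inclusive)."""
            return [t for t in self._slots[start:end + 1] if t is not None]

    c = TombstonedContainer([("s", "p", i) for i in range(20)])
    c.delete(3)
    c.delete(7)
    print(c.range(0, 9))   # only 8 triples come back, but slots 10-19 are untouched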


> 3.  Many of our usual RDF data systems don't support retrieval of ranges
> by integer sequence numbers.   While some database systems have an
> internal integer row number in every table that could be used for Range,
> many others do not, and we don't know of a straightforward and
> appropriate way to add it.

So how are you going to implement the paged views? I'd be surprised by a 
system that can assign each tuple to a page but can't assign it a 
sequence number.
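
To illustrate what I mean: if the store can already say "this tuple is 
on page N", then the page size and page index give you an offset range 
for free. A trivial Python sketch, with made-up names:

    PAGE_SIZE = 1_000

    def page_to_range(page_index):
        """Offsets covered by a given page."""
        start = page_index * PAGE_SIZE
        return start, start + PAGE_SIZE - 1

    def range_to_pages(start, end):
        """Pages a given tuple range touches."""
        return list(range(start // PAGE_SIZE, end // PAGE_SIZE + 1))

    print(page_to_range(3))            # (3000, 3999)
    print(range_to_pages(2500, 4200))  # [2, 3, 4]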


> 4.  Finally, there was some question as to whether the Web
> infrastructure has any useful support for non-byte ranges.   This is
> perhaps not an objection, but it came up during the discussion, and we'd
> be interested in any data people have on this.

By infrastructure, do you mean caches? I don't think there is much 
support yet, but I'm not an expert.


> Bottom line is we still think just using rel=first/last/next/prev, among
> distinct resources, is a pretty reasonable design.   And if we're doing
> that, it'd be nice to have 2nn Contents-of-Related.

Maybe this question has come up before: If you have 1M tuples and decide 
that you have to serve them in pages of 1K, how much efficiency do you 
actually gain by short-circuiting the first download, i.e. what is one 
saved roundtrip worth out of a total of 1000?

With a range-based design, various ranges can be downloaded in parallel, 
the client can adjust the ranges it asks for based on observed 
throughput, and so on; with your rel=first/last/next/prev design, you 
seem to be much more constrained.
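
Here is a rough Python sketch of the difference in shape; fetch_range() 
and fetch_page() are hypothetical helpers standing in for the actual 
HTTP requests, the point is only the concurrency, not the HTTP plumbing:

    from concurrent.futures import ThreadPoolExecutor

    PAGE_SIZE = 1_000

    def get_all_by_ranges(fetch_range, total):
        """Range-based: every chunk can be requested in parallel."""
        ranges = [(i, min(i + PAGE_SIZE - 1, total - 1))
                  for i in range(0, total, PAGE_SIZE)]
        with ThreadPoolExecutor(max_workers=8) as pool:
            chunks = pool.map(lambda r: fetch_range(*r), ranges)
        return [t for chunk in chunks for t in chunk]

    def get_all_by_next_links(fetch_page, first_page_uri):
        """rel=next walk: each request depends on the previous response."""
        triples, uri = [], first_page_uri
        while uri is not None:
            page_triples, uri = fetch_page(uri)   # returns (triples, next_uri)
            triples.extend(page_triples)
        return triples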


Regards,     Martin.
