Re: Fwd: Re: Can we use TCP backpressure instead of paging? from Stian Soiland-Reyes on 2014-06-13 (public-ldp@w3.org from June 2014)

From: Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>
Date: Fri, 13 Jun 2014 16:15:02 +0100
To: Sandro Hawke <sandro@w3.org>
Cc: Andreas Kuckartz <a.kuckartz@ping.de>, LDP <public-ldp@w3.org>
Message-ID: <CAPRnXt=_xoT35_Sqxsir79cDf1pzGoazqhVQbsfZmoFoVg03EA@mail.gmail.com>

On 13 June 2014 15:07, Sandro Hawke <sandro@w3.org> wrote:
> Yes, I suppose that's the canonical answer. I wonder a little how big a
> scrollable list humans want compared to the available bandwidth. I mean,
> the typical web page is about 1MB I gather, and that much human-readable
> text would take many hours to read. So do you really need to offer more
> text than that, dynamically available?

Well, the big cost is not the main textual content, but the
additional data structures. Take, for instance, the JSON from

https://api.github.com/users/taverna/repos

As you can see, it responds very fast, but it lists only the first 20
repositories (so it is easy to present as 20 lines of text to the
user) while including lots of metadata for each one (so that sub-calls
are avoided). The server does not know which of these fields the client
is interested in, but it still has to trundle through databases etc. to
build that structure. That is the real cost, not the few kilobytes
transferred.

> I think you missed my point there. Isolation between readers and writers
> is an issue even without paging. With a single page resource, we still
> have exactly the same question. Servers will need to be serializing graphs
> to answer GET requests while PATCH, PUT, and POST requests are changing the
> same graphs. Does the server need to isolate the two from each other?
> Probably, but that can be quite expensive, and I don't know of a spec that
> even says servers SHOULD do that.

No, they might or might not do that. So let's say I am fetching that
list of repositories above (as an LDP container, if you like), and
midway through, two commits are pushed to two different repositories.
Do I expect both of their updated_at fields to change?
Well, if you make each page "small enough", it practically stops being
an issue.

One way to achieve this in some applications is to use small pages and
SQL transaction isolation levels; that way you can present a
consistent view of the whole resource (a kind of snapshot). However,
the server would want to close transactions sooner rather than later
to avoid rollback issues, running out of memory, etc.
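
As a rough sketch of the snapshot idea, assuming SQLite in WAL mode as a
stand-in for whatever database the server really uses (the repos table,
its columns, and the function names are made up for illustration): all
pages are read inside one read transaction, so an update that lands
between two pages is not visible to the reader.

```python
import os
import sqlite3
import tempfile


def read_pages_in_snapshot(db_path, page_size):
    """Yield pages of (name, updated_at) rows, all read in one transaction."""
    # isolation_level=None: we issue BEGIN/COMMIT ourselves.
    conn = sqlite3.connect(db_path, isolation_level=None)
    try:
        conn.execute("BEGIN")  # deferred; the snapshot starts at the first read
        offset = 0
        while True:
            rows = conn.execute(
                "SELECT name, updated_at FROM repos "
                "ORDER BY name LIMIT ? OFFSET ?",
                (page_size, offset),
            ).fetchall()
            if not rows:
                break
            yield rows
            offset += page_size
        conn.execute("COMMIT")  # close the transaction as soon as we are done
    finally:
        conn.close()


def demo():
    """Read two pages while a writer updates rows between the pages."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "repos.db")
        writer = sqlite3.connect(path, isolation_level=None)
        # Assumption: WAL mode, so the writer can commit while a reader is open.
        writer.execute("PRAGMA journal_mode=WAL")
        writer.execute(
            "CREATE TABLE repos (name TEXT PRIMARY KEY, updated_at INTEGER)"
        )
        writer.executemany(
            "INSERT INTO repos VALUES (?, ?)",
            [("a", 1), ("b", 1), ("c", 1), ("d", 1)],
        )
        seen = []
        for page in read_pages_in_snapshot(path, 2):
            seen.extend(page)
            # Two repositories are updated while the reader is mid-way.
            writer.execute(
                "UPDATE repos SET updated_at = 2 WHERE name IN ('b', 'd')"
            )
        latest = writer.execute(
            "SELECT updated_at FROM repos WHERE name = 'd'"
        ).fetchone()[0]
        writer.close()
        return seen, latest
```

Here the reader should still report the old updated_at for every
repository, while a fresh query afterwards sees the new value. The cost
is the one mentioned above: the transaction stays open for as long as
the client takes to read all pages.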

However, I find it interesting to look at HTTP Ranges with triples as
the unit, and that is something this and other W3C groups should be
able to push through at the IETF.
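
To make that a bit more concrete, here is a minimal sketch of how a
server might answer a Range request counted in triples. The "triples"
range unit is hypothetical (nothing like it is registered), and the
function name and return shape are invented for this example; only the
206/416 status codes and the Content-Range and Accept-Ranges headers
come from HTTP itself.

```python
import re


def serve_triple_range(triples, range_header):
    """Return (status, headers, body) for a hypothetical 'triples' range unit.

    triples is a list of serialized triples (e.g. N-Triples lines);
    range_header is the raw Range header value, or None if absent.
    """
    total = len(triples)
    match = re.fullmatch(r"triples=(\d+)-(\d*)", range_header or "")
    if match is None:
        # No (or unsupported) Range header: send everything, advertise the unit.
        return 200, {"Accept-Ranges": "triples"}, list(triples)

    first = int(match.group(1))
    # An open-ended range ("triples=10-") runs to the last triple.
    last = int(match.group(2)) if match.group(2) else total - 1
    last = min(last, total - 1)
    if first > last:
        # Requested range starts beyond the end of the resource.
        return 416, {"Content-Range": f"triples */{total}"}, []

    headers = {"Content-Range": f"triples {first}-{last}/{total}"}
    return 206, headers, triples[first:last + 1]
```

A client would then ask for "Range: triples=0-99", read the total from
Content-Range, and continue with "triples=100-199" and so on.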

One could perhaps wish for an even higher-level unit for LDP (per
member) so that you don't have to stop midway through because a
particular member has a few extra triples. If you call that something
generic like "members" it could be applicable even to old-school REST
APIs.
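
A small sketch of what a "members" unit could mean in practice, under
the assumption that each member's triples are stored contiguously and
share the member as subject (the function name and tuple layout are
illustrative only): the range counts whole members, so a slice never
ends in the middle of one.

```python
from itertools import groupby


def member_range(triples, first, last):
    """Return all triples of members first..last (inclusive, 0-based).

    triples is a list of (subject, predicate, object) tuples in which the
    triples of one member are adjacent and have the member as subject.
    """
    # One group per member, in the order the members appear.
    groups = [list(group) for _, group in groupby(triples, key=lambda t: t[0])]
    selected = groups[first:last + 1]
    # Flatten back to triples; every selected member is included in full.
    return [triple for group in selected for triple in group]
```

A member with a few extra triples simply makes its slice a little
longer instead of pushing the cut-off point into the next member.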

-- 
Stian Soiland-Reyes, myGrid team
School of Computer Science
The University of Manchester
http://soiland-reyes.com/stian/work/ http://orcid.org/0000-0001-9842-9718

Received on Friday, 13 June 2014 15:15:51 UTC