Re: Fwd: Re: Can we use TCP backpressure instead of paging? from Sandro Hawke on 2014-06-11 (public-ldp@w3.org from June 2014)

From: Sandro Hawke <sandro@w3.org>
Date: Wed, 11 Jun 2014 17:27:49 -0400
To: Andreas Kuckartz <a.kuckartz@ping.de>, public-ldp@w3.org
Message-ID: <5398C9D5.5050807@w3.org>
On 06/11/2014 05:20 AM, Andreas Kuckartz wrote:
> These communities could and should be involved in addition to LDP:

In general, our options as a working group are to (1) officially ask
another group to review our stuff, which they might or might not do, and 
(2) just announce it to the world, and hope one of their members notices 
and they read it and comment.   It would be good to maintain a list of 
groups for (1).

On your list:

> - http

Definitely.   We have several points of coordination there, but asking 
for a formal review of the paging spec on going to LC would be good.

> - xmpp

The style seems so different that it doesn't seem like a useful review to
me.   Am I missing something?

> - W3C Social Interest Group (which hopefully starts in a few days)

Why the SocWeb IG instead of WG?   We have lots of coordination planned 
with the SocWeb WG (including me and Arnaud).

> - Hydra
> - Linked Data Fragments

My sense was these were academic efforts, not organizations/groups.   
I'd hope people involved in those efforts would be on this list.


> Some of them potentially can influence browser manufacturers.

On the tcp backpressure thing, I think we have more direct routes.

> I also think that this might be relevant for the IoT.

Like everything else :-)

        -- Sandro


> Cheers,
> Andreas
> ---
>
> David Booth:
>> Wow, very intriguing idea!  This really seems worth further
>> investigation.  If this approach works out it could address the whole
>> paging issue across the board rather than piecemeal by each spec.
>>
>> David
>>
>> On 06/10/2014 02:03 PM, Sandro Hawke wrote:
>>> On 06/10/2014 01:34 PM, Andreas Kuckartz wrote:
>>>> -------- Original Message --------
>>>> To: Sandro Hawke <sandro@w3.org>
>>>> CC: Linked Data Platform WG <public-ldp-wg@w3.org>
>>>>
>>>> Thanks a lot for thinking outside the current box of paging. I have
>>>> looked at several paging approaches and do not really like any of them.
>>>> They contaminate the real data and/or seem to be unnecessarily complex
>>>> to implement.
>>> You're welcome.     I did a little more investigation today, including
>>> writing a tiny node.js server that streams data and lets me see
>>> what happens when I do things on the client.    This little bit of
>>> testing was encouraging.
>>>
>>> I also spoke to a few people at lunch here at the W3C AC meeting and
>>> again no one saw any serious problem.
>>>
>>> No clue yet how likely it is that browser vendors might implement an
>>> extension to xhr to allow WebApps to use this properly.
>>>
>>>           -- Sandro
>>>
>>>> Cheers,
>>>> Andreas
>>>> ---
>>>>
>>>> Sandro Hawke:
>>>>> Thinking about paging a little, I'm really wondering if one isn't
>>>>> better
>>>>> off using TCP backpressure instead of explicit paging.  It would have
>>>>> the huge advantage of requiring little or no special code in the client
>>>>> or the server, if they already implement high-performance streaming.
>>>>> (I started thinking about this because as far as I can tell, if we want
>>>>> to allow LDP servers to initiate paging, we have to require every LDP
>>>>> client to understand paging.   That's a high bar.   If you want to
>>>>> respond to that particular point, please change the subject line!)
>>>>>
>>>>> The key point here is that TCP already provides an elegant way to
>>>>> handle
>>>>> arbitrarily large data flows to arbitrarily small devices on poor
>>>>> connections.    If someone knows of a good simple explanation of this,
>>>>> please send along a pointer.   My knowledge is largely pre-web.
>>>>>
>>>>> In web software we often think of HTTP operations as atomic steps
>>>>> that take an arbitrarily long time.   With that model, doing a GET on a
>>>>> 100G resource is pretty much always a mistake.  But nothing in the web
>>>>> protocols requires thinking about it that way.   Instead, one can think
>>>>> of HTTP operations as opening streams which start data flowing.
>>>>>
>>>>> In some situations, those streams will complete in a small number of
>>>>> milliseconds, and there was no advantage to thinking of it as a stream.
>>>>>     But once you hit human response time, it starts to make sense to be
>>>>> aware that there's a stream flowing.     If you're a client doing a
>>>>> GET,
>>>>> and it's taking more than maybe 0.5s, you can provide a better UX by
>>>>> displaying something for the user based on what you've gotten so far.
>>>>>
>>>>> What's more, if the app only needs the first N results, it can read the
>>>>> stream until it gets N results, then .abort() the xhr.   The server may
>>>>> produce a few more results than were consumed before it knows about the
>>>>> .abort(), but I doubt that's too bad in most cases.
>>>>>
>>>>> The case that's not handled well by current browsers is pausing the
>>>>> stream.   In theory, as I understand it (and I'm no expert), this
>>>>> can be
>>>>> done by simply using TCP flow control.   A non-browser app that stops
>>>>> reading data from its socket will exert backpressure, eventually
>>>>> resulting in the process writing data finding the stream's not ready
>>>>> for
>>>>> writing.   My sense is that can and does work rather well in a wide
>>>>> variety of situations.
>>>>>
>>>>> Unfortunately, as I understand it, this doesn't work in WebApps today,
>>>>> because the browser will just keep reading and buffering until it runs
>>>>> out of VM.   If instead xhr (and websockets) had a limit on how much it
>>>>> would buffer, and webapps could set that (and probably it starts around
>>>>> 10M), then a WebApp that stopped consuming data would produce
>>>>> backpressure that would result in the server learning it can't send any
>>>>> more yet.     When the WebApp consumes more data, the server can start
>>>>> sending again.
>>>>>
>>>>> I'm very curious if there's any technical reason this won't work.   I
>>>>> understand there may be problems with some existing software, including
>>>>> browsers, not handling this kind of streaming.  But is there
>>>>> anything in
>>>>> the basic internet protocols and implementations that makes this not
>>>>> work?     For instance, it may be that after blocking for a long time
>>>>> (minutes, waiting for the user to request more), restarting is too
>>>>> slow,
>>>>> or something like that.
>>>>>
>>>>> One possible workaround for the lack of browser support would be for
>>>>> servers to be a bit smarter and make some guesses.  For example, a
>>>>> server might say that requests with User-Agent being any known browser
>>>>> should be served normally for the first 10s, then drop to a much slower
>>>>> speed, consuming resources in the server, the net, and the client.
>>>>> WebApps that want to sidestep this could do so with a Prefer header,
>>>>> like Prefer initial-full-speed-duration=1s or 1000s.    At some point,
>>>>> when browsers allow webapp backpressure, those browser User-Agent
>>>>> strings could be exempted from this slowdown.
>>>>>
>>>>>        -- Sandro
>>>>
>>>>
>>>
>>>
>>>
>>>
>>
>>
>
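
For readers who want to see the mechanism from the quoted message in code,
here is a minimal sketch. It assumes Node.js and is not the test server
mentioned in the thread. An in-memory Writable stands in for a TCP socket
whose peer has stopped reading; makeStalledSink, writeUntilBackpressure and
the 64-byte buffer limit are hypothetical names and values chosen for
illustration. On a real socket, write() returning false is the point where
a well-behaved server would stop producing data until 'drain' fires.

```javascript
const { Writable } = require('stream');

// Hypothetical stand-in for a TCP socket whose peer has stopped reading:
// it accepts chunks but does not acknowledge them until release() is called.
function makeStalledSink(highWaterMark) {
  const pending = [];
  const sink = new Writable({
    highWaterMark, // bytes buffered before write() starts returning false
    write(chunk, encoding, callback) {
      pending.push(callback); // hold the acknowledgement back
    },
  });
  // Acknowledge everything held so far, like a client that reads again.
  sink.release = () => {
    while (pending.length > 0) pending.shift()();
  };
  return sink;
}

// Write records until the sink signals backpressure or maxRecords is reached.
// Returns the number of records handed to the sink.
function writeUntilBackpressure(sink, record, maxRecords) {
  let written = 0;
  while (written < maxRecords) {
    written += 1;
    if (!sink.write(record)) break; // false: wait for 'drain' before writing more
  }
  return written;
}

const sink = makeStalledSink(64);
sink.on('drain', () => console.log('consumer caught up, producer may resume'));
const written = writeUntilBackpressure(sink, 'x'.repeat(16), 1000);
console.log(`producer paused after ${written} of 1000 records`);
sink.release();
```

The producer never needs to know the size of the whole result set; it only
reacts to the false return value and the 'drain' event, which is the "little
or no special code" property the original message is after.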
Received on Wednesday, 11 June 2014 21:27:57 UTC