Re: Sending very large chunks over the data channel from Harald Alvestrand on 2014-05-28 (public-webrtc@w3.org from May 2014)

From: Harald Alvestrand <harald@alvestrand.no>
Date: Wed, 28 May 2014 12:44:10 +0200
To: Tim Panton new <thp@westhawk.co.uk>
CC: public-webrtc@w3.org
Message-ID: <5385BDFA.6000406@alvestrand.no>
On 05/28/2014 12:35 PM, Tim Panton new wrote:
>
> On 28 May 2014, at 11:02, Harald Alvestrand <harald@alvestrand.no> wrote:
>
>> On 05/28/2014 11:46 AM, tim panton wrote:
>>> On 28 May 2014, at 10:18, Harald Alvestrand <harald@alvestrand.no> wrote:
>>>
>>>> On 05/28/2014 10:52 AM, Wolfgang Beck wrote:
>>>>> Adding another data transfer protocol on top of SCTP will not solve 
>>>>> your problem.
>>>>>
>>>>> The Websocket-style API is the problem.
>>>>> It does not allow the JS to delay reception and does not tell you 
>>>>> when it is appropriate to send more data.
>>>>>
>>>>> Sending a chunk and waiting for an ACK? That means you will spend 
>>>>> most of the time waiting for ACKs instead of
>>>>> transmitting data. Of course you can somehow negotiate how many 
>>>>> chunks you can send without having to
>>>>> wait for an ACK. Now you have re-implemented a substantial part of 
>>>>> SCTP, probably with more errors and less sophistication.
>>>>>
>>>>> What's wrong with the Streams API?
>>>> The first thing wrong about the Streams API as described in the 
>>>> link below is that it does not preserve message boundaries; a 
>>>> Stream is a sequence of bytes.
>>>>
>>>> Our chosen abstraction is a sequence of messages.
>>>>
>>>> Something like the Streams API may be a Good Thing (and applicable 
>>>> to websockets too), but the current proposal just has the wrong 
>>>> model for our purposes.
>>>>
>>>> If you have a suggestion to bridge the gap, please bring it forward.
>>> Thinking some more about the back pressure issue, how about an 
>>> optional callback on send()
>>> onSendSpaceAvailable(int amount)
>>> which gets called whenever it is next possible to send, and is passed 
>>> the block size that is available.
>>
>> I think a lot could be done if we just defined the semantics of send():
>>
>> - send either succeeds fully or fails fully. There is no partial 
>> success. (Needed to ensure integrity of messages.)
>> - the channel should stay up after a failed send (nothing got sent, 
>> this is not fatal)
>
> I have a philosophical problem with that. Say the requested 
> semantics of the channel are: sequenced, reliable message delivery.
> Now you have the possibility that one of the messages passed to send() 
> will never be delivered, but a subsequent one will.

That's why I want send() to have a strictly binary outcome: 
either it's sent, or it's not.
If it's not sent, it doesn't enter the sequence.

> If you are sending incremental changes (think database transactions) 
> and one in the middle is dropped but the channel
> remains operational, you have broken that semantic.

Only if the sending code assumes that a failure is ignorable. Caveat emptor.
(Many files have been lost to code written under the assumption that 
write() will always succeed... "what do you mean - disk full?")
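
A minimal sketch of what these semantics could look like to calling code, 
assuming a simplified in-memory model. The names ModelChannel, SendResult, 
drain and sendOrReport are hypothetical and not part of any WebRTC API; the 
two failure values stand in for the "buffer is full, try later" and "too big 
to ever send" codes proposed in this thread.

```typescript
// Hypothetical model of the proposed send() semantics; not a real WebRTC API.
type SendResult = "sent" | "try-later" | "too-big";

class ModelChannel {
  // Messages that entered the sequence, in the order they were accepted.
  readonly accepted: Uint8Array[] = [];
  private buffered = 0; // bytes currently waiting in the simulated send buffer
  private readonly bufferLimit: number;
  private readonly maxMessageSize: number;

  constructor(bufferLimit: number, maxMessageSize: number) {
    this.bufferLimit = bufferLimit;
    this.maxMessageSize = maxMessageSize;
  }

  // All-or-nothing: the message is accepted whole or not at all.
  // A failed send leaves the channel usable and the sequence untouched.
  send(message: Uint8Array): SendResult {
    const size = message.byteLength;
    if (size > this.maxMessageSize || size > this.bufferLimit) return "too-big";
    if (this.buffered + size > this.bufferLimit) return "try-later";
    this.buffered += size;
    this.accepted.push(message);
    return "sent";
  }

  // Simulates the transport freeing `bytes` of buffer space.
  drain(bytes: number): void {
    this.buffered = Math.max(0, this.buffered - bytes);
  }
}

// A sender that does not assume send() always succeeds.
function sendOrReport(channel: ModelChannel, message: Uint8Array): string {
  switch (channel.send(message)) {
    case "sent":
      return "ok";
    case "try-later":
      return "back off and retry after the buffer drains";
    case "too-big":
      return "split the message or give up";
  }
}
```

In this model a rejected message never appears in `accepted`, so the 
sequence seen by the receiver has no gap unless the sender chooses to skip 
the failed message and carry on.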

>
>> - the code for "the buffer is full, try later" and "this message is 
>> too big to ever send" should be different (so that we can know the 
>> difference between "back off" and "you must be kidding")
>
> perhaps that works for the unreliable case, but I don't like it for 
> reliable.
>
>>
>> There's a difficulty in the callback definition you suggest - I think 
>> it needs to say what size it is waiting for. Otherwise, we get a 
>> sequence like:
>>
>> -> send(20000) -> fail, temporary too big
>> <- spaceAvailable(100)
>> -> send(20000) -> fail
>> <- spaceAvailable(1000)
>> -> send(20000) -> fail
>>
>> and so on. That's not right.
>
> Agreed. Given that, perhaps we need a bulkSend() that takes an array 
> of messages or even a function that generates messages.
> bulkSend(function (avail) { return /* next message of at most avail bytes */; });

Now we're getting complicated :-)
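
For comparison, a simpler shape for the callback idea, in which the waiter 
states the size it needs so that it is only woken when a retry can succeed. 
This is a sketch under assumptions: SpaceNotifier, waitForSpace and release 
are hypothetical names, and the buffer is simulated rather than tied to a 
real data channel.

```typescript
// Hypothetical model: each waiter says how many bytes it needs, so it is
// not woken by small amounts of free space that cannot fit its message.
type Waiter = { needed: number; wake: () => void };

class SpaceNotifier {
  private free: number;
  private waiters: Waiter[] = [];

  constructor(initialFree: number) {
    this.free = initialFree;
  }

  // Ask to be called once `needed` bytes of buffer space are available.
  waitForSpace(needed: number, wake: () => void): void {
    this.waiters.push({ needed, wake });
    this.wakeReady();
  }

  // Called by the simulated transport when `bytes` of buffer are released.
  release(bytes: number): void {
    this.free += bytes;
    this.wakeReady();
  }

  // Wake waiters in FIFO order, stopping at the first one that does not fit,
  // so that message order is preserved.
  private wakeReady(): void {
    for (;;) {
      const head = this.waiters[0];
      if (head === undefined || head.needed > this.free) break;
      this.waiters.shift();
      this.free -= head.needed; // reserve the space for the woken sender
      head.wake();
    }
  }
}
```

With a waiter registered for 20000 bytes, releases of 100 and then 1000 
bytes do not trigger the callback; it fires once, when the full 20000 bytes 
are free.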

>
> T.
>
>>
>>
>>>
>>> T.
>>>
>>>>> Wolfgang
>>>>>
>>>>> On 05/27/14 09:37, Stefan Håkansson LK wrote:
>>>>>> This was discussed at the f2f, and the Streams API was mentioned, 
>>>>>> but as
>>>>>> Harald pointed out yesterday the applicability of Streams with 
>>>>>> the data
>>>>>> channel is not clear yet.
>>>>>>
>>>>>> But there is another option that is supported right now. The Blob
>>>>>> (defined in http://dev.w3.org/2006/webapi/FileAPI/) supports slicing
>>>>>> data into smaller parts (and of course re-assembling them).
>>>>>> So, it is quite simple to split a large chunk into smaller ones, and
>>>>>> then add some simple acking on the app layer (hold back sending the
>>>>>> next slice until the previous one is acked).
>>>>>>
>>>>>> This is not elegant, but should work.
>>>>>>
>>>>>> The quota API 
>>>>>> (https://dvcs.w3.org/hg/quota/raw-file/tip/Overview.html)
>>>>>> allows for a bit more sophistication, but it seems to be supported by
>>>>>> Chrome only (and then only an older version of the API).
>>>>>>
>>>>>> Stefan
>
Received on Wednesday, 28 May 2014 10:44:47 UTC