[whatwg] WebSocket bufferedAmount includes overhead or not

On 3/25/10 12:08 AM, Olli Pettay wrote:
> On 3/24/10 11:33 PM, Ian Hickson wrote:
>> On Sun, 21 Feb 2010, Olli Pettay wrote:
>>>
>>> I propose that bufferedAmount doesn't take into account the bits added by
>>> the protocol. This way if the protocol is later changed, web developers
>>> don't need to change their code because of the way they rely on
>>> bufferedAmount.
>>
>> On Thu, 4 Mar 2010, Fumitoshi Ukai (鵜飼文敏) wrote:
>>>
>>> I noticed that the WebSocket spec was updated to not include framing
>>> overhead in bufferedAmount.
>>> http://lists.whatwg.org/pipermail/commit-watchers-whatwg.org/2010/003971.html
>>>
>>> I tried to implement it in WebKit, but found it hard to implement
>>> correctly. https://bugs.webkit.org/show_bug.cgi?id=35571
>>> It's easy after the WebSocket is closed (just add the length of the
>>> message), but while it's open, we manage a buffer that includes frame
>>> bytes, and the underlying socket will write an arbitrary length of that
>>> buffer (possibly not on a frame boundary). To get bufferedAmount
>>> correct without framing overhead, we need to parse the buffer again.
>>> It's not a light operation, and it's a challenge to make it efficient.
>>> I think including frame overhead is much easier.
>>
>> On Thu, 4 Mar 2010, Olli Pettay wrote:
>>>
>>> Not hard at all in gecko's implementation (the patch is still waiting
>>> for a review and will be possibly updated to include the latest changes
>>> to the protocol before pushing to hg repo).
>>
>> On Fri, 5 Mar 2010, Alexey Proskuryakov wrote:
>>>
>>> I was going to mention this as the primary reason why frame bytes should
>>> be included. JavaScript code needs this information for flow control,
>>> and it's raw bytes that are sent over the tubes, not original message
>>> strings.
>>>
>>> Also, I think it's a layering violation. In WebKit, we'd have to queue
>>> unsent messages separately just to implement this quirk (see
>>> https://bugs.webkit.org/attachment.cgi?id=50093 for a proof of concept).
>>> It becomes very difficult to implement if we decide to add the size of
>>> data that an underlying network library buffers internally - which I
>>> think would be a reasonable thing to do.
>>>
>>>> Also why to have framing bytes and not the bytes related to http
>>>> handling?
>>>
>>> Nothing would change for engines or JS code if HTTP headers were counted
>>> in bufferedAmount. Since they are only sent when establishing a
>>> connection, adding a small constant at the beginning will make no
>>> difference to flow control. And the constant is going to be zero in
>>> practice, because the data will immediately go where we can't see it.
>>
>> On Fri, 5 Mar 2010, Alexey Proskuryakov wrote:
>>>
>>> My recollection is that the feature was added as a result of discussions
>>> about implementing flow control. How else are you supposed to know that
>>> you're streaming too fast without modifying the server? Since WebSockets
>>> is a match for TCP/IP, and the latter provides ways to adaptively change
>>> data rate, it's natural that one expects the same from WebSockets.
>>
>> On Fri, 5 Mar 2010, Alexey Proskuryakov wrote:
>>>
>>> Yes, that's lots of work for something no one should care about, as you
>>> implied above. And that's work that makes the results slightly
>>> misleading, even if that's so slightly that it's not important in
>>> practice.
>>>
>>> Remembering frame offsets even after data has been serialized to a
>>> stream is an unusual requirement for networking code.
>>
>> On Fri, 5 Mar 2010, Olli Pettay wrote:
>>>
>>> From an API perspective I do care. A web developer shouldn't need to
>>> know about the protocol, yet (s)he should be able to understand what
>>> bufferedAmount means.
>>
>> On Fri, 5 Mar 2010, Alexey Proskuryakov wrote:
>>>
>>> An explanation like "it's how much data is buffered to be sent over
>>> the network" seems adequate to me.
>>
>> On Wed, 17 Mar 2010, Alexey Proskuryakov wrote:
>>>
>>> We have a suggested patch that implements the proposed new behavior for
>>> WebKit now, but I think that it adds unnecessary complexity, and puts
>>> limits on how we can refactor the code in the future. We need to
>>> remember frame boundaries for much longer, making it difficult to
>>> interface with general purpose networking code.
>>>
>>> I'd prefer sticking to the previously specified behavior.
>>
>> On Tue, 23 Mar 2010, Olli Pettay wrote:
>>>
>>> And I certainly prefer the current behavior, where the API is not so
>>> tightly bound to the protocol, and where bufferedAmount is handled
>>> more like what progress events do with XMLHttpRequest.
>>
>> On Tue, 23 Mar 2010, Anne van Kesteren wrote:
>>>
>>> We (Opera) would prefer this too. I.e. to not impose details of the
>>> protocol on the API.
>>
>> If we're exposing nothing from the protocol, does that mean we shouldn't
>> be exposing that the string converts to UTF-8 either?
>
> Yeah, I've been thinking about that too.
>
>
>>
>> I guess I'm unclear on whether bufferedAmount should return:
>>
>> 1. the sum of the count of characters sent?
>> (what would we do when we add binary?)
> I believe this is actually what we want.
> If a web developer sends a string which is X long,
> bufferedAmount should report X.
>
> And when we add binary, if a buffer of size Y is
> sent, that Y is added to bufferedAmount.

Though, this is a bit ugly too.
Mixing 16bit and 8bit data...

One option is to remove bufferedAmount,
and have just a boolean flag
hasBufferedData.

Or, better, the API spec could say that WebSocket.send()
converts the data to UTF-8 and that bufferedAmount
indicates how much UTF-8 data is buffered.
Then adding support for binary would be easy.
And that way it doesn't matter whether the protocol
actually sends the textual data as UTF-8 or as
something else.

This way a web developer can still check what part of the
data is still buffered. (S)he just has to convert
UTF-16 to UTF-8 in JS, when needed.
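
A minimal sketch of that conversion, assuming bufferedAmount is defined
as the number of buffered UTF-8 bytes with no framing overhead. The
helper names utf8Length and firstUnsentIndex are hypothetical and not
part of any API; the second helper only illustrates how a script might
map a byte count back onto the messages it passed to send().

```typescript
// Hypothetical helper: number of bytes a JS (UTF-16) string occupies
// once it has been encoded as UTF-8.
function utf8Length(text: string): number {
  let bytes = 0;
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    if (unit < 0x80) {
      bytes += 1; // ASCII
    } else if (unit < 0x800) {
      bytes += 2;
    } else if (
      unit >= 0xd800 && unit <= 0xdbff &&
      i + 1 < text.length &&
      (text.charCodeAt(i + 1) & 0xfc00) === 0xdc00
    ) {
      bytes += 4; // valid surrogate pair: one code point outside the BMP
      i++;        // skip the low surrogate
    } else {
      bytes += 3; // rest of the BMP; a lone surrogate is counted as U+FFFD
    }
  }
  return bytes;
}

// Hypothetical helper: given the messages passed to send(), in order,
// and a bufferedAmount counted in UTF-8 bytes (an assumption), return
// the index of the first message that is not yet fully sent.
// Returns messages.length when nothing is buffered.
function firstUnsentIndex(messages: string[], bufferedAmount: number): number {
  // The most recently sent messages are the ones still in the buffer,
  // so walk backwards until the buffered bytes are accounted for.
  let remaining = bufferedAmount;
  let index = messages.length;
  while (index > 0 && remaining > 0) {
    index--;
    remaining -= utf8Length(messages[index]);
  }
  return index;
}
```

With a definition that includes framing bytes, the same walk would need
to know the per-message overhead of the protocol in use.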


-Olli


>
> The reason why I'd like it to work this way is that
> IMO scripts should be able to check whether the data
> they have posted is actually sent over the network.
>
>
> -Olli
>
>
>>
>> 2. the sum of bytes after conversion to UTF-8?
>>
>> 3. the sum of bytes yet to be sent on the wire?
>>
>> I'm not sure how to pick a solution here. It sounds like WebKit people
>> want 3, and Opera and Mozilla are asking for 2. Is that right? I guess
>> I'll go with 2 unless more people have opinions.
>>
>
>
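
To make the difference between the three options concrete, here is a
small sketch that computes all three numbers for one text message. It
assumes that "count of characters" in option 1 means the string's
length in UTF-16 code units, and that option 3 adds a fixed per-message
framing overhead. The framingOverhead parameter and the countBuffered
name are hypothetical; the real overhead depends on the protocol
version in use.

```typescript
interface BufferedCounts {
  characters: number; // option 1: UTF-16 code units (string.length)
  utf8Bytes: number;  // option 2: bytes after conversion to UTF-8
  wireBytes: number;  // option 3: UTF-8 bytes plus assumed framing bytes
}

// Hypothetical helper comparing the three candidate definitions for a
// single text message. framingOverhead is an assumed constant number
// of bytes added per message by the protocol.
function countBuffered(message: string, framingOverhead: number): BufferedCounts {
  let utf8Bytes = 0;
  for (let i = 0; i < message.length; i++) {
    const unit = message.charCodeAt(i);
    if (unit < 0x80) {
      utf8Bytes += 1;
    } else if (unit < 0x800) {
      utf8Bytes += 2;
    } else if (
      unit >= 0xd800 && unit <= 0xdbff &&
      i + 1 < message.length &&
      (message.charCodeAt(i + 1) & 0xfc00) === 0xdc00
    ) {
      utf8Bytes += 4; // surrogate pair
      i++;
    } else {
      utf8Bytes += 3;
    }
  }
  return {
    characters: message.length,
    utf8Bytes: utf8Bytes,
    wireBytes: utf8Bytes + framingOverhead,
  };
}
```

For pure ASCII text options 1 and 2 give the same number; they diverge
as soon as the message contains non-ASCII characters.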

Received on Thursday, 25 March 2010 02:21:10 UTC