Re: [WebSocket API] .binaryType

On Thu, 1 Sep 2011, Jonas Sicking wrote:
> 
> 1. It doesn't allow receiving textual messages as Blobs, only binary
> messages can be stored as blobs.

Huh. I never considered that people might want to have the server send 
stuff as text but still want to treat it as a Blob on the receiving side.

Can't the script just wrap it in a blob instead of asking the browser to 
provide it in blob form?
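
As a rough sketch of what wrapping on the script side could look like, 
assuming an environment that provides the Blob constructor (the helper name 
wrapTextAsBlob and the store() call in the usage comment are made up for 
illustration, not part of any API):

```javascript
// Hypothetical helper: wrap a text frame's payload in a Blob in script,
// rather than asking the browser to deliver it in blob form.
function wrapTextAsBlob(data) {
  // Text frames are delivered as strings; anything else is passed through.
  if (typeof data === "string") {
    return new Blob([data], { type: "text/plain" });
  }
  return data;
}

// Usage inside a message handler (socket is an assumed, already-open
// WebSocket and store() is a placeholder for the page's own storage code):
// socket.onmessage = function (event) { store(wrapTextAsBlob(event.data)); };
```

Note that this still has the browser build the string first, so it does not 
avoid holding the text in memory; it only gives the script a Blob to store.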

We can certainly support this case if there is a need, of course.

The reason we added binaryType was that we expect binary frames to be huge 
in some cases, and expect that scripts will not need to see the precise 
bits, e.g. because the frame is an image to be stuffed in an <img> or whatnot.
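
A minimal sketch of that image case, assuming binaryType is left at "blob" 
and that URL.createObjectURL is available; showFrame is a hypothetical 
helper and img stands for any object with a src property:

```javascript
// Hypothetical handler: display a binary frame as an image without the
// script ever reading the frame's bytes.
function showFrame(img, event) {
  if (typeof event.data === "string") {
    return false; // text frame, not image data
  }
  // With binaryType set to "blob", event.data is a Blob here.
  img.src = URL.createObjectURL(event.data);
  return true;
}

// Usage (socket and imgElement are assumed to exist in the page):
// socket.binaryType = "blob";
// socket.onmessage = function (event) { showFrame(imgElement, event); };
```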

Also, there is a concern with text frames that if we make it possible to 
not get the text, a simple bug where the incoming frame data is switched 
to not-text and not switched back would essentially kill the connection's 
logic. This is also a concern with binary frames, but the assumption is 
that most authors will send their control messages as text, so that having 
incoming binary data in the wrong format will just cause comparatively 
minor problems, not kill the connection's logic.
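
To illustrate the assumption, here is a sketch of a handler that keeps 
control messages on text frames and treats binary frames as payload. The 
names makeHandler, onControl and onPayload are invented for the example:

```javascript
// Hypothetical dispatcher: control messages travel as text frames,
// bulk payloads travel as binary frames.
function makeHandler(onControl, onPayload) {
  return function (event) {
    if (typeof event.data === "string") {
      // Text frames always arrive as strings, whatever binaryType is set to,
      // so the control logic keeps working.
      onControl(event.data);
    } else {
      // Binary frames arrive as a Blob or an ArrayBuffer, per binaryType.
      // A wrong binaryType only affects this branch.
      onPayload(event.data);
    }
  };
}

// Usage (socket is an assumed, already-open WebSocket):
// socket.onmessage = makeHandler(handleControl, handlePayload);
```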


> 2. Since the binaryType can be changed at any time, this makes the
> implementation somewhat complex and slower.

Yes. This is a necessary result of the client not having advance warning 
of what the server is going to send until the client has already started 
receiving the bytes from the binary frame (the text frame preceding the 
binary frame and warning the client about what is about to arrive will 
typically come in the same TCP packet as the start of the binary frame).


> Regarding 1:
> The use case for blobs in general would be downloading large messages
> where the contents of the message won't be immediately used, but
> rather stored for later use. For example an offline webmail page could
> use websocket to synchronize emails and their attachments. Here the
> receiving page would want to receive the attachments as Blobs to be
> stored in IndexedDB or the Filesystem API. As the WebSocket API spec
> is currently defined, this only works if the attachments are sent in a
> binary format. So for example HTML attachments can't be sent as text.
> 
> I've never quite understood why only binary messages fulfill the
> property of "being large and will only be used later".

Why wouldn't you just store the attachment as text? Why would the server 
send some attachments with one frame type and others with another frame 
type? I don't really understand why the server wouldn't just use binary 
for everything here, especially since the client side is going to treat 
them all the same anyway.


> Regarding 2:
> Since .binaryType can be changed any time, you can end up in the
> situation where the implementation has streamed the incoming data to
> disk, and at the last second the binaryType changes from "blob" to
> "arraybuffer". At this point the implementation will have to start
> reading back the data from disk and stall any message events until the
> data has been fully read back. However before firing the event the
> implementation will have to check that .binaryType hasn't changed
> again, and if it has, possibly stream the data back to disk.
> 
> There are two changes that I propose in order to fix these two issues:
> 
> 1. Change .binaryType to .dataAsBlob. dataAsBlob would be a boolean
> which if set to true would make data be delivered as a Blob rather
> than a String or ArrayBuffer. Another alternative would be to have a
> .nextAsBlob property. nextAsBlob would work like dataAsBlob, but is
> set to false right before firing each message event (but of course
> after constructing the Event object with the actual data).

The theory behind the current design is that you want to default to blobs 
for binary frames, because that way clients that aren't expecting binary 
frames yet get them anyway will not waste memory.


> 2. Allow .binaryType/.dataAsBlob to be set only during message and open
> events. This would allow the implementation to, for example, hold
> incoming data in memory until after the previous message event has been
> fully dispatched. This way no unnecessary IO is done. Under normal
> circumstances (where the bandwidth to the server is proportional to
> today's bandwidths, and where developers try to maintain a responsive
> UI) this implementation strategy will work quite well. More complex
> implementation strategies can of course still be used, but in either
> case the implementation doesn't have to worry about the format
> changing once the previous event has been dispatched.

I'm happy to limit binaryType to only being changed when a message is 
received, but in some simple situations that seems like it would make the 
API needlessly complex. Consider a server that just never sends anything 
unsolicited, paired with a client that never has two pending requests at a 
time -- it always waits for the previous reply before asking for something 
else. This would not be a sophisticated app, but it nevertheless is likely 
to describe many typical deployments. In such cases, the client knows what 
it wants to do with the next frame as soon as it asks the server for it, 
and it will not receive a frame between sending the request and receiving 
the frame for which it wants the data in a particular format. The author 
could have the server send a blank text frame just before the binary frame 
just so that the client could set binaryType accordingly, but that would 
obviously just be the author working around an API limitation.
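
A sketch of that simple client, under the current design where binaryType 
can be set at any time. The request helper, its parameters, and the message 
strings in the usage comment are all hypothetical:

```javascript
// Hypothetical helper for a strict request/reply client: the format of the
// next binary reply is chosen at the moment the request is sent, because no
// other frame can arrive in between.
function request(socket, message, format, onReply) {
  socket.binaryType = format; // "blob" or "arraybuffer"
  socket.onmessage = function (event) {
    onReply(event.data);
  };
  socket.send(message);
}

// Usage (socket is an assumed, already-open WebSocket):
// request(socket, "get thumbnail 42", "blob", showThumbnail);
// ...and only after that reply has arrived:
// request(socket, "get samples 7", "arraybuffer", decodeSamples);
```

If binaryType could only be changed while a message event is being 
dispatched, the first call here would have nowhere to set the format.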


(If the protocol were still being developed in tight coordination with the 
API, I might propose that we have four frame types -- text, text blob, 
binary arraybuffer, and binary blob.)

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Saturday, 3 September 2011 19:33:11 UTC