Re: Using ArrayBuffer as payload for binary data to/from Web Workers

On Tue, May 31, 2011 at 11:33 AM, Travis Leithead
<Travis.Leithead@microsoft.com> wrote:
>> > > The editors' draft of the typed array spec has been updated with a
>> > > strawman proposal for this zero-copy, transfer-of-ownership behavior:
>> > >
>> > > http://www.khronos.org/registry/typedarray/specs/latest/
>> > >
>> > > Feedback would be greatly appreciated. For the purposes of keeping the
>> > > conversation centralized, it might be helpful if we could use the
>> > > public_webgl list; see
>> > > https://www.khronos.org/webgl/public-mailing-list/ .
>> >
>> > While I see the need for this, I think it will be very surprising to
>> > authors that for all other data, postMessage is purely a read-only
>> > action. However for ArrayBuffers it would not be. There are two ways
>> > we can improve this situation:
>> >
>> > 1. Add a separate method next to postMessage which has the prescribed
>> > functionality. This also has the advantage that it lets users choose
>> > if they want the transfer-ownership functionality or not, for example
>> > for cases when performance isn't as big a requirement, and when
>> > ArrayBuffers are small enough that the transferring ownership logic
>> > adds more overhead than memory copying logic would.
>> >
>> > 2. Add a separate argument to postMessage, similar to the 'ports'
>> > argument, which contains a list of array buffers whose ownership
>> > should be transferred.
>> >
>>
>> Riffing off idea #2, the second argument could be an array of objects whose
>> ownership should be transferred. For now only ArrayBuffers would be legal
>> objects but at some point in the future other types of objects could be
>> added (not sure what those objects would be), but that's a much more flexible
>> interface than #1: you can choose to copy some ArrayBuffers and transfer
>> others.
>
> I tend to agree with Jonas on this one: having an ArrayBuffer stop working on either the primary document or a web worker after posting it seems like a bad developer experience by default. Having an opt-in transfer of ownership seems like a better idea, though I don't like special-casing ArrayBuffers, as I'd probably want to do this for large ImageData objects as well (with their associated CanvasPixelArrays).
>
> After discussing this a bit internally, we raised four major arguments against default transfer of ownership for TypedArrays:
>
> 1. User complexity:  Transfer of ownership is more complicated for developers, and does not fit with the silent and unobtrusive model of cloning that is typical of SCA. An operation that makes the object unusable should be explicit, not implicit.
>
> 2. SCA behavior for TypedArrays should align with Blob, etc.:  Using transfer of ownership for one and not the other will lead to user confusion. Generally, SCA has not embraced transfer of ownership as the user model, and we don't believe TypedArrays should default to this very different clone behaviour.
>
> 3. Cross-thread assumption: The idea of transfer of ownership semantics strongly suggests usage of cloning across threads.  SCA is used in many other scenarios where transfer of ownership does not add value, but does hurt usability. For example, Workers implemented to run in another process (a Web Worker implementation detail), IndexedDB (long-term storage of an SCA object graph), etc. In the case of IndexedDB for example, transfer of ownership has an unexpected semantic, as the database itself doesn't have a notion of ownership.
>
> 4. Split definitions of SCA algorithm: Having the specification of SCA behaviour for TypedArrays be separate from the definition in the HTML5 spec is likely to lead to continued divergence of the SCA algorithm.  It would be better to define this in one spec (i.e., in HTML5).

I agree that it would be better to generalize the transfer of
ownership mechanism to support more types in the future. The original
motivation for making transfer of ownership the behavior for Typed
Arrays under structured clone and/or postMessage was solely to
minimize changes to the HTML spec; it would be better to come up with
a more general solution.

Jonas's suggestion of adding another argument to postMessage, and
Gregg's generalization to declare it as an array of objects to be
transferred rather than copied, sound good. Adding a transferMessage
API doesn't sound as good since it will require larger code changes to
take advantage of it, and is less flexible. I'll investigate updating
the typed array proposals in this direction.
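
As a rough sketch of what the extra argument could look like from the
author's side, the snippet below posts two buffers through a
MessageChannel and lists only one of them for transfer. It uses the
transfer-list form of postMessage as found in current runtimes, which
is an assumption here and may differ in detail from whatever the
updated proposal ends up specifying. The variable names are
illustrative only.

```javascript
// Sketch only: the second argument lists the objects whose ownership
// is transferred; everything else in the message is copied as usual.
const { port1, port2 } = new MessageChannel();

const copied = new ArrayBuffer(8);         // not listed: cloned, stays usable
const transferred = new ArrayBuffer(1024); // listed: ownership moves away

port1.postMessage({ copied, transferred }, [transferred]);

// The sender keeps the copied buffer but loses access to the
// transferred one, which now reports a length of zero.
const copiedLengthAfter = copied.byteLength;
const transferredLengthAfter = transferred.byteLength;

// Close the ports so this standalone sketch can exit cleanly.
port1.close();
port2.close();
```

Because the choice is made per object, a caller can keep copying small
buffers and transfer only the large ones.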

Transfer of ownership of buffers is valuable even when Workers are
implemented in another process. The first time an ArrayBuffer is
posted from a Worker to the primary document, its storage could be
promoted (as an implementation detail) to shared memory. Once the
document is done with the data, it would post the buffer back to the
worker for re-filling. Subsequent ping-ponging would not involve any
further data copies. This is the primary goal of these typed array
spec updates: to enable efficient producer-consumer queues between
workers and the document.
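
The ping-pong pattern can be sketched in a single script. To keep it
self-contained, the hypothetical helper transferTo stands in for
postMessage with the buffer listed for transfer, and is built on
structuredClone with its transfer option; a real producer-consumer
queue would have a Worker on one side and the document on the other.
Whether the hand-off avoids a copy is up to the implementation.

```javascript
// Hypothetical helper: stands in for posting `buffer` to the other
// side with ownership transferred. The caller's reference is detached
// and the returned ArrayBuffer is what the receiver would see.
function transferTo(buffer) {
  return structuredClone(buffer, { transfer: [buffer] });
}

let workerBuffer = new ArrayBuffer(4);
const seen = [];

for (let round = 1; round <= 3; round++) {
  // Producer (the worker): fill the buffer, then hand it off.
  new Uint8Array(workerBuffer).fill(round);
  const documentBuffer = transferTo(workerBuffer);

  // Consumer (the document): read the data, then return the
  // buffer to the worker for re-filling.
  seen.push(new Uint8Array(documentBuffer)[0]);
  workerBuffer = transferTo(documentBuffer);
}
```

Only one side holds a usable reference to the storage at any time, so
the two sides never observe each other's writes in progress.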

Note that the current Typed Array strawman proposals are specifically
written to not affect IndexedDB, pushState, or any other HTML5 API
using the structured clone algorithm: the only affected consumer is
postMessage. The intent is to request that any proposals in the typed
array spec be folded into the appropriate sections of the HTML5 spec
once they are fully baked.

-Ken

Received on Tuesday, 31 May 2011 22:05:52 UTC