
Re: Blobs: An alternate (complementary?) binary data proposal (Was: File IO...)

From: Maciej Stachowiak <mjs@apple.com>
Date: Sun, 11 May 2008 15:02:37 -0700
Cc: "Web API WG (public)" <public-webapi@w3.org>, Aaron Boodman <aa@google.com>, Ian Hickson <ian@hixie.ch>
Message-Id: <1841CBE6-2C8D-40C8-91C2-9EB6C4E70A2C@apple.com>
To: Chris Prince <cprince@google.com>

On May 10, 2008, at 11:39 PM, Chris Prince wrote:

> On Sat, May 10, 2008 at 1:18 AM, Maciej Stachowiak <mjs@apple.com> wrote:
>> I'm not really clear on why Blobs must be distinct from ByteArrays.
>> The only explanation is: "The primary difference is that Blobs are
>> immutable*, and can therefore represent large objects." But I am not
>> sure why immutability is necessary to have the ability to represent
>> large objects. If you are thinking immutability is necessary to be
>> able to have large objects memory mapped from disk, then mmap with a
>> private copy-on-write mapping should solve that problem just fine.
> Making Blobs immutable simplifies a number of problems:
> (1) Asynchronous APIs.
> Large Blobs can be passed to XmlHttpRequest for an asynchronous POST,
> or to Database for an asynchronous INSERT.  If Blobs are mutable, the
> caller can modify the contents at any time.  The XmlHttpRequest or
> Database operation will be undefined.
> Careful callers could wait for the operation to finish (at least in
> these two examples; I'm not sure about all possible scenarios).  But
> this is starting to put quite a burden on developers.
> (2) HTML5 Workers.
> There are cases where apps will get a Blob on the UI thread, and then
> want to operate on it in a Worker.  Note that the Blob may be
> file-backed or memory-backed.
> Worker threads are isolated execution environments.  If Blobs are
> mutable, it seems like tricky (or impossible) gymnastics would be
> required to ensure one thread's file writes aren't seen by another
> thread's reads, unless you create a copy.  And that is doubly true for
> memory-backed blobs.
> (I'm not even considering older mobile operating systems, which may
> not have all the file and memory capabilities of modern OSes.)

Both of these problems can be addressed by having the APIs (including
the worker transfer mechanism) take a copy of the data. A copy-on-write
mechanism would avoid physically copying anything in the common case,
where nobody modifies the data afterwards.
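
As a rough illustration of the copy-on-write idea, here is a minimal
sketch in Python. It is not the ByteArray proposal or any browser API;
the names `CowBuffer`, `_Storage`, `copy`, `tobytes` and
`shares_storage_with` are hypothetical and exist only for this example.
A logical copy shares its storage with the original until one side
writes, at which point the writer detaches onto private storage.

```python
class _Storage:
    """Backing bytes plus a count of buffers currently sharing them."""

    def __init__(self, data):
        self.data = bytearray(data)  # always a private copy of the input
        self.owners = 1


class CowBuffer:
    """Hypothetical mutable byte buffer with copy-on-write semantics."""

    def __init__(self, data=b""):
        self._storage = _Storage(data)

    def copy(self):
        """Return a logical copy; no bytes are duplicated yet."""
        clone = CowBuffer.__new__(CowBuffer)
        clone._storage = self._storage
        self._storage.owners += 1
        return clone

    def __getitem__(self, index):
        return self._storage.data[index]

    def __setitem__(self, index, value):
        if self._storage.owners > 1:
            # Storage is shared: detach onto a private copy before writing.
            self._storage.owners -= 1
            self._storage = _Storage(self._storage.data)
        self._storage.data[index] = value

    def tobytes(self):
        return bytes(self._storage.data)

    def shares_storage_with(self, other):
        return self._storage is other._storage


# An asynchronous API would take its snapshot when the call is made.
payload = CowBuffer(b"hello")
snapshot = payload.copy()   # cheap: storage is still shared
payload[0] = ord("j")       # the caller mutates later; only now is data copied
```

In this sketch the snapshot still reads as `hello` after the caller's
write, and the physical copy happens only if a write actually occurs.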

It seems like immutability creates its own problems. If you have a  
large piece of binary data, say retrieved over the network from XHR,  
and the only way to change it is to make a copy, and you have multiple  
pieces of your code that want to change it, you are going to be  
allocating memory for many copies.
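
To make the cost concrete, here is a small sketch by analogy, using
Python's built-in immutable `bytes` and mutable `bytearray` rather than
Blob or ByteArray themselves. The helper names `patch_immutable` and
`patch_mutable` are hypothetical.

```python
def patch_immutable(data: bytes, offset: int, value: int) -> bytes:
    """Change one byte of immutable data by building a full-size new object."""
    return data[:offset] + bytes([value]) + data[offset + 1:]


def patch_mutable(data: bytearray, offset: int, value: int) -> None:
    """Change one byte in place; no new buffer is allocated."""
    data[offset] = value


original = bytes(1024 * 1024)               # 1 MiB of zeros, e.g. a download
step1 = patch_immutable(original, 0, 1)     # first piece of code: new 1 MiB object
step2 = patch_immutable(step1, 1, 2)        # second piece of code: another 1 MiB

shared = bytearray(1024 * 1024)
patch_mutable(shared, 0, 1)                 # both edits land in the same buffer
patch_mutable(shared, 1, 2)
```

Each single-byte change to the immutable data produces a separate
full-size object, while the mutable buffer is updated in place.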

(I should add that I also find the name "Blob" distasteful in an API,
but that is a minor point.)

I'm still not convinced that immutability is good, or that the  
ECMAScript ByteArray proposal can't handle the required use cases.

Received on Sunday, 11 May 2008 22:03:16 UTC
