Re: Blobs: An alternate (complementary?) binary data proposal (Was: File IO...)

On May 10, 2008, at 11:39 PM, Chris Prince wrote:

>
> On Sat, May 10, 2008 at 1:18 AM, Maciej Stachowiak <mjs@apple.com>  
> wrote:
>> I'm not really clear on why Blobs must be distinct from ByteArrays.
>> The only explanation is: "The primary difference is that Blobs are
>> immutable*, and can therefore represent large objects." But I am not
>> sure why immutability is necessary to have the ability to represent
>> large objects. If you are thinking immutability is necessary to be
>> able to have large objects memory mapped from disk, then mmap with a
>> private copy-on-write mapping should solve that problem just fine.
>
> Making Blobs immutable simplifies a number of problems:
>
> (1) Asynchronous APIs.
>
> Large Blobs can be passed to XmlHttpRequest for an asynchronous POST,
> or to Database for an asynchronous INSERT.  If Blobs are mutable, the
> caller can modify the contents at any time.  The XmlHttpRequest or
> Database operation will be undefined.
>
> Careful callers could wait for the operation to finish (at least in
> these two examples; I'm not sure about all possible scenarios).  But
> this is starting to put quite a burden on developers.
>
> (2) HTML5 Workers.
>
> There are cases where apps will get a Blob on the UI thread, and then
> want to operate on it in a Worker.  Note that the Blob may be
> file-backed or memory-backed.
>
> Worker threads are isolated execution environments.  If Blobs are
> mutable, it seems like tricky (or impossible) gymnastics would be
> required to ensure one thread's file writes aren't seen by another
> thread's reads, unless you create a copy.  And that is doubly true for
> memory-backed blobs.
>
> (I'm not even considering older mobile operating systems, which may
> not have all the file and memory capabilities of modern OSes.)

Both of these can be addressed by the APIs (including the worker  
transfer mechanism) making a copy, which can use a copy-on-write  
mechanism to avoid actually making a copy in the common case.
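
To make the copy-on-write idea concrete, here is a rough sketch. It is only an illustration: the `CowBytes` class and its methods are hypothetical names, not part of the Blob or ByteArray proposals, and a real implementation would live inside the browser rather than in script.

```typescript
// Hypothetical copy-on-write byte buffer (illustration only).
class CowBytes {
  // True while the underlying storage may also be referenced by another CowBytes.
  private shared: boolean;

  constructor(private data: Uint8Array, shared = false) {
    this.shared = shared;
  }

  // Logical copy, e.g. what an async API or worker transfer would take.
  // It is O(1): both objects point at the same storage until one of them writes.
  copy(): CowBytes {
    this.shared = true;
    return new CowBytes(this.data, true);
  }

  get length(): number {
    return this.data.length;
  }

  get(index: number): number {
    return this.data[index];
  }

  // The first write to shared storage pays for a real copy; later writes do not.
  // This is conservative: the other holder may also copy once on its first write.
  set(index: number, value: number): void {
    if (this.shared) {
      this.data = this.data.slice();
      this.shared = false;
    }
    this.data[index] = value;
  }
}

// The caller keeps mutating its buffer; the snapshot handed off earlier is unaffected.
const callerBytes = new CowBytes(new Uint8Array([1, 2, 3]));
const snapshot = callerBytes.copy();
callerBytes.set(0, 9);
```

If the caller never writes after handing off the snapshot, which I would expect to be the common case, no bytes are ever copied.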

It seems like immutability creates its own problems. If you have a  
large piece of binary data, say retrieved over the network from XHR,  
and the only way to change it is to make a copy, and you have multiple  
pieces of your code that want to change it, you are going to be  
allocating memory for many copies.
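
A small sketch of that cost, assuming the data is exposed as an immutable byte sequence and the only way to change it is to build a new one. The `withByte` helper, the buffer size, and the list of edits are all made up for illustration.

```typescript
// Hypothetical helper: "changing" immutable bytes means allocating a whole new buffer.
function withByte(src: Uint8Array, index: number, value: number): Uint8Array {
  const out = src.slice(); // allocates src.length bytes
  out[index] = value;
  return out;
}

// Stand-in for 1 MB of binary data retrieved over XHR.
const original = new Uint8Array(1024 * 1024);

// Three independent pieces of code each want to change a single byte.
const edits: Array<[number, number]> = [[0, 1], [10, 2], [20, 3]];

let current: Uint8Array = original;
let allocated = 0;
for (const [index, value] of edits) {
  current = withByte(current, index, value);
  allocated += current.length; // one full-size allocation per one-byte edit
}
```

Three one-byte changes end up allocating three full copies of the data, where a mutable buffer would have allocated none.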

(I should add that I also find the name "Blob" distasteful in an API,  
but that is a minor point.)

I'm still not convinced that immutability is good, or that the  
ECMAScript ByteArray proposal can't handle the required use cases.

Regards,
Maciej

Received on Sunday, 11 May 2008 22:03:16 UTC