Re: [FileAPI] BlobBuilder.getBlob should clear the BlobBuilder from Jonas Sicking on 2011-04-13 (public-webapps@w3.org from April to June 2011)

From: Jonas Sicking <jonas@sicking.cc>
Date: Tue, 12 Apr 2011 23:46:20 -0700
To: Eric Uhrhane <ericu@google.com>
Cc: Kyle Huey <me@kylehuey.com>, Web Applications Working Group WG <public-webapps@w3.org>
Message-ID: <BANLkTiko7s+tZCEUrKih0801TAswO3j2pg@mail.gmail.com>

On Tue, Apr 12, 2011 at 5:33 PM, Eric Uhrhane <ericu@google.com> wrote:
> On Tue, Apr 12, 2011 at 3:38 PM, Kyle Huey <me@kylehuey.com> wrote:
>> Hello All,
>>
>> In the current FileAPI Writer spec a BlobBuilder can be used to build a
>> series of blobs like so:
>>
>>   var bb = new BlobBuilder();
>>   bb.append("foo");
>>   var foo = bb.getBlob();
>>   bb.append("bar");
>>   var bar = bb.getBlob();
>>   foo.size; // == 3
>>   bar.size; // == 6
>>
>> My concern with this pattern is that it seems that one of the primary use
>> cases is to keep a BlobBuilder around for a while to build up a blob over
>> time.  A BlobBuilder left around could potentially entrain large amounts of
>> memory.  I propose that BlobBuilder.getBlob() "clears" the BlobBuilder,
>> returning it to an empty state.  The current behavior also doesn't seem
>> terribly useful to me (though I'm happy to be convinced otherwise) and can be
>> easily replicated on top of the proposed behavior (by immediately re-appending
>> the Blob that was just retrieved).
>>
>> Thoughts/comments?
>>
>> - Kyle
>
> If you don't have a use for the current behavior, you can always just
> drop the BlobBuilder as soon as you're done with it, and it'll get
> collected.  I think that's simpler and more intuitive than having it
> clear itself, which is a surprise in an operation that looks
> read-only.  In the other case, where you actually want the append
> behavior, it's faster and simpler not to have to re-append a blob
> you've just pulled out of it.

The problem is that this optimizes for the rare case when you're
creating several blobs which are prefixes of each other.
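
For illustration, the prefix case can still be written on top of the
proposed clearing behavior by re-appending the blob that was just
retrieved. This is only a sketch: ClearingBuilder is a hypothetical
stand-in, not the real BlobBuilder, and blobs are modelled as plain
strings so that the example is self-contained.

```javascript
// Hypothetical stand-in for BlobBuilder with the proposed semantics.
// "Blobs" are modelled as plain strings (an assumption for this sketch).
class ClearingBuilder {
  constructor() {
    this.parts = [];
  }

  append(data) {
    this.parts.push(String(data));
  }

  // Proposed behavior: return the accumulated data and reset to empty.
  getBlob() {
    const blob = this.parts.join("");
    this.parts = [];
    return blob;
  }
}

const bb = new ClearingBuilder();
bb.append("foo");
const foo = bb.getBlob();   // the builder is now empty
bb.append(foo);             // re-append to emulate the current behavior
bb.append("bar");
const bar = bb.getBlob();
console.log(foo.length, bar.length); // 3 6
```

The common case (build one blob, then stop) needs no extra step, and
only the rare prefix case pays for the additional append.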

It's not at all rare for pages to inadvertently hold on to objects
longer than they need to. This bogs down both the user's machine and
the webpage. Yes, pages can "fix" this by dropping all the references to
an object and waiting for GC, but it's an all too common mistake not to
do this.

If we think that people will use BlobBuilder to create large blobs,
then it's better to have an explicit API for dropping that memory rather
than relying on GC. Here we additionally have the advantage that we
wouldn't risk people forgetting to use the explicit API, since the call
that extracts the Blob is the same call that drops the data.

Another advantage of dropping the memory automatically is that you
don't need to copy any data into the Blob. Instead you can just make
the Blob take ownership of whatever memory buffers you've built up
during the various calls to .append. You could technically implement
some sort of copy-on-write scheme, but that introduces complexity.
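
A rough sketch of what such an ownership transfer might look like,
assuming the builder keeps its data as a list of byte buffers.
SketchBuilder, SketchBlob and takeBlob are hypothetical names made up
for this example, not the spec'd interfaces or any browser's
implementation.

```javascript
// Hypothetical model of a blob that adopts the builder's buffers.
class SketchBlob {
  constructor(chunks) {
    this.chunks = chunks; // takes ownership of the array; no bytes are copied
    this.size = chunks.reduce((total, c) => total + c.byteLength, 0);
  }
}

// Hypothetical builder whose extraction call hands over its buffers.
class SketchBuilder {
  constructor() {
    this.chunks = [];
  }

  append(text) {
    // Encode each appended string into its own byte buffer.
    this.chunks.push(new TextEncoder().encode(text));
  }

  takeBlob() {
    const blob = new SketchBlob(this.chunks); // hand over the buffers
    this.chunks = [];                         // the builder drops its reference
    return blob;
  }
}

const builder = new SketchBuilder();
builder.append("foo");
builder.append("bar");
const firstChunk = builder.chunks[0];
const blob = builder.takeBlob();
console.log(blob.size);                     // 6
console.log(blob.chunks[0] === firstChunk); // true
console.log(builder.chunks.length);         // 0
```

Because the builder gives up its only reference, the blob's buffers
cannot be changed by later appends, so no copy-on-write bookkeeping is
needed in this model.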

Flip it around: what is the argument for keeping the memory owned by
the BlobBuilder? If it's just that the name looks read-only, I'd be
fine with renaming the extraction function to something else.

/ Jonas

Received on Wednesday, 13 April 2011 06:47:20 UTC