Lifetime of Blob URL from Jonas Sicking on 2010-07-29 (public-webapps@w3.org from July to September 2010)

From: Jonas Sicking <jonas@sicking.cc>
Date: Thu, 29 Jul 2010 16:33:51 -0700
To: Dmitry Titov <dimich@chromium.org>
Cc: David Levin <levin@google.com>, Adrian Bateman <adrianba@microsoft.com>, Darin Fisher <darin@chromium.org>, "arun@mozilla.com" <arun@mozilla.com>, Web Applications Working Group WG <public-webapps@w3.org>
Message-ID: <AANLkTikQvKaPODn5OkizRBaOL2mDSWjLazXqG=CstbiY@mail.gmail.com>

Sorry about the slow response. I'm currently at Black Hat, so my
internet connectivity is somewhat... unreliable, and I'm generally
having to stay off the webs :)

On Tue, Jul 27, 2010 at 1:16 PM, Dmitry Titov <dimich@chromium.org> wrote:
> Thanks Jonas,
> Just to clarify some details we had while discussing this, could you confirm
> if this matches with your thinking (or not):
> 1. If blob was created in window1, blob.url was queried, then passed (as JS
> object) to window2, and window1 was closed - then the url gets invalidated
> when window1 is closed, but immediately re-validated if window2 queries
> blob.url. The url string is going to be the same, only there will be a time
> interval between closing window1 and querying blob.url in window2, during
> which loading from the url returns 404.

Actually, it might make sense to make blob.url, when queried by
window2, return a different string. This makes things somewhat more
consistent as to when a URL is working and when not.

I.e. you're less likely to end up covering up a bug due to a URL
coming back to life because another page started using a blob whose
URL you were previously handed.

It's a somewhat unlikely scenario so I don't feel very strongly either way.
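
A minimal sketch of the "different string per window" option described above. It is a model only, not a browser API: `UrlTable`, `urlFor`, `closeContext` and `resolve` are hypothetical names, and the `blob:sketch-N` strings are made up for illustration.

```javascript
// Hypothetical model, not a browser API: every name here is illustrative.
let nextId = 0;

class UrlTable {
  constructor() {
    this.live = new Map(); // url string -> { blob, context }
  }

  // Models querying blob.url from a given window:
  // each context gets its own string for the same blob.
  urlFor(blob, context) {
    for (const [url, entry] of this.live) {
      if (entry.blob === blob && entry.context === context) return url;
    }
    const url = `blob:sketch-${nextId++}`;
    this.live.set(url, { blob, context });
    return url;
  }

  // Models closing a window: only URLs minted for that context die.
  closeContext(context) {
    for (const [url, entry] of this.live) {
      if (entry.context === context) this.live.delete(url);
    }
  }

  // Models a load: returns the blob, or null for the "404" case.
  resolve(url) {
    const entry = this.live.get(url);
    return entry ? entry.blob : null;
  }
}

const table = new UrlTable();
const blob = { data: "hello" };

const url1 = table.urlFor(blob, "window1");
table.closeContext("window1");              // window1 is closed
const url2 = table.urlFor(blob, "window2"); // window2 queries blob.url
```

In this model the string handed out to window1 never comes back to life: `table.resolve(url1)` stays null after window1 closes, while `url2` is a fresh string that resolves to the same blob.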

> 2. If blob is sent to a Worker via worker.postMessage(blob), the 'structured
> clone' mechanism is used, so on the other side of postMessage a new blob
> object is created, backed by the same data, but having its own unique
> blob.url string (and separate lifetime).

Yes.
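
The behaviour confirmed above can be modelled roughly as follows. This is a sketch under assumptions: `ModelBlob` and `cloneForWorker` are hypothetical stand-ins for the blob object and for what structured clone does to it during postMessage, and the url strings are invented.

```javascript
// Hypothetical model of a blob handle; not the File API itself.
let counter = 0;

class ModelBlob {
  constructor(bytes) {
    this.bytes = bytes;                    // backing data, shared between clones
    this.url = `blob:sketch-${counter++}`; // unique per blob object
  }
}

// Stand-in for structured clone of a blob across postMessage:
// a new object on the receiving side, same backing data, its own url.
function cloneForWorker(blob) {
  return new ModelBlob(blob.bytes);
}

const original = new ModelBlob(new Uint8Array([1, 2, 3]));
const received = cloneForWorker(original);
```

Here `received` is a distinct object with a distinct `url`, but `received.bytes` is the very same buffer as `original.bytes`, so no data is copied.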

/ Jonas

> On Mon, Jul 26, 2010 at 2:12 PM, Jonas Sicking <jonas@sicking.cc> wrote:
>>
>> On Tue, Jul 13, 2010 at 7:37 AM, David Levin <levin@google.com> wrote:
>> > On Tue, Jul 13, 2010 at 6:50 AM, Adrian Bateman <adrianba@microsoft.com>
>> > wrote:
>> >>
>> >> On Monday, July 12, 2010 2:31 PM, Darin Fisher wrote:
>> >> > On Mon, Jul 12, 2010 at 9:59 AM, David Levin <levin@google.com>
>> >> > wrote:
>> >> > On Mon, Jul 12, 2010 at 9:54 AM, Adrian Bateman
>> >> > <adrianba@microsoft.com>
>> >> > wrote:
>> >> > I read point #5 to be only about surviving the start of a navigation.
>> >> > As a web developer, how can I tell when a load has started for an
>> >> > <img>? Isn't this similarly indeterminate?
>> >> >
>> >> > As soon as img.src is set.
>> >> >
>> >> > "the spec could mention that the resource pointed by blob URL should
>> >> > be loaded successfully as long as the blob URL is valid at the time
>> >> > when the resource is starting to load."
>> >> >
>> >> > Should apply to xhr (after send is called), img, and navigation.
>> >> >
>> >> > Right, it seems reasonable to say that ownership of the resource
>> >> > referenced by a Blob can be shared by an XHR, Image, or navigation
>> >> > once it is told to start loading the resource.
>> >> >
>> >> > -Darin
>> >>
>> >> It sounds like you are saying the following is guaranteed to work:
>> >>
>> >> img.src = blob.url;
>> >> window.revokeBlobUrl(blob);
>> >> return;
>> >>
>> >> If that is the case then the user agent is already making the
>> >> guarantees I was talking about and so I still think having the
>> >> lifetime mapped to the blob not the document is better. This means
>> >> that in the general case I don't have to worry about lifetime
>> >> management.
>> >
>> > Mapping lifetime to the blob exposes when the blob gets garbage
>> > collected which is a very indeterminate point in time (and is very
>> > browser version dependent -- it will set you up for compatibility
>> > issues when you update your javascript engine -- and there are also
>> > the cross browser issues of course).
>> > Specifically, a blob could go "out of scope" (to use your earlier
>> > phrase) and then one could do img.src = blobUrl (the url that was
>> > exposed from the blob but not using the blob object). This will work
>> > sometimes but not others (depending on whether garbage collection
>> > collected the blob).
>> > This is much more indeterminate than the current spec which maps the
>> > blob.url lifetime to the lifetime of the document where the blob was
>> > created.
>> > When thinking about blob.url lifetime, there are several problems to
>> > solve:
>> > 1. "An AJAX style web application may never navigate the document and
>> > this means that every blob for which a URL is created must be kept
>> > around in some form for the lifetime of the application."
>> > 2. A blob passed between documents would have its blob.url stop
>> > working as soon as the original document got closed.
>> > 3. Having a model that makes the url have a determinate lifetime which
>> > doesn't expose the web developer to indeterminate behavior issues like
>> > we have discussed above.
>> > The current spec has issues #1 and #2.
>> > Binding the lifetime of blob.url to blob has issue #3.
>>
>> Indeed.
>>
>> I agree with others that have said that exposing GC behavior is a big
>> problem. I think especially here where a very natural usage pattern is
>> to grab a File object, extract its url, and then drop the reference to
>> the File object on the floor.
>>
>> And I don't think specifying how GC is supposed to work is a workable
>> solution. I doubt that any browser vendor will be willing to lock down
>> their GC to that degree. GC implementation is a very active area of
>> experimentation and has been for many many years. I see no reason to
>> think that we'd be able to come up with a GC algorithm that wouldn't
>> be obsolete very soon.
>>
>> However I also don't think #1 above is a huge problem. You can always
>> flush a blob to disk, meaning that all that is leaked is an entry in a
>> url->filename hash table. No actual data needs to be kept in memory.
>> It's definitely still a problem, but I figured it's worth pointing
>> out.
>>
>> Given that, I see no other significantly different solution than what
>> is in the spec right now. Though there are definitely some problems
>> that we should fix:
>>
>> 1. Adding a function for "destroying" a url reference seems like a good idea.
>> 2. #2 above can be specced away. You simply need to specify that any
>> context that calls blob.url extends the lifetime such that the url
>> isn't automatically destroyed until all contexts that requested it are
>> destroyed.
>> 3. We should define that worker scopes can also extract blob urls.
>>
>> However this leaves deciding on what syntax to use for creating and
>> destroying URLs. The current method of obtaining a url is:
>>
>> x = myfile.url;
>> we could simply add
>> myfile.killUrl();
>>
>> which kills the url that was previously returned from the file.
>> However this requires that people hold on to the Blob object and so
>> seems like a suboptimal solution. We could also do
>>
>> x = myfile.url;
>> and
>> window.destroyBlobUrl(x);
>>
>> However this keeps the creator and destructor functions far from each
>> other, which IMHO isn't very nice.
>>
>> It has also been suggested that we change the syntax for obtaining urls
>> to:
>>
>> x = window.createBlobUrl(myfile);
>> and
>> window.destroyBlobUrl(x);
>>
>> however the myfile.url syntax feels really nice and it would be
>> unfortunate to lose. Instead I propose the following syntax:
>>
>> x = myfile.url;
>> and
>> Blob.destroyUrl(x);
>> File.destroyUrl(x);
>>
>> ECMAScript already puts functions on constructor objects
>> (String.fromCharCode, for example), so we'd not be inventing anything
>> new here. Similarly, in Firefox array1.concat(array2) is equivalent to
>> the generic Array.concat(array1, array2).
>>
>> This is what I propose we use. I'm definitely interested to hear what
>> other people think though.
>>
>> / Jonas
>>
>
>
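
For illustration, the quoted proposal (a URL that stays valid until every context that requested it is gone, plus an explicit destructor on the constructor object) might be sketched like this. It is a model under assumptions, not the spec: `RefCountedBlob`, `urlIn`, `contextDestroyed` and `isValid` are hypothetical names, and `urlIn(context)` stands in for reading `blob.url` from a given window or worker.

```javascript
// Hypothetical sketch; RefCountedBlob and its methods are illustrative only.
class RefCountedBlob {
  static registry = new Map(); // url -> Set of contexts keeping it alive
  static nextId = 0;

  constructor(data) {
    this.data = data;
    this.urlString = `blob:sketch-${RefCountedBlob.nextId++}`;
  }

  // Models a context reading blob.url: that context now extends the lifetime.
  urlIn(context) {
    if (!RefCountedBlob.registry.has(this.urlString)) {
      RefCountedBlob.registry.set(this.urlString, new Set());
    }
    RefCountedBlob.registry.get(this.urlString).add(context);
    return this.urlString;
  }

  // Models a window or worker going away: the url dies only when
  // no requesting context is left.
  static contextDestroyed(context) {
    for (const [url, holders] of RefCountedBlob.registry) {
      holders.delete(context);
      if (holders.size === 0) RefCountedBlob.registry.delete(url);
    }
  }

  // Models the proposed Blob.destroyUrl(x): needs only the string,
  // so the caller does not have to hold on to the blob object.
  static destroyUrl(url) {
    RefCountedBlob.registry.delete(url);
  }

  static isValid(url) {
    return RefCountedBlob.registry.has(url);
  }
}

const file = new RefCountedBlob("payload");
const x = file.urlIn("window1");
file.urlIn("window2");

RefCountedBlob.contextDestroyed("window1");
const validAfterWindow1Closed = RefCountedBlob.isValid(x);

RefCountedBlob.destroyUrl(x);
const validAfterDestroy = RefCountedBlob.isValid(x);
```

In this model the url survives window1 closing because window2 also requested it, and the explicit `destroyUrl` call ends its life regardless of which contexts remain.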
Received on Thursday, 29 July 2010 23:34:27 UTC