Re: Lifetime of Blob URL from Dmitry Titov on 2010-07-27 (public-webapps@w3.org from July to September 2010)

From: Dmitry Titov <dimich@chromium.org>
Date: Tue, 27 Jul 2010 13:16:53 -0700
To: Jonas Sicking <jonas@sicking.cc>
Cc: David Levin <levin@google.com>, Adrian Bateman <adrianba@microsoft.com>, Darin Fisher <darin@chromium.org>, "arun@mozilla.com" <arun@mozilla.com>, Web Applications Working Group WG <public-webapps@w3.org>
Message-ID: <AANLkTi=H5cavJZfSe+1Yms3uQjqSPLaRsAmqLKXb1N4V@mail.gmail.com>
Thanks Jonas,

Just to clarify some details that came up while we were discussing this,
could you confirm whether the following matches your thinking:

1. If a blob was created in window1, blob.url was queried, the blob was then
passed (as a JS object) to window2, and window1 was closed, then the url is
invalidated when window1 closes but immediately re-validated if window2
queries blob.url. The url string stays the same; there is only a time
interval between closing window1 and querying blob.url in window2 during
which loading from the url returns 404.

2. If a blob is sent to a Worker via worker.postMessage(blob), the
'structured clone' mechanism is used, so on the other side of postMessage a
new blob object is created, backed by the same data but with its own unique
blob.url string (and a separate lifetime).
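
The two cases above can be modelled with a small in-memory simulation. This is only a sketch under stated assumptions: SketchBlob, BlobUrlRegistry, queryUrl, closeContext, load and structuredCloneBlob are hypothetical names invented for illustration, contexts are plain strings, and none of this is an actual browser API.

```javascript
// Hypothetical in-memory model of blob URL lifetimes; not a browser API.
let nextId = 0;

class SketchBlob {
  constructor(data) {
    this.data = data; // backing data, shared with clones
    this.urlString = `blob:sketch-${nextId++}`; // fixed for this blob object
  }
}

class BlobUrlRegistry {
  constructor() {
    this.holders = new Map(); // url -> { blob, contexts: Set of context names }
  }

  // Models a context reading blob.url: the context becomes a holder of the url.
  queryUrl(blob, context) {
    let entry = this.holders.get(blob.urlString);
    if (!entry) {
      entry = { blob, contexts: new Set() };
      this.holders.set(blob.urlString, entry);
    }
    entry.contexts.add(context);
    return blob.urlString;
  }

  // Models closing a window or worker: the url is dropped once no holder is left.
  closeContext(context) {
    for (const [url, entry] of this.holders) {
      entry.contexts.delete(context);
      if (entry.contexts.size === 0) this.holders.delete(url);
    }
  }

  // Models loading from a url: the data if the url is valid, otherwise a 404.
  load(url) {
    const entry = this.holders.get(url);
    return entry ? { status: 200, body: entry.blob.data } : { status: 404 };
  }
}

// Models structured clone: same data, new blob object, new url string.
function structuredCloneBlob(blob) {
  return new SketchBlob(blob.data);
}

// Case 1: window1 queries, closes, then window2 queries the same blob.
const registry = new BlobUrlRegistry();
const blob = new SketchBlob("hello");
const url1 = registry.queryUrl(blob, "window1");
registry.closeContext("window1");
const statusDuringGap = registry.load(url1).status;
const url2 = registry.queryUrl(blob, "window2");
const statusAfterRequery = registry.load(url2).status;

// Case 2: a clone sent to a worker gets its own url and lifetime.
const clone = structuredCloneBlob(blob);
const cloneUrl = registry.queryUrl(clone, "worker");
```

In this model the gap in case 1 disappears if window2 reads blob.url before window1 closes, since the url then always has at least one holder.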

Dmitry


On Mon, Jul 26, 2010 at 2:12 PM, Jonas Sicking <jonas@sicking.cc> wrote:

> On Tue, Jul 13, 2010 at 7:37 AM, David Levin <levin@google.com> wrote:
> > On Tue, Jul 13, 2010 at 6:50 AM, Adrian Bateman <adrianba@microsoft.com> wrote:
> >>
> >> On Monday, July 12, 2010 2:31 PM, Darin Fisher wrote:
> >> > On Mon, Jul 12, 2010 at 9:59 AM, David Levin <levin@google.com> wrote:
> >> > On Mon, Jul 12, 2010 at 9:54 AM, Adrian Bateman <adrianba@microsoft.com> wrote:
> >> > I read point #5 to be only about surviving the start of a navigation.
> >> > As a web developer, how can I tell when a load has started for an <img>?
> >> > Isn't this similarly indeterminate?
> >> >
> >> > As soon as img.src is set.
> >> >
> >> > "the spec could mention that the resource pointed by blob URL should be
> >> > loaded successfully as long as the blob URL is valid at the time when
> >> > the resource is starting to load."
> >> >
> >> > Should apply to xhr (after send is called), img, and navigation.
> >> >
> >> > Right, it seems reasonable to say that ownership of the resource
> >> > referenced by a Blob can be shared by an XHR, Image, or navigation once
> >> > it is told to start loading the resource.
> >> >
> >> > -Darin
> >>
> >> It sounds like you are saying the following is guaranteed to work:
> >>
> >> img.src = blob.url;
> >> window.revokeBlobUrl(blob);
> >> return;
> >>
> >> If that is the case then the user agent is already making the guarantees
> >> I was talking about and so I still think having the lifetime mapped to the
> >> blob, not the document, is better. This means that in the general case I
> >> don't have to worry about lifetime management.
> >
> > Mapping lifetime to the blob exposes when the blob gets garbage collected
> > which is a very indeterminate point in time (and is very browser version
> > dependent -- it will set you up for compatibility issues when you update
> > your javascript engine -- and there are also the cross browser issues of
> > course).
> > Specifically, a blob could go "out of scope" (to use your earlier phrase)
> > and then one could do img.src = blobUrl (the url that was exposed from the
> > blob but not using the blob object). This will work sometimes but not
> > others (depending on whether garbage collection collected the blob).
> > This is much more indeterminate than the current spec which maps the
> > blob.url lifetime to the lifetime of the document where the blob was
> > created.
> > When thinking about blob.url lifetime, there are several problems to solve:
> > 1. "An AJAX style web application may never navigate the document and this
> > means that every blob for which a URL is created must be kept around in
> > some form for the lifetime of the application."
> > 2. A blob passed between documents would have its blob.url stop working
> > as soon as the original document got closed.
> > 3. Having a model that makes the url have a determinate lifetime which
> > doesn't expose the web developer to indeterminate behavior issues like
> > those we have discussed above.
> > The current spec has issues #1 and #2.
> > Binding the lifetime of blob.url to blob has issue #3.
>
> Indeed.
>
> I agree with others that have said that exposing GC behavior is a big
> problem. I think especially here where a very natural usage pattern is
> to grab a File object, extract its url, and then drop the reference to
> the File object on the floor.
>
> And I don't think specifying how GC is supposed to work is a workable
> solution. I doubt that any browser vendor will be willing to lock down
> their GC to that degree. GC implementation is a very active area of
> experimentation and has been for many many years. I see no reason to
> think that we'd be able to come up with a GC algorithm that wouldn't
> be obsolete very soon.
>
> However I also don't think #1 above is a huge problem. You can always
> flush a blob to disk, meaning that all that is leaked is an entry in a
> url->filename hash table. No actual data needs to be kept in memory.
> It's definitely still a problem, but I figured it's worth pointing
> out.
>
> Given that, I see no other significantly different solution than what
> is in the spec right now. Though there are definitely some problems
> that we should fix:
>
> 1. Adding a function for "destroying" a url reference seems like a good idea.
> 2. #2 above can be specced away. You simply need to specify that any
> context that calls blob.url extends the lifetime such that the url
> isn't automatically destroyed until all contexts that requested it are
> destroyed.
> 3. We should define that worker scopes can also extract blob urls.
>
> However this leaves deciding on what syntax to use for creating and
> destroying URLs. The current method of obtaining a url is:
>
> x = myfile.url;
> we could simply add
> myfile.killUrl();
>
> which kills the url that was previously returned from the file.
> However this requires that people hold on to the Blob object and so
> seems like a suboptimal solution. We could also do
>
> x = myfile.url;
> and
> window.destroyBlobUrl(x);
>
> However this keeps the creator and destructor functions far from each
> other, which IMHO isn't very nice.
>
> It has also been suggested that we change the syntax for obtaining urls to:
>
> x = window.createBlobUrl(myfile);
> and
> window.destroyBlobUrl(x);
>
> however the myfile.url syntax feels really nice and would be
> unfortunate to lose. Instead I propose the following syntax:
>
> x = myfile.url;
> and
> Blob.destroyUrl(x);
> File.destroyUrl(x);
>
> ECMAScript already puts functions on constructor objects, so we'd not
> be inventing anything new here. For example array1.concat(array2) is
> equivalent to Array.concat(array1, array2).
>
> This is what I propose we use. I'm definitely interested to hear what
> other people think though.
>
> / Jonas
>
>
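
The syntax proposed at the end of the quoted message (reading myfile.url, then calling Blob.destroyUrl(x) or File.destroyUrl(x)) can be illustrated with a rough sketch. Everything here is an assumption made for illustration: SketchBlob and SketchFile are hypothetical stand-ins for Blob and File, and liveUrls and resolve are invented helpers. It only shows that a static function on the constructor needs nothing but the url string, and that a subclass inherits it.

```javascript
// Hypothetical stand-ins for Blob and File; the names and the module-level
// table are assumptions for illustration, not the File API itself.
const liveUrls = new Map(); // url string -> data it currently resolves to
let counter = 0;

class SketchBlob {
  constructor(data) {
    this.data = data;
    this.cachedUrl = null;
  }

  // Reading .url mints the url on first access, returns the same string
  // afterwards, and (re-)registers it in the table on every read.
  get url() {
    if (this.cachedUrl === null) {
      this.cachedUrl = `blob:sketch-${counter++}`;
    }
    liveUrls.set(this.cachedUrl, this.data);
    return this.cachedUrl;
  }

  // Static destructor: takes only the url string, so the caller does not
  // have to keep the blob object around. Returns true if the url was live.
  static destroyUrl(url) {
    return liveUrls.delete(url);
  }
}

// The subclass inherits the static, so SketchFile.destroyUrl(x) also works.
class SketchFile extends SketchBlob {
  constructor(data, name) {
    super(data);
    this.name = name;
  }
}

// Hypothetical loader: the data behind a live url, or null if it is gone.
function resolve(url) {
  return liveUrls.has(url) ? liveUrls.get(url) : null;
}

let myfile = new SketchFile("contents", "a.txt");
const x = myfile.url;
myfile = null; // drop the reference to the file object; keep only the string
const beforeDestroy = resolve(x);
const destroyed = SketchFile.destroyUrl(x);
const afterDestroy = resolve(x);
```

Because the destructor is keyed on the string alone, the pattern of extracting the url and dropping the File reference still leaves a way to release the url explicitly.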
Received on Tuesday, 27 July 2010 20:17:32 UTC