Re: Lifetime of Blob URL from Michael Nordman on 2010-07-30 (public-webapps@w3.org from July to September 2010)

From: Michael Nordman <michaeln@google.com>
Date: Fri, 30 Jul 2010 12:01:12 -0700
To: Jonas Sicking <jonas@sicking.cc>
Cc: Dmitry Titov <dimich@chromium.org>, David Levin <levin@google.com>, Adrian Bateman <adrianba@microsoft.com>, Darin Fisher <darin@chromium.org>, "arun@mozilla.com" <arun@mozilla.com>, Web Applications Working Group WG <public-webapps@w3.org>
Message-ID: <AANLkTi=6iHfHR5O1L92OKvUZT-ufgw+Qzxso=sDEmEsk@mail.gmail.com>
On Thu, Jul 29, 2010 at 4:33 PM, Jonas Sicking <jonas@sicking.cc> wrote:

> Sorry about the slow response. I'm currently at blackhat, so my
> internet connectivity is somewhat... unreliable, so generally having
> to try to stay off the webs :)
>
> On Tue, Jul 27, 2010 at 1:16 PM, Dmitry Titov <dimich@chromium.org> wrote:
> > Thanks Jonas,
> > Just to clarify some details we had while discussing this, could you
> confirm
> > if this matches with your thinking (or not):
> > 1. If blob was created in window1, blob.url was queried, then passed (as
> JS
> > object) to window2, and window1 was closed - then the url gets
> invalidated
> > when window1 is closed, but immediately re-validated if window2 queries
> > blob.url. The url string is going to be the same, only there will be a
> time
> > interval between closing window1 and querying blob.url in window2, during
> > which loading from the url returns 404.
>
> Actually, it might make sense to make blob.url, when queried by
> window2, return a different string. This makes things somewhat more
> consistent as to when a URL is working and when not.
>

Now suppose window2 queries the .url attribute before window1 is closed? I
think most people would expect the same value as returned in window1 (yes?).
Having the same or different value depending on whether the attribute was
queried before or after another window was closed seems confusing. I think
having the .url remain consistent from frame to frame/window to window could
help with debugging.

Without fully understanding the gap between blob and .url lifetime, some
folks are going to be mystified by when/why a url value stops working, or
why the .url value is sometimes different than it was before. Accessing that
attribute in a particular frame has some pretty well hidden side effects.
These subtle oddities with the .url attribute are in part
what originally motivated the proposal to make it more explicit.

We're trying to make blob.url easy and natural feeling, but with too many
caveats, will it be?

I guess that's a long-winded vote for resurrecting the same url value used
in window1 in the particular case Dmitry raised.



>
> I.e. you're less likely to end up covering up a bug due to a URL
> coming back to life because another page started using a blob whose
> URL you were previously handed.
>
> It's a somewhat unlikely scenario so I don't feel very strongly either way.
>
> > 2. If blob is sent to a Worker via worker.postMessage(blob), the
> 'structured
> > clone' mechanism is used, so on the other side of postMessage a new blob
> > object is created, backed by the same data, but having its own unique
> > blob.url string (and separate lifetime).
>
> Yes.
>
> / Jonas
>
> > On Mon, Jul 26, 2010 at 2:12 PM, Jonas Sicking <jonas@sicking.cc> wrote:
> >>
> >> On Tue, Jul 13, 2010 at 7:37 AM, David Levin <levin@google.com> wrote:
> >> > On Tue, Jul 13, 2010 at 6:50 AM, Adrian Bateman <
> adrianba@microsoft.com>
> >> > wrote:
> >> >>
> >> >> On Monday, July 12, 2010 2:31 PM, Darin Fisher wrote:
> >> >> > On Mon, Jul 12, 2010 at 9:59 AM, David Levin <levin@google.com>
> >> >> > wrote:
> >> >> > On Mon, Jul 12, 2010 at 9:54 AM, Adrian Bateman
> >> >> > <adrianba@microsoft.com>
> >> >> > wrote:
> >> >> > I read point #5 to be only about surviving the start of a
> navigation.
> >> >> > As
> >> >> > a
> >> >> > web developer, how can I tell when a load has started for an <img>?
> >> >> > Isn't
> >> >> > this similarly indeterminate.
> >> >> >
> >> >> > As soon as img.src is set.
> >> >> >
> >> >> > "the spec could mention that the resource pointed by blob URL
> should
> >> >> > be
> >> >> > loaded successfully as long as the blob URL is valid at the time
> when
> >> >> > the
> >> >> > resource is starting to load."
> >> >> >
> >> >> > Should apply to xhr (after send is called), img, and navigation.
> >> >> >
> >> >> > Right, it seems reasonable to say that ownership of the resource
> >> >> > referenced
> >> >> > by a Blob can be shared by a XHR, Image, or navigation once it is
> >> >> > told
> >> >> > to
> >> >> > start loading the resource.
> >> >> >
> >> >> > -Darin
> >> >>
> >> >> It sounds like you are saying the following is guaranteed to work:
> >> >>
> >> >> img.src = blob.url;
> >> >> window.revokeBlobUrl(blob);
> >> >> return;
> >> >>
> >> >> If that is the case then the user agent is already making the
> >> >> guarantees
> >> >> I was talking about and so I still think having the lifetime mapped
> to
> >> >> the
> >> >> blob
> >> >> not the document is better. This means that in the general case I
> don't
> >> >> have
> >> >> to worry about lifetime management.
> >> >
> >> > Mapping lifetime to the blob exposes when the blob gets garbage
> >> > collected
> >> > which is a very indeterminate point in time (and is very browser
> version
> >> > dependent -- it will set you up for compatibility issues when you
> update
> >> > your javascript engine -- and there are also the cross browser issues
> of
> >> > course).
> >> > Specifically, a blob could go "out of scope" (to use your earlier
> >> > phrase)
> >> > and then one could do img.src = blobUrl (the url that was exposed from
> >> > the
> >> > blob but not using the blob object). This will work sometimes but not
> >> > others
> >> > (depending on whether garbage collection collected the blob).
> >> > This is much more indeterminate than the current spec which maps the
> >> > blob.url lifetime to the lifetime of the document where the blob was
> >> > created.
> >> > When thinking about blob.url lifetime, there are several problems to
> >> > solve:
> >> > 1. "An AJAX style web application may never navigate the document and
> >> > this
> >> > means that every blob for which a URL is created must be kept around
> in
> >> > some
> >> > form for the lifetime of the application."
> >> > 2. A blob passed between documents would have its blob.url stop
> >> > working
> >> > as soon as the original document got closed.
> >> > 3. Having a model that makes the url have a determinate lifetime which
> >> > doesn't expose the web developer to indeterminate behavior issues
> like
> >> > we
> >> > have discussed above.
> >> > The current spec has issues #1 and #2.
> >> > Binding the lifetime of blob.url to blob has issue #3.
> >>
> >> Indeed.
> >>
> >> I agree with others that have said that exposing GC behavior is a big
> >> problem. I think especially here where a very natural usage pattern is
> >> to grab a File object, extract its url, and then drop the reference to
> >> the File object on the floor.
> >>
> >> And I don't think specifying how GC is supposed to work is a workable
> >> solution. I doubt that any browser vendor will be willing to lock down
> >> their GC to that degree. GC implementation is a very active area of
> >> experimentation and has been for many many years. I see no reason to
> >> think that we'd be able to come up with a GC algorithm that wouldn't
> >> be obsolete very soon.
> >>
> >> However I also don't think #3 above is a huge problem. You can always
> >> flush a blob to disk, meaning that all that is leaked is an entry in a
> >> url->filename hash table. No actual data needs to be kept in memory.
> >> It's definitely still a problem, but I figured it's worth pointing
> >> out.
> >>
> >> Given that, I see no other significantly different solution than what
> >> is in the spec right now. Though there are definitely some problems
> >> that we should fix:
> >>
> >> 1. Adding a function for "destroying" a url reference seems like a good
> idea.
> >> 2. #2 above can be specced away. You simply need to specify that any
> >> context that calls blob.url extends the lifetime such that the url
> >> isn't automatically destroyed until all contexts that requested it are
> >> destroyed.
> >> 3. We should define that worker scopes can also extract blob urls.
> >>
> >> However this leaves deciding on what syntax to use for creating and
> >> destroying URLs. The current method of obtaining a url is:
> >>
> >> x = myfile.url;
> >> we could simply add
> >> myfile.killUrl();
> >>
> >> which kills the url that was previously returned from the file.
> >> However this requires that people hold on to the Blob object and so
> >> seems like a suboptimal solution. We could also do
> >>
> >> x = myfile.url;
> >> we could simply add
> >> window.destroyBlobUrl(x);
> >>
> >> However this keeps the creator and destructor functions far from each
> >> other, which IMHO isn't very nice.
> >>
> >> It has also been suggested that we change the syntax for obtaining urls
> >> to:
> >>
> >> x = window.createBlobUrl(myfile);
> >> and
> >> window.destroyBlobUrl(x);
> >>
> >> however the myfile.url syntax feels really nice and would be
> >> unfortunate to lose. Instead I propose the following syntax:
> >>
> >> x = myfile.url;
> >> and
> >> Blob.destroyUrl(x);
> >> File.destroyUrl(x);
> >>
> >> ECMAScript already puts functions on constructor objects, so we'd not
> >> be inventing anything new here. For example array1.concat(array2) is
> >> equivalent to Array.concat(array1, array2).
> >>
> >> This is what I propose we use. I'm definitely interested to hear what
> >> other people think though.
> >>
> >> / Jonas
> >>
> >
> >
>
>
Received on Friday, 30 July 2010 19:01:45 UTC