Re: Blob URLs | autoRevoke, defaults, and resolutions from Eric U on 2013-05-02 (public-webapps@w3.org from April to June 2013)

From: Eric U <ericu@google.com>
Date: Wed, 1 May 2013 17:01:11 -0700
To: Jonas Sicking <jonas@sicking.cc>
Cc: Arun Ranganathan <arun@mozilla.com>, Web Applications Working Group WG <public-webapps@w3.org>
Message-ID: <CAHvSExdr+5prObK4=KRfw26=oHw0NyGWbpaMy1Moh+mTLQ1A8A@mail.gmail.com>
On Wed, May 1, 2013 at 4:53 PM, Jonas Sicking <jonas@sicking.cc> wrote:
> On Wed, May 1, 2013 at 4:25 PM, Eric U <ericu@google.com> wrote:
>> On Wed, May 1, 2013 at 3:36 PM, Arun Ranganathan <arun@mozilla.com> wrote:
>>> Switching the default to "false" would enable IE, Chrome, and Firefox to have interoperability with URL.createObjectURL(blobArg), though such a default places burdens on web developers to couple create* calls with revoke* calls to not leak Blobs.  Jonas proposes a separate method, URL.createAutoRevokeObjectURL, which creates an autoRevoke URL.  I'm lukewarm on that :-\
>>
>> I'd support a new method with a different default, if we could figure
>> out a reasonable thing for that new method to do.
>
> Yeah, the if-condition here is quite important.
>
> But if we can figure out this problem, then my proposal would be to
> add a new method which has a "nicer" name than createObjectURL as to
> encourage authors to use that and have fewer leaks.

Heh; I wasn't even going to mention the name.
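
For concreteness, the create*/revoke* coupling that Arun describes as a
burden might look roughly like the sketch below. The helper name
withObjectURL and its `api` parameter are made up for illustration; only
URL.createObjectURL and URL.revokeObjectURL are existing calls. It also
assumes the caller can tell when every load of the URL has finished, which
is exactly the part that is hard in practice.

```javascript
// Hypothetical helper: `withObjectURL` and the `api` parameter are
// illustrative names, not part of any spec or proposal in this thread.
async function withObjectURL(blob, use, api = URL) {
  const url = api.createObjectURL(blob);
  try {
    // `use` must not resolve until every load of `url` has finished;
    // otherwise the revoke below races with the load.
    return await use(url);
  } finally {
    // Revoke on success and on failure so the Blob is not leaked.
    api.revokeObjectURL(url);
  }
}

// Possible browser usage (assumes an <img> element `img` and a Blob `fileBlob`):
// await withObjectURL(fileBlob, (url) => new Promise((resolve, reject) => {
//   img.onload = resolve;
//   img.onerror = reject;
//   img.src = url;
// }));
```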

>>> 2. Regardless of the default, there's the hard question of what to do with Blob URL revocation.  Glenn / zewt points out that this applies, though perhaps less dramatically, to *manually* revoked Blob URLs, and provides some test cases [3].
>>>
>>> Options are:
>>>
>>> 2a. To meticulously special-case Blob URLs, per Bug 17765 [4].  This calls for a synchronous step attached to wherever URLs are used to "peg" Blob URL data at fetch, so that the chance of a concurrent revocation doesn't cause things to behave unpredictably.  Firefox does a variation of this with keeping channels open, but solving this bug interoperably is going to be very hard, and has to be done in different places across the platform.  And even within CSS.  This is hard to move forward with.
>>
>> Hard.
>
> It actually has turned out to be surprisingly easy in Gecko. But I
> realize the same might not be true everywhere.

Right, and defining just when it happens, across browsers, may also be hard.

>>> 2b. To adopt an 80-20 rule, and only specify what happens for some cases that seem common, but expressly disallow other cases.  This might be a more muted version of Bug 17765, especially if it can't be done within fetch [5].
>>
>> Ugly.
>>
>>> This could mean that the "blob" clause for "basic fetch"[5] only defines some cases where a synchronous fetch can be run (TBD) but expressly disallows others where synchronous fetching is not feasible.  This would limit the use of Blob URLs pretty drastically, but might be the only solution.  For instance, asynchronous calls accompanying <embed>, "defer" etc. might have to be expressly disallowed.  It would be great if we do this in fetch [5] :-)
>>
>> Just to be clear, this would limit the use of *autoRevoke* Blob URLs,
>> not all Blob URLs, yes?
>
> No, it would limit the use of all *revokable* Blob URLs. Since you get
> exactly the same issues when the page calls revokeObjectURL manually.
> So that means that it applies to all Blob URLs.

Ah, right; all revoked Blob URLs.
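
To illustrate why manual revocation hits the same problem, here is a toy
model of the race. It is not any browser's implementation: ToyBlobUrlStore,
loadPegged, loadDeferred and the explicit queue are all invented for this
sketch. "Pegged" stands in for the 2a idea of resolving the Blob
synchronously at the point the URL is used; "deferred" stands in for an
implementation that only looks the URL up when the load actually starts.

```javascript
// Toy model only: every name here is hypothetical.
class ToyBlobUrlStore {
  constructor() {
    this.entries = new Map();
    this.nextId = 0;
    this.queue = []; // stands in for asynchronous load tasks
  }
  create(data) {
    const url = `blob:toy/${this.nextId++}`;
    this.entries.set(url, data);
    return url;
  }
  revoke(url) {
    this.entries.delete(url);
  }
  // 2a-style: look the data up now, deliver the result later.
  loadPegged(url, onDone) {
    const data = this.entries.get(url);
    this.queue.push(() => onDone(data !== undefined ? "ok" : "failed"));
  }
  // Look-up happens only when the queued load runs.
  loadDeferred(url, onDone) {
    this.queue.push(() => onDone(this.entries.has(url) ? "ok" : "failed"));
  }
  runQueuedLoads() {
    while (this.queue.length > 0) this.queue.shift()();
  }
}

function manualRevokeDemo() {
  const store = new ToyBlobUrlStore();
  const url = store.create("image bytes");
  const outcome = {};
  store.loadPegged(url, (status) => { outcome.pegged = status; });
  store.loadDeferred(url, (status) => { outcome.deferred = status; });
  store.revoke(url);      // page calls revoke right after starting the loads
  store.runQueuedLoads(); // loads only get to run afterwards
  return outcome;         // { pegged: "ok", deferred: "failed" }
}
```

In this model the pegged load survives the revoke and the deferred one does
not, with no autoRevoke involved at all.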

>>> 2c. Re-use oneTimeOnly as in IE's behavior for autoRevoke (but call it autoRevoke).  But we jettisoned this for race conditions e.g.
>>>
>>> // This is in IE only
>>>
>>> img2.src = URL.createObjectURL(fileBlob, {oneTimeOnly: true});
>>>
>>> // race now! then fail in IE only
>>> img1.src = img2.src;
>>>
>>> will fail in IE with oneTimeOnly.  It appears to fail reliably, but again, "dereference URL" may not be interoperable here.  This is probably not what we should do, but it was worth listing, since it carries the brute force of a shipping implementation, and shows how some % of the market has actively solved this problem :)
>>
>> I'm not really sure this is so bad.  I know it's the case I brought
>> up, and I must admit that I disliked the "oneTimeOnly" when I first
>> heard about it, but all other proposals [including not having
>> automatic revocation at all] now seem worse.  Here you've set
>> something to be oneTimeOnly and used it twice; if that fails in IE,
>> that's correct.  If it works some of the time in other browsers [after
>> they implement oneTimeOnly], that's not good, but you did pretty much
>> aim at your own foot.  Developers that actively try to do the right
>> thing will have consistent good results without extra code, at least.
>> I realize that img1.src = img2.src failing is odd, but as [IIRC]
>> Adrian pointed out, if it's an uncacheable image on a server that's
>> gone away, couldn't that already happen, depending on your network
>> stack implementation?
>
> I'm more worried that if implementations don't initiate the load
> synchronously, which is hard per your comment above, then it can
> easily be random which of the two loads succeeds and which fails. If
> the revoking happens at the end of the load, both loads could even
> succeed depending on timing and implementation details.

Yup; I'm just saying that if you get a failure here, you shouldn't be
surprised, no matter which img gets it.  You did something explicitly
wrong.  Ideally we'd give predictable behavior, but if we can't do
that, we should at least give good behavior to good coders.
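
A rough sketch of what I mean, again as a toy model rather than IE's actual
behavior: ToyOneTimeStore and loadInOrder are invented names, and the model
assumes the URL is consumed at the first dereference. The order array stands
in for whatever order the implementation happens to start the loads in.

```javascript
// Toy model only: every name here is hypothetical.
class ToyOneTimeStore {
  constructor() {
    this.entries = new Map();
    this.nextId = 0;
  }
  createOneTime(data) {
    const url = `blob:toy/${this.nextId++}`;
    this.entries.set(url, data);
    return url;
  }
  // The first dereference consumes the entry; later ones fail.
  dereference(url) {
    const found = this.entries.has(url);
    this.entries.delete(url);
    return found ? "ok" : "failed";
  }
}

// `order` lists which element's load reaches the store first.
function loadInOrder(order) {
  const store = new ToyOneTimeStore();
  const url = store.createOneTime("image bytes");
  const result = {};
  for (const name of order) {
    result[name] = store.dereference(url);
  }
  return result;
}

// Single use, as intended:   loadInOrder(["img2"])         -> { img2: "ok" }
// Used twice, img2 first:    loadInOrder(["img2", "img1"]) -> { img2: "ok", img1: "failed" }
// Used twice, img1 first:    loadInOrder(["img1", "img2"]) -> { img1: "ok", img2: "failed" }
```

The single-use case is the same regardless of timing; only the code that
uses the URL twice sees results that depend on load order.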

Hmm...now Glenn points out another problem: if you /never/ load the
image, for whatever reason, you can still leak it.  How likely is that
in good code, though?  And is it worse than the current state in good
or bad code?
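
Glenn's case, in the same toy terms (countLiveEntries is an invented name,
and the model again assumes revocation happens only at first dereference):

```javascript
// Toy model only: counts entries still held after a one-time URL is
// created and then either used once or never used.
function countLiveEntries(useUrl) {
  const entries = new Map();
  const url = "blob:toy/0";
  entries.set(url, "image bytes");

  // Revocation is tied to the first dereference in this model.
  const dereferenceOnce = (u) => {
    const data = entries.get(u);
    entries.delete(u);
    return data;
  };

  if (useUrl) dereferenceOnce(url);
  return entries.size; // 0 if the URL was used, 1 if it never was
}
```

If the load never happens, nothing ever triggers the revoke, so the entry
stays alive just as it would with a manually managed URL that is never
revoked.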
Received on Thursday, 2 May 2013 00:01:53 UTC