Re: File API "oneTimeOnly" is too poorly defined from Jonas Sicking on 2012-04-10 (public-webapps@w3.org from April to June 2012)

From: Jonas Sicking <jonas@sicking.cc>
Date: Tue, 10 Apr 2012 08:58:31 -0700
To: Feras Moussa <ferasm@microsoft.com>
Cc: Charles Pritchard <chuck@jumis.com>, Glenn Maynard <glenn@zewt.org>, public-webapps WG <public-webapps@w3.org>
Message-ID: <CA+c2ei8HSUPoqe_1TE-Lj6A1bA6V3=MHu+yPPLTWVnxahBh_eQ@mail.gmail.com>

On Mon, Apr 9, 2012 at 3:52 PM, Feras Moussa <ferasm@microsoft.com> wrote:
> We agree that the spec text should be updated to more clearly define what dereference means.
> When we were trying to solve this problem, we looked for a simple and consistent way that a developer can understand what dereferencing is.
> What we came up with was the following definition: revoking should happen at the first time that any of the bits of the BLOB are accessed.
>
> This is a simple concept for a developer to understand, and is not complex to spec or implement. This also helps avoid having to explicitly spec
> out in the File API spec the various edge cases that different APIs exhibit – such as XHR open/send versus imgtag.src load versus css link href
> versus when a URL gets resolved or not. Instead those behaviors will continue to be documented in their respective spec.
>
> The definition above would imply that some cases, such as a cross-site-of-origin request to a Blob URL do not revoke, but we think that is OK
> since it implies a developer error. If we use the above definition for dereferencing, then in the XHR example you provided, xhr.send would
> be responsible for revoking the URL.

Depending on how you define "accessing the bits", I think this risks
introducing quite a few race conditions, both within a single
implementation and across implementations.

For example, the following code:

url = URL.createObjectURL(blob, { oneTimeOnly: true });
myImg.src = url;
setTimeout(function() { myOtherImg.src = url }, 0);

Assuming that the blob is backed by an OS file, this will start reading
the bits from the blob as soon as the IO thread is free to read from
the requested file. When that happens depends on many other things,
such as what is happening in other tabs, what other actions the page
just performed, and how much of a backlog the IO thread currently has.

In Gecko things are even worse, since blobs can be backed by OS
files, by memory, or by a combination thereof. We plan to use various
optimizations for determining which backing type to use. For example,
if you get a blob from a WebSocket connection, it might depend on how
much data was downloaded before .binaryType was set to "blob", as well
as on how big the WebSocket frame is.

So if what you have is a memory-backed blob, then "accessing the bits"
will likely happen sooner, making it more likely that the above code
snippet will fail to load the second image.

I would expect other browsers to use other strategies for what backing
stores to use, introducing more uncertainty.
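
To make the timing dependence concrete, here is a toy simulation. It is
only a sketch under stated assumptions, not a description of how any
browser works: it assumes the URL is resolved when .src is assigned and
revoked when the first read of the bits starts. The names simulate,
setSrc and ioDelay are made up for illustration, and setTimeout(..., 0)
is modeled as firing at virtual time 1.

```javascript
// Toy model, not a real browser: the one-time URL is resolved when .src
// is assigned and revoked when the first read of the blob's bits starts.
// ioDelay (hypothetical) stands in for how busy the IO thread is.
function simulate(ioDelay) {
  const registry = new Map([["blob:demo", "image bytes"]]);
  const events = []; // virtual-time event queue
  let seq = 0;
  const schedule = (time, fn) => events.push({ time, seq: seq++, fn });
  const results = {};

  function setSrc(name, url, now) {
    // Assumption: the URL must still be registered when .src is assigned.
    if (!registry.has(url)) {
      results[name] = "failed";
      return;
    }
    // The read starts ioDelay ticks later; that first access revokes the URL.
    schedule(now + ioDelay, () => {
      registry.delete(url);
      results[name] = "loaded";
    });
  }

  setSrc("myImg", "blob:demo", 0);                         // myImg.src = url
  schedule(1, () => setSrc("myOtherImg", "blob:demo", 1)); // setTimeout(..., 0)

  // Run events in virtual-time order; ties run in scheduling order.
  while (events.length > 0) {
    events.sort((a, b) => a.time - b.time || a.seq - b.seq);
    events.shift().fn();
  }
  return results;
}

console.log(simulate(0)); // fast (memory-like) read: myOtherImg is "failed"
console.log(simulate(5)); // slow (file-like) read: both are "loaded"
```

The same three lines of page script give different results depending
only on ioDelay, which the page cannot observe or control.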

As Glenn points out, basically all of the situations we are talking
about are error conditions. The way you should use the URL after using
oneTimeOnly is to load from it exactly once. Anything else is "an
error" for some definition of "an error". But we all know that people
use web APIs in ways we wish they didn't, whether intentionally or
accidentally.

What will happen for the following three code snippets in IE?

1.
url = URL.createObjectURL(blob, { oneTimeOnly: true });
myImg.src = url;
setTimeout(function() { myOtherImg.src = url }, 0);

2.
url = URL.createObjectURL(blob);
myImg.src = url;
setTimeout(function() { URL.revokeObjectURL(url) }, 0);

3.
url = URL.createObjectURL(blob);
myImg.src = url;
URL.revokeObjectURL(url);
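
To illustrate why the answers to 2 and 3 are not obvious, here is a
small sketch that runs them under two hypothetical policies. Neither
is claimed to be what IE or any other browser does: "atAssignment"
assumes the blob is captured when .src is set, and "atRead" assumes the
URL is looked up only when the read starts. The names run, policy and
ioDelay are invented for this example, and setTimeout(..., 0) is
modeled as firing at virtual time 1.

```javascript
// Toy model of snippets 2 and 3 under two hypothetical lookup policies.
// ioDelay (hypothetical) is how long until the IO thread starts the read.
function run(snippet, policy, ioDelay) {
  const registry = new Map([["blob:demo", "image bytes"]]);
  const events = []; // virtual-time event queue
  let seq = 0;
  const schedule = (time, fn) => events.push({ time, seq: seq++, fn });
  let result = "pending";

  // myImg.src = url
  const captured =
    policy === "atAssignment" ? registry.get("blob:demo") : undefined;
  schedule(ioDelay, () => {
    // "atRead" looks the URL up only now, so an earlier revoke is visible.
    const data =
      policy === "atAssignment" ? captured : registry.get("blob:demo");
    result = data === undefined ? "failed" : "loaded";
  });

  const revoke = () => registry.delete("blob:demo");
  if (snippet === 2) {
    schedule(1, revoke); // setTimeout(function() { revoke }, 0)
  } else {
    revoke();            // snippet 3: synchronous revoke
  }

  // Run events in virtual-time order; ties run in scheduling order.
  events.sort((a, b) => a.time - b.time || a.seq - b.seq);
  for (const e of events) e.fn();
  return result;
}

console.log(run(2, "atRead", 0));       // "loaded"
console.log(run(2, "atRead", 5));       // "failed"
console.log(run(3, "atRead", 0));       // "failed"
console.log(run(3, "atAssignment", 0)); // "loaded"
```

Under "atRead", snippet 2 succeeds or fails depending on IO timing
alone, which is the kind of race I am worried about.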

/ Jonas

Received on Tuesday, 10 April 2012 15:59:38 UTC