Re: File API "oneTimeOnly" is too poorly defined from Jonas Sicking on 2012-03-28 (public-webapps@w3.org from January to March 2012)

From: Jonas Sicking <jonas@sicking.cc>
Date: Wed, 28 Mar 2012 00:19:55 -0700
To: Glenn Maynard <glenn@zewt.org>
Cc: public-webapps WG <public-webapps@w3.org>
Message-ID: <CA+c2ei_BdeCu5-aQvBUbTFPo=DsjwPbhbjyG=Th71T+SapxTDw@mail.gmail.com>
On Tue, Mar 27, 2012 at 4:59 PM, Glenn Maynard <glenn@zewt.org> wrote:
> I didn't realize this was actually added to the spec:
>
>> The optional options dictionary argument contains a key, oneTimeOnly that
>> defaults to false. If set to true, then the first time the Blob URI is
>> dereferenced, user agents MUST automatically revoke that Blob URI without
>> needing a call to revokeObjectURL() on the Blob URI.
>
> What does "dereferenced" mean?  Where is it defined?  What happens if two
> XHR calls open() a blob URL one after the other (causing fetches to be
> queued for it in separate task queues, whose order of execution is
> undefined)?  What happens if two completely unrelated APIs queue tasks in
> different task queues (causing the same problem, but in a way that can't be
> worked around within any one spec)?
>
> This feature is dangerously weakly defined.  It should be removed from the
> spec until it can be defined properly (or at least marked "not ready for
> implementations"), or we may end up with interop failures that could be hard
> to fix later.
>
> Again, I'm pretty sure the sanest way to approach this feature is for any
> API supporting it to grab a reference to the underlying resource, and revoke
> the URL, as soon as the string enters that API (eg. xhr.open() is called, or
> img.src is assigned).  That ensures it's always deterministically--and
> synchronously--clear who will actually successfully receive the object,
> regardless of later complications like separate task queues across APIs.  It
> doesn't answer all questions (eg. the issues mentioned at
> http://lists.w3.org/Archives/Public/public-webapps/2012JanMar/1265.html),
> and the actual "dereferencing" action would need to be specified for every
> supported API (this would need work to make it easy to do), but it's a lot
> closer than what's in there now.

I think we need to define that APIs like xhr.open(...) and the img.src
setter synchronously "dereference" the URL before returning.

This is needed even if we didn't have oneTimeOnly for at least two reasons:

1.
var blob = getBlob();
var url = URL.createObjectURL(blob);
img.src = url;
URL.revokeObjectURL(url);

2.
var fileEntry = getFileEntry();
fileEntry.file(function(file) {
  fileEntry.createWriter(function(fileWriter) {
    var url = URL.createObjectURL(file);
    var xhr = new XMLHttpRequest();
    xhr.open("GET", url);
    xhr.send();
    xhr.onload = ...;
    fileWriter.write(new Blob(["hello"]));
  });
});


In the first example the blob-url is revoked synchronously, right after
img.src is set. Unless we define when the img.src setter "dereferences"
the blob-url, it's undefined whether the first example works.

In the second example the file object itself is disabled when the
fileWriter.write function is called. The blob-url which represents it
is logically disabled at the same time. Unless we define when the XHR
object "dereferences" the blob-url, it's undefined whether the second
example works.

In fact, this problem isn't even blob-url specific. If you change the
second example to not use blob-urls, but rather read from 'file' using
a FileReader, you have exactly the same question: does starting to
read the Blob happen before the Blob is disabled, or after?
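
For illustration, the FileReader variant of example 2 could look
roughly like the sketch below. The wrapper function readThenWrite and
its onText callback are made-up names so that the entry can be passed
in; the rest mirrors example 2.

```javascript
// Variant of example 2 that reads 'file' directly instead of going
// through a blob-url. readThenWrite and onText are hypothetical names.
function readThenWrite(fileEntry, onText) {
  fileEntry.file(function (file) {
    fileEntry.createWriter(function (fileWriter) {
      var reader = new FileReader();
      reader.onload = function () {
        onText(reader.result);
      };
      // Does the read of 'file' count as started here...
      reader.readAsText(file);
      // ...or only after this write has disabled 'file'?
      fileWriter.write(new Blob(["hello"]));
    });
  });
}
```

No blob-url appears anywhere, yet whether onText ever fires still
depends on when readAsText is defined to "dereference" the Blob.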


Generally speaking, in order to precisely define when these URLs or
Blobs are "dereferenced", we likely need to define that it happens
synchronously in the various APIs that dereference URLs and Blobs. It
so happens that dereferencing synchronously is also the most useful
behavior for authors.

Note that no actual IO needs to happen just because you "dereference"
the URL. So no synchronous IO is required.
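
As a rough sketch of that split, here is a toy model in which the
lookup is synchronous and the "IO" is deferred to a later task. All
names here (registry, open, read) are hypothetical and come from no
spec, and plain strings stand in for Blobs.

```javascript
// Toy model only: a map from blob-url strings to their data.
const registry = new Map();
let nextId = 0;

function createObjectURL(data) {
  const url = "blob:model/" + nextId++;
  registry.set(url, data);
  return url;
}

function revokeObjectURL(url) {
  registry.delete(url);
}

// Stands in for an entry point such as xhr.open() or the img.src setter.
function open(url) {
  // Synchronous "dereference": a map lookup, no IO.
  const source = registry.has(url) ? registry.get(url) : null;
  return {
    source,
    // Deferred "IO": runs in a later task and only uses the captured source.
    read() {
      return new Promise((resolve, reject) => {
        setTimeout(() => {
          if (source === null) {
            reject(new Error("blob-url was not valid when open() was called"));
          } else {
            resolve(source);
          }
        }, 0);
      });
    },
  };
}

const url = createObjectURL("hello");
const pending = open(url);   // dereferenced here, before revocation
revokeObjectURL(url);        // does not affect 'pending'
const late = open(url);      // dereferenced after revocation: nothing captured

pending.read().then((data) => console.log(data)); // logs "hello"
```

In this model the outcome of example 1 is decided at the moment open()
returns, regardless of when the deferred read actually runs.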

We took a survey of the various points in the Gecko codebase to see if
we dereference URLs and Blobs synchronously or not. The only API we
found that didn't do so was the IndexedDB code for storing Blobs.


All of this will definitely be a lot of work to specify (and possibly
implement). But I don't see any other options to get interoperability
with Blobs and blob-URLs. It's definitely not a problem restricted to
oneTimeOnly.

/ Jonas
Received on Wednesday, 28 March 2012 07:21:16 UTC