Re: Colliding FileWriters

On Wed, Jan 11, 2012 at 1:41 PM, Eric U <ericu@google.com> wrote:
> On Wed, Jan 11, 2012 at 12:25 PM, Jonas Sicking <jonas@sicking.cc> wrote:
>> On Tue, Jan 10, 2012 at 1:32 PM, Eric U <ericu@google.com> wrote:
>>> On Tue, Jan 10, 2012 at 1:08 PM, Jonas Sicking <jonas@sicking.cc> wrote:
>>>> Hi All,
>>>>
>>>> We've been looking at implementing FileWriter and had a couple of questions.
>>>>
>>>> First of all, what happens if multiple pages create a FileWriter for
>>>> the same FileEntry at the same time? Will both be able to write to the
>>>> file at the same time and whoever writes last to a given byte wins?
>>>
>>> This isn't currently specified, and that's a hole we should fill.  By
>>> not having it in the spec, my assumption would be that last-wins would
>>> hold, but it would be good to clarify it if that's the behavior we
>>> want.  It's especially important given that there's nothing like
>>> fflush(), which would help users know what "last" meant.  Speaking of
>>> which, should we add a flushing mechanism?
>>>
>>>> This is different from how file systems normally work, since as long as
>>>> a file is open for writing that tends to prevent other processes from
>>>> opening the same file.
>>>
>>> You're perhaps thinking of Windows, where by default files are opened
>>> in exclusive mode?  On other operating systems, and on Windows when
>>> you specify FILE_SHARE_WRITE in dwShareMode in CreateFile, multiple
>>> writers can exist simultaneously.
>>
>> Ah. I didn't realize this was different on other OSs. It still seems
>> risky to not provide any means to get exclusive access. The only way I
>> can see websites dealing with this is to create their own locking
>> mechanism backed by using IndexedDB transactions as low-level atomic
>> primitive (local-storage doesn't work since you can't implement
>> compare-and-swap in an atomic manner).
>>
>> Having an 'exclusive' flag for createFileWriter seems much easier and
>> removes the IndexedDB dependency. I'd probably even say that it should
>> default to true since on the web defaulting to safe rather than fast
>> generally results in fewer bugs.
>
> I don't think I'd generally be averse to this.  However, it would then
> require some sort of a revocation mechanism as well.  If you're done
> with your FileWriter, you want to be able to get rid of it without
> depending on GC, so that another context can create one.  And if you
> forget to revoke it, behavior in the second context presumably depends
> on GC, which is a bit ugly.

I definitely agree that we need an explicit revoking mechanism. We
have a similar situation in IndexedDB where as long as an IDBDatabase
object is alive for a given database, no one can upgrade the database
version. Here we do have an explicit .close() method, but if you
forget to call it you end up waiting for GC. It's possibly somewhat
less of a problem in IndexedDB though since upgrading database
versions should be pretty rare.
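
To make the revocation idea concrete, here is a rough sketch of how an
exclusive writer with an explicit close() might behave. The names
(WriterRegistry, acquire, close) are made up for illustration and are
not from any spec; a real implementation would live in the browser and
would hand out the writer asynchronously.

```javascript
// Hypothetical sketch: one exclusive writer per path, released explicitly.
// WriterRegistry, acquire and close are illustrative names, not spec API.
function WriterRegistry() {
  this.held = {};     // path -> true while an exclusive writer is open
  this.waiting = {};  // path -> queue of callbacks waiting for the lock
}

WriterRegistry.prototype.acquire = function (path, callback) {
  var self = this;
  if (this.held[path]) {
    // Someone else holds the file: queue until they call close().
    (this.waiting[path] = this.waiting[path] || []).push(callback);
    return;
  }
  this.held[path] = true;
  var closed = false;
  callback({
    close: function () {
      if (closed) return;  // closing twice is a no-op
      closed = true;
      delete self.held[path];
      var next = (self.waiting[path] || []).shift();
      if (next) self.acquire(path, next);  // hand the lock to the next waiter
    }
  });
};
```

In this sketch a second context that asks for the same path simply
waits until the first one calls close(), rather than depending on GC.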

> I'm not quite sure how urgent this is yet, though.  I've been assuming
> that if you have transactional/synchronization semantics you want to
> maintain, you'll be using IDB anyway, or a server handshake, etc.  But
> of course it's easy to write a naive app that the user loads in two
> windows, with bad effect.

Yeah, it's the "user opens page in two windows" scenario that I'm
concerned about, as well as similar conditions where, for example, a
Worker thread holds a connection to the server and occasionally writes
data to a file based on information from the server, while code in a
window reads data from the file and acts on it.

I don't think we can relegate synchronization semantics to IDB. I
think we should have synchronization semantics at least as the default
mode for all data that is shared between Workers and Windows which can
be running on different threads. One great example is localStorage
which we spent a lot of effort on trying to make synchronized using
the storage mutex. We failed there, not for lack of desire, but
because of the way the API is structured.

>> Though if we add the 'exclusive' flag described above, then we'll need
>> to keep createFileWriter async anyway.
>
> Right--I think we should pick whatever subset of these suggestions
> seems the most useful, since they overlap a bit.

Agreed.

> One working subset would be:
>
> * Keep createFileWriter async.
> * Make it optionally exclusive [possibly by default].  If exclusive,
> its length member is trustworthy.  If not, it can go stale.
> * Add an append method [needed only for non-exclusive writes, but
> useful for logs, and a safe default].

This sounds great to me if we make it exclusive by default and remove
the .length member for non-exclusive writers. Or make it return
null/undefined.
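
As a rough illustration of that subset, the sketch below models a
writer that is exclusive unless asked otherwise, only reports a length
when it is exclusive, and always offers append. Everything here
(makeWriter, the options object, the in-memory "store" array standing
in for the file) is an assumption for illustration, not proposed spec
text.

```javascript
// Hypothetical sketch of the proposed writer shape; not spec API.
// "store" is an in-memory stand-in for the file's contents (array of bytes).
function makeWriter(store, options) {
  // Exclusive unless the caller explicitly passes { exclusive: false }.
  var exclusive = !options || options.exclusive !== false;
  var writer = {
    exclusive: exclusive,
    // Always lands at the current end of the file, so it stays safe
    // even when other non-exclusive writers are active.
    append: function (bytes) {
      Array.prototype.push.apply(store, bytes);
    }
  };
  Object.defineProperty(writer, "length", {
    // Only an exclusive writer can trust the length; otherwise undefined.
    get: function () { return exclusive ? store.length : undefined; }
  });
  return writer;
}
```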


However this brings up another problem, which is how to support
clients that want to mix read and write operations. Currently this is
supported, but as far as I can tell it's pretty awkward. Every time
you want to read you have to nest two asynchronous function calls.
First one to get a File reference, and then one to do the actual read
using a FileReader object. You can reuse the File reference, but only
if you are doing multiple reads in a row with no writing in between.

If we support exclusive access (whether the default or not) this stops
working. Once a FileWriter has exclusive access I assume that calling
getFile should not produce a new File object until the exclusive
access has been released.

I don't have any great solutions to this problem. One solution would
be to make it possible to get a File directly from a FileWriter. This
accessor could even be synchronous and represent the file at the time
of the last write. This would allow syntax like:

myFileEntry.createWriter(function(mywriter) {
  // register the "success" handler before starting the write
  mywriter.onwrite = function() {
    // read some data back through the writer's File accessor
    var reader = new FileReader();
    // register the "success" handler before starting the read
    reader.onload = function() {
      // do something with the read data in reader.result
    };
    reader.readAsArrayBuffer(mywriter.file);
  };
  // write some data
  mywriter.write(someblob);
});

This is pretty hideous, but as far as I can tell it is better than
what we have now. It is also very surprising to have a File accessor
on the FileWriter.

I think the main problem is that reading and writing is spread out
over two separate objects. I can't think of a way to make things look
good as long as that is the case. Maybe the solution is to add
readAsArrayBuffer/readAsText/readAsDataURL directly on FileWriter?
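
A very rough sketch of what a single object with both operations could
feel like is below. makeHandle and the callback-style write/readAsText
are invented for illustration only, and the "file" is just a string
held in memory; the point is only that the read no longer needs a
second object.

```javascript
// Hypothetical sketch: a single handle that both writes and reads.
// makeHandle, write and readAsText here are illustrative, not spec API.
function makeHandle() {
  var contents = "";  // in-memory stand-in for the file
  return {
    write: function (text, done) {
      contents += text;
      setTimeout(done, 0);  // completion is reported asynchronously
    },
    readAsText: function (done) {
      var snapshot = contents;  // contents as of this call
      setTimeout(function () { done(snapshot); }, 0);
    }
  };
}

// Usage: write, then read back through the same handle.
var handle = makeHandle();
handle.write("hello", function () {
  handle.readAsText(function (text) {
    // do something with the read data
  });
});
```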

/ Jonas

Received on Sunday, 22 January 2012 05:58:38 UTC