Re: Colliding FileWriters

On Wed, Feb 29, 2012 at 1:56 AM, Glenn Maynard <glenn@zewt.org> wrote:
> On Mon, Feb 27, 2012 at 6:40 PM, Jonas Sicking <jonas@sicking.cc> wrote:
>>
>> To do the locking without requiring calls to .close() or relying on GC
>> we use a similar setup to IndexedDB transactions. I.e. you get an
>> object which represents a locked file. As long as you use that lock to
>> read from and write to the file the lock keeps being held. However as
>> soon as you return to the event loop from the last progress
>> notification from the last read/write operation, the lock is
>> automatically released.
>
>
> This sounds a lot like "microtasks", described here:
> http://lists.w3.org/Archives/Public/public-webapps/2011JulSep/1622.html.  I
> don't know where it's described in IndexedDB, but it seems like this is a
> notion that keeps coming up again and again.  It seems like this should be
> introduced as a consistent concept in the event model.

Yeah, we should probably use "end of microtask" rather than "end of task".

> I was a little confused at the above explanation.  I think what you mean is
> that the lock is held so long as a FileRequest object is active (eg. has yet
> to dispatch a success or error event).  More concretely, at the end of each
> microtask (if you want to use that terminology), all LockedFiles without any
> active FileRequests are released.  That's sort of like the "release when the
> LockedFile is GC'd" approach, except it's deterministic and doesn't expose
> GC.
>
> (I think that's equivalent to what you said later, but I want to make sure
> I'm following correctly.)

Yes, that's a good way to describe it. Especially if we use microtasks.
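
As a rough illustration of that rule, here is a minimal TypeScript sketch.
It is only a model: LockedFileModel, request() and checkpoint() are names
made up for this example and are not part of the FileHandle proposal, and
checkpoint() simply stands in for "end of microtask".

```typescript
// Hypothetical model of the release rule; not an API from the proposal.
class LockedFileModel {
  name: string;
  active = 0;        // requests that have not yet dispatched success/error
  released = false;

  constructor(name: string) {
    this.name = name;
  }

  // Scheduling a request keeps the lock alive until its event is dispatched.
  request(): () => void {
    if (this.released) {
      throw new Error(`${this.name} is already released`);
    }
    this.active++;
    // The returned function stands in for "the success/error event fired".
    return () => {
      this.active--;
    };
  }
}

// Stand-in for "end of microtask": release every lock with no active requests.
function checkpoint(locks: LockedFileModel[]): string[] {
  const releasedNow: string[] = [];
  for (const lock of locks) {
    if (!lock.released && lock.active === 0) {
      lock.released = true;
      releasedNow.push(lock.name);
    }
  }
  return releasedNow;
}

const lockA = new LockedFileModel("a");
const lockB = new LockedFileModel("b");
const finishA = lockA.request();                      // a has one pending request
const firstCheckpoint = checkpoint([lockA, lockB]);   // only b is released
finishA();                                            // a's request fires its event
const secondCheckpoint = checkpoint([lockA, lockB]);  // now a is released
```

Nothing here depends on GC: a lock goes away at the first checkpoint at
which it has no active requests, and using it afterwards throws.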

>> One downside of this is that it means that if you're doing a bunch of
>> separate read/write operations in separate locks, each lock is held
>> until we've had a chance to fire the final success event for the
>> operation. So if you queue up a ton of small write operations you can
>> end up mostly sitting waiting for the main thread to finish posting
>> events.
>
> It'd only slow things down if you attach an expensive, long-running event
> handler to a load/loadend event, which is an inherently bad idea if you're
> doing lots of tiny operations.  Is that actually a problem?

No, that's not correct.

Most likely the implementation of this will use two threads: the main
thread, which runs the JS code in the window or worker, and an IO
thread, which does the file reading/writing. The main thread is also
where event handlers run. Every time a read/write is requested by the
main thread, data about this operation is sent to the IO thread,
allowing the main thread to continue.

If the main thread creates two separate locks, each of which performs
one small write operation, the following steps will have to be taken:

1. The main thread creates lock1 and uses it to schedule a write
operation. Data about this is sent to the IO thread.
2. The IO thread starts processing the write operation from lock1.
3. The main thread creates lock2 and uses it to schedule a write
operation. Data about this is sent to the IO thread.
4. The IO thread finishes processing the write operation from lock1. A
"success" result is sent back to the main thread.
5. The IO thread doesn't know if lock1 will require more read/write
operations yet since the lock is still open, hence it needs to wait.
6. The "success" message for the write for lock1 reaches the main
thread and an event is fired.
7. The event finishes running and the main thread sees that no more
requests exist against lock1 and so dispatches a message to the IO
thread that lock1 can be closed.
8. The IO thread receives the message to close lock1 and so can start
executing the write request for lock2.

As you can see, the IO thread needs to send a message to the main
thread and wait for another message to come back before it can process
lock2. This delay exists even if there is no event handler at all for
the write request for lock1.
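
To put rough numbers on the steps above, here is a minimal TypeScript
sketch. The Timing fields and both function names are made up for
illustration, the millisecond values are arbitrary, and it assumes that
lock2's request has already reached the IO thread (step 3) by the time
lock1's write finishes.

```typescript
// Hypothetical timing model of steps 4-8; all values are illustrative.
interface Timing {
  writeMs: number;    // time the IO thread spends on one small write
  hopMs: number;      // latency of one cross-thread message
  handlerMs: number;  // time spent in the success handler (0 if there is none)
}

// Time at which the IO thread may start lock2's write, measured from the
// moment it started lock1's write.
function lock2StartTime(t: Timing): number {
  const lock1WriteDone = t.writeMs;                // step 4
  const successOnMain = lock1WriteDone + t.hopMs;  // step 6
  const closeSent = successOnMain + t.handlerMs;   // step 7
  const closeOnIo = closeSent + t.hopMs;           // step 8
  return closeOnIo;
}

// Time the IO thread sits idle between finishing lock1 and starting lock2.
function ioIdleTime(t: Timing): number {
  return lock2StartTime(t) - t.writeMs;
}

const idleNoHandler = ioIdleTime({ writeMs: 1, hopMs: 2, handlerMs: 0 });    // 4
const idleSlowHandler = ioIdleTime({ writeMs: 1, hopMs: 2, handlerMs: 10 }); // 14
```

Even with handlerMs set to 0 the IO thread idles for two message hops per
lock. A slow handler only adds to that; it is not the cause of the delay.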

We could add smarts and try to detect that no event handler is
registered for the lock1 write. However we won't know that until the
main thread returns to the event loop after scheduling the write
request. And technically it wouldn't be correct to do so since the
main thread could add a success event handler at any point before the
event actually fires in step 6. And it wouldn't work in many cases
anyway since the page could attach a success handler which does other
things than schedule more read/write requests.
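
The late-handler case can be sketched in a few lines of TypeScript.
RequestModel and its fire() method are hypothetical stand-ins for a
request object and for the event dispatch in step 6; they are not taken
from the proposal.

```typescript
// Hypothetical request object with a settable success handler.
class RequestModel {
  onsuccess: (() => void) | null = null;

  // Stand-in for dispatching the success event; reports whether a handler ran.
  fire(): boolean {
    if (this.onsuccess) {
      this.onsuccess();
      return true;
    }
    return false;
  }
}

const followUps: string[] = [];
const writeRequest = new RequestModel();

// Snapshot taken when the main thread returns to the event loop.
const handlerSeenEarly = writeRequest.onsuccess !== null;  // false

// The page attaches a handler later, but still before the event fires.
writeRequest.onsuccess = () => {
  followUps.push("second write on the same lock");
};

const handlerRan = writeRequest.fire();  // true
```

An implementation that released lock1 based on handlerSeenEarly would
have closed the lock before the handler got its chance to schedule the
follow-up write.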

> By the way, readAsText and readAsArrayBuffer don't seem to fire load and
> loadend events at the end, like readAsDataURL does.  It looks like an
> oversight--they're fired in the error path.

Is this regarding the FileHandle proposal or regarding the File API
spec? The FileHandle proposal doesn't have readAsDataURL.

/ Jonas

Received on Wednesday, 29 February 2012 12:02:00 UTC