- From: Jonas Sicking <jonas@sicking.cc>
- Date: Mon, 19 Mar 2012 16:17:50 -0700
- To: Eric U <ericu@google.com>
- Cc: Webapps WG <public-webapps@w3.org>
On Mon, Mar 19, 2012 at 3:10 PM, Eric U <ericu@google.com> wrote:
> On Wed, Feb 29, 2012 at 8:44 AM, Eric U <ericu@google.com> wrote:
>> On Mon, Feb 27, 2012 at 4:40 PM, Jonas Sicking <jonas@sicking.cc> wrote:
>>> On Mon, Feb 27, 2012 at 11:36 PM, Eric U <ericu@google.com> wrote:
>>>>>> One working subset would be:
>>>>>>
>>>>>> * Keep createFileWriter async.
>>>>>> * Make it optionally exclusive [possibly by default]. If exclusive,
>>>>>>   its length member is trustworthy. If not, it can go stale.
>>>>>> * Add an append method [needed only for non-exclusive writes, but
>>>>>>   useful for logs, and a safe default].
>>>>>
>>>>> This sounds great to me if we make it exclusive by default and
>>>>> remove the .length member for non-exclusive writers. Or make it
>>>>> return null/undefined.
>>>>
>>>> I like exclusive-by-default. Of course, that means that by default
>>>> you have to remember to call close() or depend on GC, but that's
>>>> probably OK. I'm less sure about .length being unusable on
>>>> non-exclusive writers, but it's growing on me. Since by default
>>>> writers would be exclusive, length would generally work just the
>>>> same as it does now. However, if it returns null/undefined in the
>>>> non-exclusive case, users might accidentally do math on it (so that
>>>> e.g. "length > 0" is false) and get confused. Perhaps it should
>>>> throw?
>>>>
>>>> Also, what's the behavior when there's already an exclusive lock and
>>>> you call createFileWriter? Should it just not call you back until
>>>> the lock's free? Do we need a trylock that fails fast, calling
>>>> errorCallback? I think the former's probably more useful than the
>>>> latter, and you can always use a timer to give up if it takes too
>>>> long, but there's no way to cancel a request, and you might get a
>>>> call far later, when you've forgotten that you requested it.
>>>>
>>>>> However this brings up another problem, which is how to support
>>>>> clients that want to mix read and write operations. Currently this
>>>>> is supported, but as far as I can tell it's pretty awkward. Every
>>>>> time you want to read, you have to nest two asynchronous function
>>>>> calls: first one to get a File reference, and then one to do the
>>>>> actual read using a FileReader object. You can reuse the File
>>>>> reference, but only if you are doing multiple reads in a row with
>>>>> no writing in between.
>>>>
>>>> I thought about this for a while, and realized that I had no good
>>>> suggestion because I couldn't picture the use cases. Do you have
>>>> some handy that would help me think about it?
>>>
>>> Mixing reading and writing can be something as simple as incrementing
>>> a counter somewhere in the file. First you need to read the counter
>>> value, then add one to it, then write the new value back. But there
>>> are also more complex operations, such as reordering a set of blocks
>>> to "defragment" the contents of a file. Yet another example would be
>>> modifying a .zip file to add a new file: when you do this you'll want
>>> to first read out the location of the current zip directory, then
>>> overwrite it with the new file and then the new directory.
>>
>> That helps, thanks. So we'll need to be able to do efficient
>> (read[-modify-write]*), and we'll need to hold the lock for the reads
>> as well as the writes. The lock should prevent any other writes
>> [exclusive or not], but need not prevent unlocked reads.
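For concreteness, here is roughly what the counter case looks like with
today's FileEntry/FileReader/FileWriter APIs. This is just a sketch (the
FileEntry "entry" and the onError callback are assumed to come from
elsewhere), but it shows the nesting, and that nothing holds a lock
between the read and the write:

  // Read the current value, then write back value + 1. Nothing stops
  // another writer from sneaking in between the read and the write.
  entry.file(function(file) {
    var reader = new FileReader();
    reader.onload = function() {
      var counter = parseInt(reader.result, 10) + 1;
      entry.createWriter(function(writer) {
        writer.onwriteend = function() { /* done; no lock to release */ };
        writer.write(new Blob([String(counter)]));
      }, onError);
    };
    reader.readAsText(file);
  }, onError);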
>>> We sat down and did some thinking about these two issues, i.e. the
>>> locking issue and the mixed-read-write issue. The solution is good
>>> news and bad news. The good news is that we've come up with something
>>> that seems like it should work; the bad news is that it's a totally
>>> different design from the current FileReader and FileWriter designs.
>>
>> Hmm...it's interesting, but I don't think we necessarily have to scrap
>> FR and FW to use it.
>>
>> Here's a modified version that uses the existing interfaces:
>>
>>   interface LockedReaderWriter : FileReader, FileWriter {
>>     [all the FileReader and FileWriter members]
>>
>>     readonly attribute File writeResult;
>>   }
>
> This came up in an offline discussion recently regarding a
> currently-unserved use case: using a web app to edit a file outside
> the browser sandbox. You can certainly drag the file into or out of
> the browser, but it's nothing like the experience you get with a
> native app, where if you select a file for editing you can read+write
> it many times, at its true location, without additional permission
> checks. If we added something like a "refresh" to regain expired
> locks with this object, and some way for the user to grant permission
> to a file for the session, it could take care of that use case.
>
> What do you think?

If we have an API which gives the web page access to a
FileEntry/FileHandle, then it seems like it can open locks any number of
times to do proper read/write access. We've started drafting such an API
at [1], though there's a lot remaining to figure out there, especially
when it comes to security. And that API doesn't yet let a page bring up
a file picker where the user can grant read/write access to a single
file.

[1] https://wiki.mozilla.org/WebAPI/DeviceStorageAPI

>> As with your proposal, as long as any read or write method has
>> outstanding events, the lock is held. The difference here is that
>> after any write method completes, and until another one begins or the
>> lock is dropped, writeResult holds the state of the File as of the
>> completion of the write. The rest of the time it's null. That way
>> you're always as up-to-date as you can easily be, but no more so [it
>> doesn't show partial writes during progress events]. To read, you use
>> the standard FileReader interface, slicing writeResult as needed to
>> get the appropriate offset.
>>
>> A potential feature of this design is that you could use it to read a
>> Blob that didn't come from writeResult, letting you pull in other
>> data while still holding the lock. I'm not sure if we need that, but
>> it's there if we want it.
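To make the shape of that concrete, here is a sketch of the counter case
against LockedReaderWriter. Two things in it are assumptions, not part
of the proposal: createLockedReaderWriter() is a factory name I made up
(the proposal doesn't say how you obtain one of these), and writeResult
is taken to start out reflecting the file's contents once the lock is
acquired, so that the initial read has something to slice:

  entry.createLockedReaderWriter(function(rw) {
    rw.onload = function() {            // FileReader side: read finished
      var counter = parseInt(rw.result, 10) + 1;
      rw.onwriteend = function() {
        // No more outstanding events, so the lock is released here.
      };
      rw.seek(0);                       // FileWriter side
      rw.write(new Blob([String(counter)]));
    };
    // Read the first 16 bytes by slicing writeResult.
    rw.readAsText(rw.writeResult.slice(0, 16));
  }, onError);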
>>> To do the locking without requiring calls to .close() or relying on
>>> GC, we use a setup similar to IndexedDB transactions. I.e. you get
>>> an object which represents a locked file. As long as you use that
>>> lock to read from and write to the file, the lock keeps being held.
>>> However, as soon as you return to the event loop from the last
>>> progress notification of the last read/write operation, the lock is
>>> automatically released.
>>
>> I love that your design is [I believe] deadlock-free, as the
>> write/read operations always make progress regardless of what other
>> locks you might be waiting to acquire.
>
>>> This is exactly how IndexedDB transactions (and I believe WebSQL
>>> transactions) work. To further reduce the risk of races, the
>>> IndexedDB spec forbids you from interacting with a transaction
>>> anywhere except between the point when the transaction is created
>>> and the return to the event loop, or from the progress event
>>> handlers of its other read/write operations. We can do the same
>>> thing with file locks. That way it works out naturally that the lock
>>> is released when the last event finishes firing, if there are no
>>> further pending read/write operations, since there would be no more
>>> opportunity to use the lock. In other words, you can't post a
>>> setTimeout and use the lock. That would be a bad idea anyway, since
>>> you'd run the risk that the lock was released before the timeout
>>> fires.
>>>
>>> The resulting API looks something like this. I'm using the interface
>>> name FileHandle to distinguish it from the current FileEntry API:
>>>
>>>   interface FileHandle {
>>>     LockedFile open([optional] DOMString mode); // defaults to "readonly"
>>>     FileRequest getFile(); // .result is set to the resulting File object
>>>   };
>>>
>>>   interface LockedFile {
>>>     readonly attribute FileHandle fileHandle;
>>>     readonly attribute DOMString mode;
>>>
>>>     attribute long long location;
>>>
>>>     FileRequest readAsArrayBuffer(long size);
>>>     FileRequest readAsText(long size, [optional] DOMString encoding);
>>>     FileRequest write((DOMString or ArrayBuffer or Blob) value);
>>>     FileRequest append((DOMString or ArrayBuffer or Blob) value);
>>>
>>>     void abort(); // Immediately releases the lock
>>>   };
>>>
>>>   interface FileRequest : EventTarget {
>>>     readonly attribute DOMString readyState; // "pending" or "done"
>>>
>>>     readonly attribute any result;
>>>     readonly attribute DOMError error;
>>>
>>>     readonly attribute LockedFile lockedFile;
>>>
>>>     attribute nsIDOMEventListener onsuccess;
>>>     attribute nsIDOMEventListener onerror;
>>>
>>>     attribute nsIDOMEventListener onprogress;
>>>   };
>>>
>>> One downside of this is that if you're doing a bunch of separate
>>> read/write operations under separate locks, each lock is held until
>>> we've had a chance to fire the final success event for its
>>> operation. So if you queue up a ton of small write operations, you
>>> can end up mostly sitting around waiting for the main thread to
>>> finish posting events.
>
> Ah, I see--you mean if you're waiting for the final event of a write
> on one lock to start the next read on another lock, you end up waiting
> a while.

Exactly.

> Not sure there's anything to do about that--you either want to wait
> for the write to finish, or you don't. If you don't, just go ahead and
> start the next read, given that it's in a different lock.

I think my proposal allows a page to indicate whether it wants the
implementation to wait for further reads/writes or not.
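To spell out how the proposed API is meant to be used, here is the
counter case once more as a sketch. The mode string "readwrite" is an
assumption (the draft above only names the "readonly" default), and
"handle" is a FileHandle obtained elsewhere:

  var lockedFile = handle.open("readwrite");
  lockedFile.location = 0;
  var readRequest = lockedFile.readAsText(16);  // read up to 16 bytes
  readRequest.onsuccess = function() {
    var counter = parseInt(readRequest.result, 10) + 1;
    lockedFile.location = 0;
    var writeRequest = lockedFile.write(String(counter));
    writeRequest.onsuccess = function() {
      // Last pending request: returning from this handler releases the
      // lock. Posting a setTimeout to reuse lockedFile would be an
      // error, per the rules above.
    };
  };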
/ Jonas

Received on Monday, 19 March 2012 23:18:49 UTC