- From: Eric U <ericu@google.com>
- Date: Mon, 19 Mar 2012 15:10:46 -0700
- To: Jonas Sicking <jonas@sicking.cc>
- Cc: Webapps WG <public-webapps@w3.org>
On Wed, Feb 29, 2012 at 8:44 AM, Eric U <ericu@google.com> wrote:
> On Mon, Feb 27, 2012 at 4:40 PM, Jonas Sicking <jonas@sicking.cc> wrote:
>> On Mon, Feb 27, 2012 at 11:36 PM, Eric U <ericu@google.com> wrote:
>>>>> One working subset would be:
>>>>>
>>>>> * Keep createFileWriter async.
>>>>> * Make it optionally exclusive [possibly by default]. If exclusive,
>>>>> its length member is trustworthy. If not, it can go stale.
>>>>> * Add an append method [needed only for non-exclusive writes, but
>>>>> useful for logs, and a safe default].
>>>>
>>>> This sounds great to me if we make it exclusive by default and remove
>>>> the .length member for non-exclusive writers. Or make it return
>>>> null/undefined.
>>>
>>> I like exclusive-by-default. Of course, that means that by default
>>> you have to remember to call close() or depend on GC, but that's
>>> probably OK. I'm less sure about .length being unusable on
>>> non-exclusive writers, but it's growing on me. Since by default
>>> writers would be exclusive, length would generally work just the same
>>> as it does now. However, if it returns null/undefined in the
>>> non-exclusive case, users might accidentally do math on it
>>> (if (length > 0) => false) and get confused. Perhaps it should throw?
>>>
>>> Also, what's the behavior when there's already an exclusive lock and
>>> you call createFileWriter? Should it just not call you back until the
>>> lock's free? Do we need a trylock that fails fast, calling
>>> errorCallback? I think the former's probably more useful than the
>>> latter, and you can always use a timer to give up if it takes too
>>> long; but there's no way to cancel a request, and you might get a call
>>> far later, when you've forgotten that you requested it.
>>>
>>>> However, this brings up another problem, which is how to support
>>>> clients that want to mix read and write operations. Currently this is
>>>> supported, but as far as I can tell it's pretty awkward.
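[The wait-vs-trylock question above can be sketched with a toy model. ExclusiveLock and its method names are hypothetical illustrations, not part of any proposed API; the real createFileWriter would deliver the lock via an async callback rather than synchronously as here.]

```javascript
// Toy model of the two lock-acquisition flavors discussed above:
// queue-until-free vs. fail-fast trylock. Purely illustrative.
class ExclusiveLock {
  constructor() {
    this.held = false;
    this.waiters = []; // callbacks queued until the lock frees
  }
  // "wait until free" flavor: queue the callback if the lock is held
  acquire(cb) {
    if (!this.held) { this.held = true; cb(); }
    else this.waiters.push(cb);
  }
  // "trylock" flavor: fail fast instead of queueing
  tryAcquire() {
    if (this.held) return false;
    this.held = true;
    return true;
  }
  release() {
    const next = this.waiters.shift();
    if (next) next();   // hand the lock directly to the next waiter
    else this.held = false;
  }
}

const lock = new ExclusiveLock();
const order = [];
lock.acquire(() => order.push("first"));
lock.acquire(() => order.push("second")); // queued, not run yet
console.log(lock.tryAcquire()); // false: lock is held
lock.release();                 // "second" now acquires and runs
console.log(order.join(","));   // first,second
```

[The "you might get a call far later" concern maps to the waiters queue here: nothing removes a queued callback, which is why a cancel operation or a timer-based give-up would matter in the real API.]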
>>>> Every time you want to read you have to nest two asynchronous
>>>> function calls. First one to get a File reference, and then one to
>>>> do the actual read using a FileReader object. You can reuse the File
>>>> reference, but only if you are doing multiple reads in a row with no
>>>> writing in between.
>>>
>>> I thought about this for a while, and realized that I had no good
>>> suggestion because I couldn't picture the use cases. Do you have some
>>> handy that would help me think about it?
>>
>> Mixing reading and writing can be something as simple as increasing a
>> counter somewhere in the file. First you need to read the counter
>> value, then add one to it, then write the new value. But there are also
>> more complex operations, such as reordering a set of blocks to
>> "defragment" the contents of a file. Yet another example would be
>> modifying a .zip file to add a new file. When you do this you'll want
>> to first read out the location of the current zip directory, then
>> overwrite it with the new file and then the new directory.
>
> That helps, thanks. So we'll need to be able to do efficient
> (read[-modify-write]*), and we'll need to hold the lock for the reads
> as well as the writes. The lock should prevent any other writes
> [exclusive or not], but need not prevent unlocked reads.
>
>> We sat down and did some thinking about these two issues, i.e. the
>> locking and the read-write-mixed issue. The solution is good news and
>> bad news. The good news is that we've come up with something that
>> seems like it should work; the bad news is that it's a totally
>> different design from the current FileReader and FileWriter designs.
>
> Hmm...it's interesting, but I don't think we necessarily have to scrap
> FR and FW to use it.
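[The counter example from the thread, sketched over an in-memory "file" so the read-modify-write shape is visible. InMemoryFile and its methods are hypothetical stand-ins; the real API would make each step asynchronous, and the whole sequence would need to run under one lock, since a writer sneaking in between the read and the write would lose an increment.]

```javascript
// In-memory stand-in for a file holding a 4-byte counter.
class InMemoryFile {
  constructor() { this.bytes = new Uint8Array(4); }
  readUint32(offset) {
    return new DataView(this.bytes.buffer).getUint32(offset);
  }
  writeUint32(offset, value) {
    new DataView(this.bytes.buffer).setUint32(offset, value);
  }
}

// The read-modify-write cycle: read the counter, add one, write it
// back. Atomicity is exactly what the lock is for.
function incrementCounter(file) {
  const current = file.readUint32(0); // read
  file.writeUint32(0, current + 1);   // modify + write
}

const f = new InMemoryFile();
incrementCounter(f);
incrementCounter(f);
console.log(f.readUint32(0)); // 2
```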
>
> Here's a modified version that uses the existing interfaces:
>
> interface LockedReaderWriter : FileReader, FileWriter {
>   [all the FileReader and FileWriter members]
>
>   readonly attribute File writeResult;
> }

This came up in an offline discussion recently regarding a
currently-unserved use case: using a web app to edit a file outside the
browser sandbox. You can certainly drag the file into or out of the
browser, but it's nothing like the experience you get with a native app,
where if you select a file for editing you can read+write it many times,
at its true location, without additional permission checks. If we added
something like a "refresh" to regain expired locks with this object, and
some way for the user to grant permissions to a file for the session, it
could take care of that use case. What do you think?

> As with your proposal, as long as any read or write method has
> outstanding events, the lock is held. The difference here is that
> after any write method completes, and until another one begins or the
> lock is dropped, writeResult holds the state of the File as of the
> completion of the write. The rest of the time it's null. That way
> you're always as up-to-date as you can easily be, but no more so [it
> doesn't show partial writes during progress events]. To read, you use
> the standard FileReader interface, slicing writeResult as needed to
> get the appropriate offset.
>
> A potential feature of this design is that you could use it to read a
> Blob that didn't come from writeResult, letting you pull in other data
> while still holding the lock. I'm not sure if we need that, but it's
> there if we want it.
>
>> To do the locking without requiring calls to .close() or relying on GC,
>> we use a similar setup to IndexedDB transactions. I.e. you get an
>> object which represents a locked file. As long as you use that lock to
>> read from and write to the file, the lock keeps being held.
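[A toy model of the writeResult idea: after each completed write, writeResult holds a snapshot of the file; while a write is in flight it is null, so partial writes are never observable. ToyLockedReaderWriter is hypothetical and synchronous; the real object would extend FileReader/FileWriter, hold File/Blob objects rather than strings, and fire progress events.]

```javascript
// Hypothetical sketch of the writeResult snapshot behavior.
class ToyLockedReaderWriter {
  constructor() { this.data = ""; this.writeResult = null; }
  write(text) {
    this.writeResult = null;      // in flight: no snapshot visible
    this.data += text;            // (a real write streams here)
    this.writeResult = this.data; // state as of write completion
  }
  // Reading slices the snapshot, like slicing a File and handing
  // the slice to a FileReader.
  read(offset, size) {
    return this.writeResult.slice(offset, offset + size);
  }
}

const rw = new ToyLockedReaderWriter();
rw.write("counter=1");
console.log(rw.read(8, 1)); // 1
```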
>> However, as soon as you return to the event loop from the last
>> progress notification from the last read/write operation, the lock is
>> automatically released.
>
> I love that your design is [I believe] deadlock-free, as the
> write/read operations always make progress regardless of what other
> locks you might be waiting to acquire.
>
>> This is exactly how IndexedDB transactions (and I believe WebSQL
>> transactions) work. To even further reduce the risk of races, the
>> IndexedDB spec forbids you from interacting with a transaction other
>> than from the point when a transaction is created until we return to
>> the event loop, as well as from progress event handlers from other
>> read/write operations. We can do the same thing with file locks. That
>> way it works out naturally that the lock is released when the last
>> event finishes firing if there are no further pending read/write
>> operations, since there would be no more opportunity to use the lock.
>> In other words, you can't post a setTimeout and use the lock. This
>> would be a bad idea anyway, since you'd run the risk that the lock was
>> released before the timeout fires.
>>
>> The resulting API looks something like this.
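[The IndexedDB-style auto-release rule can be modeled with a toy lock that counts pending requests and releases itself when the last completion callback returns with nothing else queued. TransactionLock is a hypothetical, synchronous simplification; the real design would fire DOM events asynchronously.]

```javascript
// Toy model: the lock lives as long as some request is pending or
// a completion callback is still able to queue more work.
class TransactionLock {
  constructor() { this.pending = 0; this.released = false; }
  request(work, onsuccess) {
    if (this.released) throw new Error("lock already released");
    this.pending++;
    const result = work();
    // The success callback may queue further requests, which keeps
    // the lock alive past this one.
    onsuccess(result);
    this.pending--;
    if (this.pending === 0) this.released = true; // auto-release
  }
}

const tx = new TransactionLock();
const log = [];
tx.request(() => "read A", (r) => {
  log.push(r);
  // Chaining from inside the callback keeps the lock held.
  tx.request(() => "write B", (r2) => log.push(r2));
});
console.log(tx.released); // true: no pending operations remain

// A later, setTimeout-style use of the lock fails, as in the text.
let threw = false;
try { tx.request(() => "late", () => {}); } catch (e) { threw = true; }
console.log(threw); // true
```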
>> I'm using the interface name FileHandle to distinguish it from the
>> current FileEntry API:
>>
>> interface FileHandle {
>>   LockedFile open([optional] DOMString mode); // defaults to "readonly"
>>   FileRequest getFile(); // .result is set to resulting File object
>> };
>>
>> interface LockedFile {
>>   readonly attribute FileHandle fileHandle;
>>   readonly attribute DOMString mode;
>>
>>   attribute long long location;
>>
>>   FileRequest readAsArrayBuffer(long size);
>>   FileRequest readAsText(long size, [optional] DOMString encoding);
>>   FileRequest write(DOMString or ArrayBuffer or Blob value);
>>   FileRequest append(DOMString or ArrayBuffer or Blob value);
>>
>>   void abort(); // Immediately releases the lock
>> };
>>
>> interface FileRequest : EventTarget {
>>   readonly attribute DOMString readyState; // "pending" or "done"
>>
>>   readonly attribute any result;
>>   readonly attribute DOMError error;
>>
>>   readonly attribute LockedFile lockedFile;
>>
>>   attribute nsIDOMEventListener onsuccess;
>>   attribute nsIDOMEventListener onerror;
>>
>>   attribute nsIDOMEventListener onprogress;
>> };
>>
>> One downside of this is that it means that if you're doing a bunch of
>> separate read/write operations in separate locks, each lock is held
>> until we've had a chance to fire the final success event for the
>> operation. So if you queue up a ton of small write operations you can
>> end up mostly sitting waiting for the main thread to finish posting
>> events.

Ah, I see--you mean if you're waiting for the final event of a write on
one lock to start the next read on another lock, you end up waiting a
while. Not sure there's anything to do about that--you either want to
wait for the write to finish, or you don't. If you don't, just go ahead
and start the next read, given that it's in a different lock.
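[A minimal in-memory sketch of the proposed LockedFile surface, showing how the location cursor interacts with read, write, and append. ToyLockedFile is hypothetical and synchronous; real FileRequests would be EventTargets whose results arrive via success events, and the lock-release rules above would apply.]

```javascript
// Hypothetical, greatly simplified model of the LockedFile cursor.
class ToyLockedFile {
  constructor(mode = "readonly") {
    this.mode = mode;
    this.location = 0; // read/write cursor, as in the proposal
    this.data = "";
  }
  readAsText(size) {
    const out = this.data.slice(this.location, this.location + size);
    this.location += out.length; // reads advance the cursor
    return out;
  }
  write(text) {
    if (this.mode === "readonly") throw new Error("read-only lock");
    // Overwrite in place at the cursor, then advance it.
    this.data = this.data.slice(0, this.location) + text +
                this.data.slice(this.location + text.length);
    this.location += text.length;
  }
  append(text) {
    if (this.mode === "readonly") throw new Error("read-only lock");
    this.data += text; // append ignores the cursor
  }
}

const lf = new ToyLockedFile("readwrite");
lf.write("hello ");
lf.append("world");
lf.location = 0;
console.log(lf.readAsText(11)); // hello world
```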
>> To address this we can introduce the following methods on FileHandle:
>>
>> FileRequest fastReadAsArrayBuffer(long size);
>> FileRequest fastReadAsText(long size, [optional] DOMString encoding);
>> FileRequest fastWrite(DOMString or ArrayBuffer or Blob value);
>> FileRequest fastAppend(DOMString or ArrayBuffer or Blob value);
>>
>> These would behave exactly the same way as the methods on LockedFile,
>> but would not use a lock and so would permit the next file operation
>> immediately, with no need to wait for the progress events to finish
>> firing.
>>
>>> That assumes that write access always grants read access, which need
>>> not necessarily be true. The FileSaver API was designed assuming the
>>> opposite. Of course, nobody's implemented that yet, but that's
>>> another thing we need to talk about, and another thread.
>>
>> We're planning on implementing FileSaver, but it would be an API
>> separate from FileHandle, which means that this wouldn't be a problem.
>>
>> In general I think it's OK to grant "add but not read" access, but
>> having "write but not read" runs the risk of code inadvertently
>> destroying data due to not expecting read operations to fail.
>
> Append-but-not-read and overwrite-completely-but-not-read both make
> sense. General write-but-not-read probably doesn't. I'm not sure
> what situation you're describing w.r.t. data destruction.
>
> Regarding the queue-lock problem: if you're done with the lock and
> just want to do some unlocked reads, just create a standard FileReader
> and fire it off. I'm a little confused about fastWrite, though--it
> sounds like you're suggesting we have a write method that writes to a
> locked file even if someone else holds the lock. Should we call it
> writeBulletToFoot?
>
> Eric
Received on Monday, 19 March 2012 22:11:29 UTC