[whatwg] RFC: Alternatives to storage mutex for cookies and localStorage

I'd like to propose that HTML5 specify schemes other than a 
conceptual global storage mutex to provide consistency guarantees for 
localStorage and cookies.

Cookies would be protected according to Benjamin Smedberg's post in the 
"[whatwg] Storage mutex and cookies can lead to browser deadlock" 
thread.  Roughly, this proposal would give scripts a consistent view of 
document.cookie until they completed.  AIUI this is stronger consistency 
than Google Chrome provides today, and anecdotal evidence suggests even 
their weaker consistency hasn't "broken the web."

localStorage would be changed in a non-backwards-compatible way.  I 
believe that web apps can be partitioned into two classes: those that 
have planned for running concurrently (single-event-loop or not) in 
multiple "browsing contexts", and those that haven't.  I further posit 
that the second class would break when run concurrently in multiple 
contexts regardless of multiple event loops, and thus regardless of the 
storage mutex.  Even in the single-event-loop world, sites not prepared 
to be loaded in multiple tabs can stomp each other's data even though 
script execution is atomic.  (I wouldn't dare use my bank's website in 
two tabs at the same time in a single-event-loop browser.)  In other 
words, storage mutex can't help the second class of sites.

(I also believe that there's a very large, third class of pages that 
work "accidentally" when run concurrently in multiple contexts, even 
though they don't plan for that.  This is likely because they don't keep 
quasi-persistent data on the client side.)

Based on that, I believe localStorage should be designed with the first 
class of web apps (those that have considered data consistency across 
multiple concurrent contexts) in mind, rather than the second class.  Is 
a conceptual global storage mutex the best way for, say, gmail to 
guarantee consistency of its e-mail/contacts database?  I don't believe 
so: I think that a transactional localStorage is preferable. 
Transactional localStorage is easier for browser vendors to implement 
and should result in better performance for web apps in multi-process 
UAs.  It's more of a burden on web app authors than the hidden storage 
mutex, but I think the benefits outweigh the cost.

I propose adding the functions

   window.localStorage.beginTransaction()
   window.localStorage.commitTransaction()
or
   window.beginTransaction()
   window.commitTransaction()

(The latter might be preferable if we later decide to add more resources 
with transactional semantics.)

localStorage.getItem(), .setItem(), .removeItem(), and .clear() would 
remain specified as they are today.  beginTransaction() would do just 
that, open a transaction.  Calling localStorage.*() outside of an open 
transaction would cause a script exception to be thrown; this would 
unfortunately break all current clients of localStorage.  There might be 
clever ways to mitigate this breakage, such as a UA pretending not to 
support localStorage until a script called beginTransaction().
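
As a rough illustration of what this would mean for script authors, here 
is a minimal sketch using an in-memory stand-in.  The TransactionalStorage 
class and its error message are hypothetical, invented only for this 
example; nesting and commit-without-begin checks are left out.

```javascript
// Hypothetical in-memory stand-in for the proposed API; not a real
// browser interface.
class TransactionalStorage {
  constructor() {
    this.data = new Map(); // stored key/value pairs
    this.open = false;     // true between begin and commit
  }
  beginTransaction() { this.open = true; }
  commitTransaction() { this.open = false; }
  requireOpen() {
    if (!this.open) {
      throw new Error("localStorage used outside a transaction");
    }
  }
  getItem(key) {
    this.requireOpen();
    return this.data.has(key) ? this.data.get(key) : null;
  }
  setItem(key, value) {
    this.requireOpen();
    this.data.set(key, String(value));
  }
}

const storage = new TransactionalStorage();

// Existing-style access, with no open transaction, throws.
let legacyFailed = false;
try {
  storage.setItem("draft", "hello");
} catch (e) {
  legacyFailed = true;
}

// Updated code brackets its accesses in a transaction.
storage.beginTransaction();
storage.setItem("draft", "hello");
const draft = storage.getItem("draft");
storage.commitTransaction();
```

Existing callers would hit the first path, which is the breakage described 
above.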

yieldForStorageUpdates() would no longer be meaningful and should be 
removed.

A transaction would successfully "commit", atomically applying its 
modifications to localStorage, if localStorage was not modified between 
beginTransaction() and commitTransaction().  Note that a transaction 
consisting entirely of getItem() calls could fail just like one that 
actually modifies localStorage.  If a transaction failed, the UA would 
throw a TransactionFailed exception to script.  The UA would be allowed 
to throw this exception at any time between beginTransaction() and 
commitTransaction().
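
One way to picture these commit rules is a version check at commit time. 
The sketch below is only a model under that assumption: SharedStore, 
StorageHandle, and withTransaction are hypothetical names, the version 
counter is one possible implementation strategy among many, and only the 
TransactionFailed name comes from the proposal.

```javascript
// Hypothetical model of the proposed semantics; not a browser API.
class TransactionFailed extends Error {}

// Stands in for one origin's localStorage database.
class SharedStore {
  constructor() {
    this.data = new Map();
    this.version = 0; // bumped by every commit that writes something
  }
}

// One handle per browsing context.
class StorageHandle {
  constructor(store) {
    this.store = store;
    this.startVersion = null; // store version seen at beginTransaction()
    this.pending = null;      // writes buffered until commit
  }
  requireOpen() {
    if (this.pending === null) throw new Error("no open transaction");
  }
  beginTransaction() {
    this.startVersion = this.store.version;
    this.pending = new Map();
  }
  getItem(key) {
    this.requireOpen();
    if (this.pending.has(key)) return this.pending.get(key);
    return this.store.data.has(key) ? this.store.data.get(key) : null;
  }
  setItem(key, value) {
    this.requireOpen();
    this.pending.set(key, String(value));
  }
  commitTransaction() {
    this.requireOpen();
    const stale = this.store.version !== this.startVersion;
    const writes = this.pending;
    this.startVersion = null;
    this.pending = null;
    if (stale) throw new TransactionFailed("localStorage was modified");
    for (const [k, v] of writes) this.store.data.set(k, v);
    if (writes.size > 0) this.store.version += 1;
  }
}

// Retry helper a web app might write around the proposed calls.
function withTransaction(handle, body, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    handle.beginTransaction();
    try {
      const result = body(handle);
      handle.commitTransaction();
      return result;
    } catch (e) {
      if (!(e instanceof TransactionFailed) || attempt === maxAttempts) throw e;
    }
  }
}

const store = new SharedStore();
const tabA = new StorageHandle(store);
const tabB = new StorageHandle(store);

tabA.beginTransaction();
tabA.getItem("counter");      // tab A only reads

tabB.beginTransaction();
tabB.setItem("counter", "1");
tabB.commitTransaction();     // succeeds: nothing changed since B began

let readOnlyFailed = false;
try {
  tabA.commitTransaction();   // fails although tab A never wrote
} catch (e) {
  readOnlyFailed = e instanceof TransactionFailed;
}

const total = withTransaction(tabA, (s) => {
  const next = Number(s.getItem("counter") || "0") + 1;
  s.setItem("counter", next);
  return next;
});
```

In this model failure is only detected at commit; a UA following the 
proposal could equally throw earlier, for instance from getItem().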

There are numerous ways to implement transactional semantics. 
Single-event-loop UAs could implement beginTransaction() and 
commitTransaction() as no-ops.  Multi-event-loop UAs could reuse the 
global storage mutex if they had already implemented that 
(beginTransaction() == lock, commitTransaction() == unlock).

Some edge cases:

  * calling commitTransaction() without beginTransaction() would throw 
an exception

  * transactions would not be allowed to be nested, even on different 
localStorage DBs.  E.g. if site A's script begins a transaction on 
A.localStorage, and calls into site B's script embedded in an iframe 
which begins a transaction on B.localStorage, an exception would be thrown.

  * transactions *could* be spread across script executions, alert() 
dialogs, sync XHR, or anywhere else the current HTML5 spec requires the 
storage mutex be released.  Note that UAs wishing to forbid that 
behavior could simply throw a TransactionFailed exception where the 
storage mutex would have been released in the current spec.  Or this 
could be made illegal by the spec.

  * it's not clear to me how to handle async XHRs and Worker messages 
sent from within a failed transaction.  They could be specified to be 
sent or not, and either behavior could be implemented easily.  My gut 
tells me that they *should* be sent regardless.
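
The first two edge cases above could be modelled roughly as follows.  This 
is a sketch under assumptions: OriginStorage, scriptState, and 
TransactionStateError are hypothetical names (the proposal does not name 
the exception for these cases), and a single shared flag is just one way a 
UA might reject nesting across different localStorage DBs.

```javascript
// Hypothetical model; class and error names are illustrative only.
class TransactionStateError extends Error {}

// One flag shared by every storage area the running script can reach,
// so nesting is rejected even across different sites' databases.
const scriptState = { openStorage: null };

class OriginStorage {
  constructor(origin) {
    this.origin = origin;
  }
  beginTransaction() {
    if (scriptState.openStorage !== null) {
      throw new TransactionStateError(
        "transaction already open on " + scriptState.openStorage.origin);
    }
    scriptState.openStorage = this;
  }
  commitTransaction() {
    if (scriptState.openStorage !== this) {
      throw new TransactionStateError(
        "commitTransaction() without a matching beginTransaction()");
    }
    scriptState.openStorage = null;
  }
}

// Returns true only if fn throws a TransactionStateError.
function throwsStateError(fn) {
  try {
    fn();
    return false;
  } catch (e) {
    return e instanceof TransactionStateError;
  }
}

const siteA = new OriginStorage("a.example");
const siteB = new OriginStorage("b.example");

// Commit with no transaction open.
const commitWithoutBegin = throwsStateError(() => siteA.commitTransaction());

// Site A opens a transaction, then calls into site B's script.
siteA.beginTransaction();
const nestedAcrossSites = throwsStateError(() => siteB.beginTransaction());
siteA.commitTransaction();

// Once A has committed, B may open its own transaction.
const beginAfterCommit = !throwsStateError(() => siteB.beginTransaction());
siteB.commitTransaction();
```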

Feedback very much desired.

Cheers,
Chris

Addendum: I think that a past argument against a transactional approach 
was that scripts can cause side effects during transactions that can't 
be (easily, performantly) rolled back.  This is true, and troubling in 
that it deviates from SQL semantics, but because this proposal is 
designed for the first class of web apps I don't believe it's a 
compelling argument.  Further, by mishandling failed transactions a 
script can corrupt only its own browsing-context-local state.  Using 
gmail as a convenient example, if a transaction failed but gmail wasn't 
prepared to handle the failure, that particular gmail instance would 
just break.  No e-mails or contacts would be corrupted, and the user 
could reload gmail and regain full functionality.  Servers should 
already be prepared to deal with clients behaving unpredictably.

Received on Friday, 4 September 2009 00:02:45 UTC