[whatwg] RFC: Alternatives to storage mutex for cookies and localStorage from Jeremy Orlow on 2009-09-04 (public-whatwg-archive@w3.org from September 2009)

From: Jeremy Orlow <jorlow@chromium.org>
Date: Fri, 4 Sep 2009 16:32:19 +0900
Message-ID: <5dd9e5c50909040032i55f965b2jd437fe0e4128aace@mail.gmail.com>
On Fri, Sep 4, 2009 at 4:02 PM, Chris Jones <cjones at mozilla.com> wrote:

> I'd like to propose that HTML5 specify different schemes than a conceptual
> global storage mutex to provide consistency guarantees for localStorage and
> cookies.
>
> Cookies would be protected according to Benjamin Smedberg's post in the
> "[whatwg] Storage mutex and cookies can lead to browser deadlock" thread.
>  Roughly, this proposal would give scripts a consistent view of
> document.cookie until they completed.  AIUI this is stronger consistency
> than Google Chrome provides today, and anecdotal evidence suggests even
> their weaker consistency hasn't "broken the web."
>

To be fair, IE is in the same boat...which makes this argument even
stronger, I think.


> localStorage would be changed in a non-backwards-compatible way.  I believe
> that web apps can be partitioned into two classes: those that have planned
> for running concurrently (single-event-loop or not) in multiple "browsing
> contexts", and those that haven't.  I further posit that the second class
> would break when run concurrently in multiple contexts regardless of
> multiple event loops, and thus regardless of the storage mutex.  Even in the
> single-event-loop world, sites not prepared to be loaded in multiple tabs
> can stomp each other's data even though script execution is atomic.  (I
> wouldn't dare use my bank's website in two tabs at the same time in a
> single-event-loop browser.)  In other words, storage mutex can't help the
> second class of sites.
>
> (I also believe that there's a very large, third class of pages that work
> "accidentally" when run concurrently in multiple contexts, even though they
> don't plan for that.  This is likely because they don't keep
> quasi-persistent data on the client side.)
>
> Based on that, I believe localStorage should be designed with the first
> class of web apps (those that have considered data consistency across
> multiple concurrent contexts) in mind, rather than the second class.  Is a
> conceptual global storage mutex the best way for, say, gmail to guarantee
> consistency of its e-mail/contacts database?  I don't believe so: I think
> that a transactional localStorage is preferable. Transactional localStorage
> is easier for browser vendors to implement and should result in better
> performance for web apps in multi-process UAs.  It's more of a burden on web
> app authors than the hidden storage mutex, but I think the benefits outweigh
> the cost.
>
> I propose adding the functions
>
>  window.localStorage.beginTransaction()
>  window.localStorage.commitTransaction()
> or
>  window.beginTransaction()
>  window.commitTransaction()
>
> (The latter might be preferable if we later decide to add more resources
> with transactional semantics.)
>
> localStorage.getItem(), .setItem(), .removeItem(), and .clear() would
> remain specified as they are today.  beginTransaction() would do just that,
> open a transaction.  Calling localStorage.*() outside of an open transaction
> would cause a script exception to be thrown; this would unfortunately break
> all current clients of localStorage.  There might be cleverer ways to
> mitigate this breakage by a UA pretending not to support localStorage until
> a script called beginTransaction().
>
> yieldForStorageUpdates() would no longer be meaningful and should be
> removed.
>
> A transaction would successfully "commit", atomically applying its
> modifications to localStorage, if localStorage was not modified between
> beginTransaction() and commitTransaction().  Note that a transaction
> consisting entirely of getItem() could fail just as those actually modifying
> localStorage.  If a transaction failed, the UA would throw a
> TransactionFailed exception to script.  The UA would be allowed to throw
> this exception at any time between beginTransaction() and
> commitTransaction().
>
> There are numerous ways to implement transactional semantics.
> Single-event-loop UAs could implement beginTransaction() and
> commitTransaction() as no-ops.  Multi-event-loop UAs could reuse the global
> storage mutex if they had already implemented that (beginTransaction() ==
> lock, commitTransaction() == unlock).
>
> Some edge cases:
>
>  * calling commitTransaction() without beginTransaction() would throw an
> exception
>
>  * transactions would not be allowed to be nested, even on different
> localStorage DBs.  E.g. if site A's script begins a transaction on
> A.localStorage, and calls into site B's script embedded in an iframe which
> begins a transaction on B.localStorage, an exception would be thrown.
>
>  * transactions *could* be spread across script executions, alert()
> dialogs, sync XHR, or anywhere else the current HTML5 spec requires the
> storage mutex be released.  Note that UAs wishing to forbid that behavior
> could simply throw a TransactionFailed exception where the storage mutex
> would have been released in the current spec.  Or this could be made illegal
> by the spec.
>
>  * it's not clear to me how to handle async XHRs and Worker messages sent
> from within a failed transaction.  They could be specified to be sent or not
> and either behavior implemented easily.  My gut tells me that they *should*
> be sent regardless.
>
> Feedback very much desired.
>
> Cheers,
> Chris
>
> Addendum: I think that a past argument against a transactional approach was
> that scripts can cause side effects during transactions that can't be
> (easily, performantly) rolled back.  This is true, and troubling in that it
> deviates from SQL semantics, but because this proposal is designed for the
> first class of web apps I don't believe it's a compelling argument.
>  Further, a script can only corrupt its browsing-context-local state by
> mishandling failed transactions.  Using gmail as a convenient example, if a
> transaction failed but gmail wasn't prepared to handle the failure, that
> particular gmail instance would just break.  No e-mails or contacts would be
> corrupted, and the user could reload gmail and regain full functionality.
>  Servers should already be prepared to deal with clients behaving
> unpredictably.
>
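
To make the commit rule quoted above concrete, here is a rough in-memory
model of it.  Everything in it (SharedStorage, StorageContext, the version
counter) is made up for illustration and is not part of the proposal or of
any browser; it only assumes that a UA detects conflicts by noticing that
storage changed after beginTransaction().

```javascript
// Hypothetical model only; none of these names are a real browser API.
class TransactionFailed extends Error {}

// Stands in for one origin's storage, shared by several browsing contexts.
class SharedStorage {
  constructor() {
    this.data = new Map();
    this.version = 0; // bumped by every commit that writes something
  }
}

// Stands in for window.localStorage as seen from one browsing context.
class StorageContext {
  constructor(shared) {
    this.shared = shared;
    this.tx = null; // { startVersion, writes } while a transaction is open
  }
  beginTransaction() {
    if (this.tx) throw new Error("nested transactions are not allowed");
    this.tx = { startVersion: this.shared.version, writes: new Map() };
  }
  // Returns the open transaction, or fails it if storage changed underneath.
  openTx() {
    if (!this.tx) throw new Error("no open transaction");
    if (this.tx.startVersion !== this.shared.version) {
      this.tx = null;
      throw new TransactionFailed("storage modified since beginTransaction()");
    }
    return this.tx;
  }
  getItem(key) {
    const tx = this.openTx();
    if (tx.writes.has(key)) return tx.writes.get(key);
    return this.shared.data.has(key) ? this.shared.data.get(key) : null;
  }
  setItem(key, value) {
    this.openTx().writes.set(key, String(value)); // buffered until commit
  }
  commitTransaction() {
    const tx = this.openTx(); // throws if there is nothing to commit
    this.tx = null;
    for (const [k, v] of tx.writes) this.shared.data.set(k, v);
    if (tx.writes.size > 0) this.shared.version += 1;
  }
}

// Two "tabs" race; the one that commits second sees TransactionFailed.
const shared = new SharedStorage();
const tabA = new StorageContext(shared);
const tabB = new StorageContext(shared);
tabA.beginTransaction();
tabA.setItem("counter", "1");
tabB.beginTransaction();
tabB.setItem("counter", "2");
tabB.commitTransaction();
try {
  tabA.commitTransaction();
} catch (e) {
  console.log(e instanceof TransactionFailed); // true
}
```

In this model a transaction made up only of getItem() calls can fail too,
since the version check runs on every access, and calls made outside a
transaction throw, as in the proposal.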

Very interesting.  Some of the details I'm not sure about, but I think this
is much better than what already exists.  Enough better that I think it's
worth breaking backwards compatibility.

I mostly agree with your assertions about the type of developer who's using
localStorage.  It sure would be nice if we could give developers powerful
APIs, keep them simple, and still make them possible to implement in a
performant manner.  Unfortunately, I think the current design cannot be
changed to meet "possible to implement in a performant manner" without
breaking backwards compatibility.

Part of me thinks that this API should match the WebDatabase API more.  For
example, you'd call a function with a callback.  That callback would be
given the localStorage object which it'd use to do manipulations.  Etc.  But
part of me likes what you're suggesting here.  I actually think the idea of
throwing an exception whenever there's a serialization problem could be very
compelling, and could keep the door wide open for future performance
enhancements.  It's even possible that javascript engines could embed
elements of software transactional memory in the future to eliminate the
need to make such calls.  That seems really exciting.
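
As a rough idea of what handling such an exception might look like for a
page author, here is a small retry helper.  It is only a sketch: withRetry
is a made-up name, `store` stands for any object exposing the proposed
beginTransaction()/commitTransaction() pair, and it assumes a failed
transaction is already closed by the time the exception reaches script and
that the exception's name is "TransactionFailed".

```javascript
// Hypothetical helper; `store` is any object with the proposed methods.
function withRetry(store, body, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      store.beginTransaction();
      const result = body(store); // may throw TransactionFailed at any point
      store.commitTransaction();
      return result;
    } catch (err) {
      // Retry only on transaction failure; rethrow anything else, and
      // give up once the attempts are used up.
      if (err.name !== "TransactionFailed" || attempt === maxAttempts) {
        throw err;
      }
    }
  }
}
```

Note that the body has to be safe to run more than once, which is exactly
the "side effects can't be rolled back" concern from the addendum.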

It might also be possible to combine the 2 ideas: you call a function with
your callback and the callback is given a localStorage object which is only
valid within the callback, but an exception can be thrown when there's a
problem with the transaction.  Of course, the benefit to explicitly starting
and ending a transaction is that it can span setTimeouts, event handlers,
etc.  On the other hand, I wonder if the cases where an app would do this
and still be able to recover from a transaction failure would be limited.
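
To illustrate the combined shape, something along these lines is what I
have in mind.  This is a sketch under assumptions, not a proposed
signature: runTransaction, the `shared` object ({ data: Map, version:
number }) and the version check are all hypothetical stand-ins for whatever
a UA would really do.

```javascript
// Hypothetical sketch; runTransaction and TransactionFailed are made-up names.
class TransactionFailed extends Error {}

// `shared` models one origin's storage: { data: Map, version: number }.
function runTransaction(shared, callback) {
  const startVersion = shared.version;
  const writes = new Map(); // buffered until the callback returns
  let valid = true;
  const assertUsable = () => {
    if (!valid) throw new Error("storage handle used outside its callback");
  };
  // The only object through which the callback can touch storage.
  const handle = {
    getItem(key) {
      assertUsable();
      if (writes.has(key)) return writes.get(key);
      return shared.data.has(key) ? shared.data.get(key) : null;
    },
    setItem(key, value) {
      assertUsable();
      writes.set(key, String(value));
    },
  };
  try {
    callback(handle);
  } finally {
    valid = false; // the handle dies with the callback, even on error
  }
  // Commit only if nobody else committed in the meantime.
  if (shared.version !== startVersion) {
    throw new TransactionFailed("conflicting write during transaction");
  }
  for (const [k, v] of writes) shared.data.set(k, v);
  if (writes.size > 0) shared.version += 1;
}

const shared = { data: new Map(), version: 0 };
runTransaction(shared, (ls) => {
  const visits = Number(ls.getItem("visits") || 0);
  ls.setItem("visits", visits + 1);
});
console.log(shared.data.get("visits")); // "1"
```

Because the handle stops working once the callback returns, a transaction
in this shape cannot span setTimeouts or event handlers, which is the
trade-off against explicit begin/commit calls.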

Another thing we might want to consider is making transactions optional.
 This would satisfy classes 1 and 2, but would put the third class you
mentioned at more risk.  In other words, not calling beginTransaction would
not be fatal.  It would just mean localStorage works as currently spec'ed.
But using it within a transaction (be it a callback or within
begin/commitTransaction calls) would give you additional guarantees.

Note that if we do decide to break backwards compatibility, there are some
other things we should consider...but I won't bring those up unless we do
decide to move in this direction.

Btw, I want to make it clear that I take the idea of
breaking compatibility VERY seriously.  I know localStorage is fairly well
adopted and that changing this would be pretty major.  But having a
cross-event-loop, synchronous API is really a terrible idea.  And changing
it now will be easier than changing it later.
Received on Friday, 4 September 2009 00:32:19 UTC