[whatwg] RFC: Alternatives to storage mutex for cookies and localStorage from Jeremy Orlow on 2009-09-09 (public-whatwg-archive@w3.org from September 2009)

From: Jeremy Orlow <jorlow@chromium.org>
Date: Wed, 9 Sep 2009 17:14:25 +0900
Message-ID: <5dd9e5c50909090114g1c686c3cx292c4bdb2d1eacc9@mail.gmail.com>
Great analysis.  I only have a few comments/questions:

On Wed, Sep 9, 2009 at 1:41 PM, Chris Jones <cjones at mozilla.com> wrote:

> Jeremy Orlow wrote:
>
>> On Wed, Sep 9, 2009 at 4:39 AM, Chris Jones <cjones at mozilla.com> wrote:
>>
>>    Aaron Boodman wrote:
>>
>>        On Tue, Sep 8, 2009 at 11:23 AM, Chris Jones <cjones at mozilla.com>
>>        wrote:
>>
>>            In general, I agree with Rob about this proposal.  What
>>            problem with storage
>>            mutex as spec'd today does your proposal solve?
>>
>>
>>        The spec requires a single storage mutex for the entire UA.
>>        Therefore in a MELUA a web page can become unresponsive while
>>        waiting for some other page to give up the lock. This is not good
>>        and something we have tried to avoid everywhere else in the spec.
>>
>>        Attempts to address this by doing per-origin locks wind up with
>>        deadlocks being possible.
>>
>>            Aaron Boodman wrote:
>>
>>                On Tue, Sep 8, 2009 at 1:41 AM, Robert O'Callahan
>>                <robert at ocallahan.org> wrote:
>>
>>                    What are the intended semantics here? Chris' explicit
>>                    commitTransaction would throw an exception if the
>>                    transaction was aborted due to data inconsistency,
>>                    leaving it up to the script to retry --- and making it
>>                    clear to script authors that non-storage side effects
>>                    during the transaction are not undone.
>>                    How would you handle transaction aborts?
>>
>>                Calls to transaction() are queued and executed serially
>>                per-origin with exclusive access. There is no such thing
>>                as a transaction abort because there cannot be consistency
>>                problems because of the serialized access.
>>
>>            No, transactions can still fail.  They can fail in ways
>>            immediately hidden from the script that requested them if the
>>            UA has to interrupt the conceptually executing transaction in
>>            the ways enumerated in a separate branch of this thread.  Later
>>            script executions can observe inconsistent state unless more is
>>            specified by your proposal.
>>
>>            Transactions can also fail visibly if write-to-disk fails
>>            (probably also in other ways I haven't considered).  It's not
>>            clear what should happen wrt your proposal in this case.
>>
>>
>>        If so, I agree with roc's responses to them that they could
>>        probably be handled without surfacing errors to the developer.
>>
>>        OTOH, I'm not really against adding the concept of fallibility
>>        here.
>>
>>            In fact, I believe that the "Synchronous database API"
>>            describes the same transaction semantics as I proposed in the
>>            OP.  That spec adds implicit begin/commitTransaction and
>>            read-only transactions, but otherwise the semantics are the
>>            same.
>>
>>            So I'd like to amend my original proposal to be
>>
>>             Use Synchronous Web Database API transaction semantics.
>>             Except do not offer readTransaction: a transaction is
>>             implicitly a read-only transaction if only getItem() is
>>             called on localStorage from within
>>             localStorage.transaction().
>>
>>
>>        Agree. That is what I was trying to propose, too. I'm not sure
>>        where we disagree :). Is it just that my proposal has no concept
>>        of errors? I'm not against adding them, mainly I was trying to
>>        keep my proposal simple for purposes of discussion.
>>
>>
>>    Ay, there's the rub: I think the disagreement is between "mutex" vs.
>>    "transaction" semantics.  So far, I think perhaps "mutex" has been
>>    used as shorthand for "transaction."  But they aren't the same.
>>
>>    I think we all agree that a script may fail to modify localStorage
>>    in some situations (irrespective of global mutex vs. per-domain
>>    mutex). One camp, wanting "mutex" semantics, would prefer to pretend
>>    that the failures never happen and let scripts clean up the mess
>>    (partially-applied changes) if they do occur.  This is semantically
>>    broken, IMHO.
>>
>>    The second camp, wanting "transaction" semantics, explicitly
>>    acknowledge to web authors that localStorage is fallible, guarantee
>>    that modifications to localStorage are atomic, and notify scripts
>>    when modifications can't be made atomically.  This is the same
>>    approach taken by Web Database.  IMHO, this is much better
>>    semantically because (i) it gives web apps stronger guarantees; and
>>    (ii) it makes the discussion about global mutex/per-domain
>>    mutex/non-blocking an implementation issue rather than a semantic
>>    issue, as it should be.
>>
>>    Can those in the first camp explain why "mutex" semantics is better
>>    than "transaction" semantics?  And why it's desirable to have one DB
>>    spec specify "transaction" semantics (Web Database) and a second
>>    specify "mutex" semantics (localStorage)?
>>
>>
>> The way I understand it, there are 3 camps...and I think they've been
>> abusing both the word transaction and mutex.  We should probably all start
>> being more precise with our wording in this respect.  :-)
>>
>>
> I'd like to refine the above description of the design space.  I think
> there are three main design decisions: what ACID properties are guaranteed
> and at what granularity, sync and/or async API, and whether or not scripts
> can be notified when modifications to localStorage fail.
>
> In the current localStorage spec, the unit of atomicity/consistency is each
> modification (setItem()/removeItem()/clear()) of localStorage.  But the unit
> of isolation is all operations to localStorage between acquiring the storage
> mutex and releasing it.  And durability isn't specified AFAICT.  And AFAICT,
> scripts can observe some failed modifications to localStorage, but not all.
>
> In the current Web Database spec, the unit of A/C/I is each transaction,
> i.e., all executeSql() statements invoked on a Transaction object.
> Durability isn't defined, but it seems reasonable to assume that successful
> Transactions should be durable (best effort).  So a Transaction object is
> (best-effort) ACID.  Scripts *can* observe failed transactions and thus
> "rolled-back" changes.
>
> The first point on which the new proposals for localStorage in this thread
> differ is whether to guarantee ACID (best effort) at a *uniform* granularity
> or not.  All the proposals have some notion of "begin" and "end".  All of
> the proposals seem to want all operations between begin and end to be
> isolated (although some implementations in the wild do not guarantee this).
> Some choose individual operations (get/set/remove/clear) of localStorage as
> the unit of atomicity/consistency.  This allows for some modifications
> between begin and end to be applied even if all changes couldn't be applied.
> Others choose all modifications between begin and end as the unit of
> atomicity/consistency.  For this last group, "end" really means "commit",
> because begin/commit define a transaction in the sense of Web Database's
> Transaction objects.
>
> Semantically, an async vs. sync API doesn't change anything.  It does,
> however, affect the optimizations available to implementations.  An async
> callback might only be invoked by a SELUA when localStorage was loaded from
> disk into memory, so that the app could handle events in the meantime
> rather than blocking on disk.  In addition, a MELUA with a mutex
> implementation might only invoke the localStorage callback when the mutex
> could be acquired (e.g. only when a trylock() succeeded).  I'm beginning to
> be convinced that async callbacks are superior because of more flexible (and
> possibly performant) implementation options.
>
> Finally there's observable vs. unobservable "failures."  What "failure"
> means depends on the subset of ACID preserved, and at what granularity.
> Some proposals do not allow scripts to observe failures.  For any proposal
> wishing to expand the unit of atomicity/consistency beyond single
> modifications (single set/remove/clear), I believe that the proposal must
> immediately terminate web apps if all changes between begin/end could not be
> applied.  Otherwise the UA has the non-option of either exposing non-atomic
> or inconsistent changes to localStorage, or allowing side-effecty script
> statements to complete in between attempted modifications to localStorage
> that fail.  Other proposals explicitly *allow* scripts to be notified of
> failures, with the intention that a script could retry failed modifications.
> One use for such an API is a localStorage implementation with optimistic
> transactions, i.e. transactions implemented with STM-like techniques (which
> is what I had in mind with the OP).
>
> (For the latter, Rob O'Callahan proposed a very interesting "localStorage
> developer/debug mode" in which the UA would always fail a transaction at
> least once before succeeding.  This would allow authors to ensure that they
> uniformly handled failed transactions.  This could even be exposed as
> localStorage.__debug__ or somesuch rather than through UA-specific
> preferences.)
>
>
>> Those who want pessimistic transactions.  I.e. using locking so that you
>> never need to do a rollback (because it can never "fail").  This would be
>> compatible with either a sync or an async interface.
>>
>>
> By the above characterization: { uniform granularity of ACID (traditional
> transactions), async/sync unspecified, unobservable transaction failures }.
>
>> Those who want optimistic transactions.  I.e. rollback may happen.  Either we
>> need to restrict what can be done during a localStorage transaction or we
>> need to have an exception that tells the script to undo itself.  This was
>> the original proposal, AFAICT.  It would work with either a sync or an async
>> interface.
>>
>>
> { Traditional transactions, sync/async unspecified, observable transaction
> failures }.
>
> I should note that I'm now of the opinion that { traditional transactions,
> async, observable transaction failures } is the way to go.
>
>> Those who want a queue.  I.e. those who want an asynchronous callback
>> based interface and the UA will only call one callback at a time.  Perhaps
>> on a per-origin basis.  Note that this can never "fail", need to be rolled
>> back, etc.
>>
>>
> This sounds to me like { traditional transactions, async, unobservable
> transaction failures } which is the same as your first camp above except
> async only.  Or are you proposing that the unit of atomicity/consistency is
> not all operations performed in the callback; i.e., that modifications done
> in the callback can be partially applied?


It's just an implementation difference.  A queue means that the event loop
can continue processing stuff while waiting for the 'lock' (which maybe is
better described as an 'update token' or something).  If you implement it as
a lock (which you would for a synchronous interface) then the event loop is
blocked.
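
To make the distinction concrete, here is a rough sketch of the queue model.
It is an illustration only, not any UA's implementation: OriginQueue,
transaction(), runTurn() and read() are hypothetical names, and runTurn()
stands in for one turn of the UA's event loop.  The point is that
transaction() returns immediately, so the caller keeps processing events
while its callback waits for the per-origin 'update token'.

```typescript
// Hypothetical model: one in-memory store and one callback queue per origin.
type OriginStore = Record<string, string>;
type QueuedCallback = (store: OriginStore) => void;

class OriginQueue {
  private stores: Record<string, OriginStore> = {};
  private pending: Record<string, QueuedCallback[]> = {};

  // Enqueue and return at once; nothing runs here, so the caller never blocks.
  transaction(origin: string, callback: QueuedCallback): void {
    let queue = this.pending[origin];
    if (queue === undefined) {
      queue = [];
      this.pending[origin] = queue;
    }
    queue.push(callback);
  }

  // Simulates one UA turn: runs at most one queued callback per origin,
  // giving it exclusive access to that origin's store.
  runTurn(): number {
    let ran = 0;
    for (const origin of Object.keys(this.pending)) {
      const queue = this.pending[origin];
      const callback = queue === undefined ? undefined : queue.shift();
      if (callback === undefined) continue;
      let store = this.stores[origin];
      if (store === undefined) {
        store = {};
        this.stores[origin] = store;
      }
      callback(store);
      ran += 1;
    }
    return ran;
  }

  // Inspection helper for this sketch only.
  read(origin: string, key: string): string | undefined {
    const store = this.stores[origin];
    return store === undefined ? undefined : store[key];
  }
}

// Usage: two callbacks for the same origin run serially, one per turn.
const demoQueue = new OriginQueue();
demoQueue.transaction("https://a.example", (s) => { s["count"] = "1"; });
demoQueue.transaction("https://a.example", (s) => {
  s["count"] = String(Number(s["count"]) + 1);
});
demoQueue.runTurn();
demoQueue.runTurn();
```

A lock-based synchronous version would instead make the second caller wait
inside transaction() until the first had finished, which is exactly the
blocked event loop described above.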


>> I believe Aaron is in the queue camp with me.  I'm becoming more and more
>> convinced that Chromium should/will not implement the storage mutex at all
>> (even for LocalStorage) unless we can come up with a way for event loops to
>> not be blocked.  And, as far as I can tell, Async interfaces are the only
>> way to accomplish this.
>>
>
> In general, agreed.  I still believe that a sync API


The problem with a sync interface, especially if it's one that can be held
after the top level script context, is deadlock issues with WebDatabase (and
possibly others).  What's there now doesn't have this issue because you'd
never have the lock when calling the database transaction callback.


> with exposed transaction failures


You'll only have transaction failures in an optimistic transaction model,
right?  So is that what you're suggesting?
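
For reference, this is roughly what I understand by an optimistic model
with observable failures.  It is a sketch under assumptions, not the API
from the OP: OptimisticStore, PendingTxn, begin(), commit() and transact()
are made-up names, and commit() returns false where the OP's
commitTransaction would throw.  Each transaction works on a private copy
and commits only if nobody else committed in the meantime; on conflict
nothing is applied and the script retries.

```typescript
// Hypothetical model of optimistic transactions with version checking.
interface PendingTxn {
  baseVersion: number;              // store version seen at begin()
  working: Record<string, string>;  // private copy the script modifies
}

class OptimisticStore {
  private committed: Record<string, string> = {};
  private version = 0;

  begin(): PendingTxn {
    return { baseVersion: this.version, working: { ...this.committed } };
  }

  // All-or-nothing: either every change in `working` is applied, or none is.
  commit(txn: PendingTxn): boolean {
    if (txn.baseVersion !== this.version) {
      return false; // someone else committed first; the script may retry
    }
    this.committed = { ...txn.working };
    this.version += 1;
    return true;
  }

  get(key: string): string | undefined {
    return this.committed[key];
  }
}

// Retry loop; returns the number of attempts used.  Non-storage side
// effects performed by `body` are NOT undone when an attempt fails.
function transact(
  store: OptimisticStore,
  body: (working: Record<string, string>) => void,
  maxAttempts: number = 3
): number {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const txn = store.begin();
    body(txn.working);
    if (store.commit(txn)) return attempt;
  }
  throw new Error("transaction failed after " + maxAttempts + " attempts");
}
```

In a pessimistic or queue model the version check can never fail, because
only one script at a time holds the store.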


> (as I proposed in the OP) and the right implementation could do quite well.
> But I now think that an async version of that same API could perform even
> better.  In addition, that API is most flexible in terms of possible UA
> implementations.


> IOW, I think that { traditional transactions, async, observable failures }
> subsumes both { traditional transactions, sync, observable failures } (OP's
> proposal) *and* { traditional transactions, async, unobservable failures }
> (your and Aaron's proposal).
>
>
> IMHO there are two remaining questions: first, whether the "ideal"
> localStorage transactional API should allow observable transaction failures.
> I believe that it should, as this allows for the widest variety of
> efficient implementations without changing ACID (best effort) guarantees
> given to authors or significantly complicating the localStorage API.
>

What failures could there be in a pessimistic/queue model?

> Second, what is the best way to go forward with transactional localStorage
> while remaining backwards-compatible with current implementations?  One
> option would be to deprecate localStorage in favor of a future,
> transactional window.domainStorage or somesuch.
>

If we do this, we might as well just adopt something like the
WebSimpleDatabase proposal (which I still haven't gotten around to reading
yet) which seems much more powerful in many other ways.


> Another, probably better, option is Maciej's proposal, a two-headed
> localStorage.  The non-transactional localStorage would be deprecated and
> remain spec'd as today { single-modification AC/storage-mutex I/undefined D,
> sync, some observable failures }.


This is how I'd lean.


> In addition, for cases like "clear private data", UAs would be allowed to
> silently break storage-mutex isolation for apps using the non-transactional
> API.
>

I think it'd be better if they waited for the lock to be freed.
Received on Wednesday, 9 September 2009 01:14:25 UTC