[whatwg] localStorage + worker processes

On Tue, Mar 24, 2009 at 2:11 AM, Ian Hickson <ian at hixie.ch> wrote:

>
> I've updated the specs as follows:
>
>  - removed localStorage from Web Workers for now.
>
>  - extended the implicit lock mechanism that we had for storage to also
>   cover document.cookie, and made the language more explicit about how
>   it works.
>
>  - added navigator.releaseLock().
>
>
> On Fri, 20 Mar 2009, Jeremy Orlow wrote:
> >
> > Anyhow, the very first example in the spec (
> > http://dev.w3.org/html5/workers/#a-background-number-crunching-worker)
> > shows work that's being done in an infinite loop with postMessage being
> > called when each prime is found.  If you called localStorage anywhere
> > within that loop (say, to periodically save all primes found), you would
> > not be able to safely call window.localStorage in any other worker or
> > the web page.  This is because the "task that started the script" never
> > ends. And thus its 'lock' (on other scripts using local storage) will
> > never be released.
>
> I've removed localStorage from the Web Workers spec for now.
>
>
> On Fri, 20 Mar 2009, Jonas Sicking wrote:
> >
> > I do think it would be great if workers had access to some type of
> > structured storage. However I agree that the fact that both the main
> > thread and workers have synchronous access to the same storage is not
> > acceptable since that means that we're violating the
> > shared-nothing-message-passing design that makes workers not have to
> > deal with locks and other traditional multithread hazards.
>
> Agreed. The Database API seems well-suited for this, though.


Again... it's not just workers that are affected by this. Speaking as someone
who works on a multi-threaded browser, this is troubling. If it's possible to
spec features that allow script to poke at the world beyond the page
boundaries in a fashion that doesn't require locking semantics beyond
the scope of a single scriptable API call... that would be less troubling.
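
To make the distinction concrete, here is a rough sketch of the difference
between a lock held for a whole task and a lock scoped to a single call. It is
a toy, single-threaded model: the StorageMutex class, its method names, and the
"worker"/"page" task ids are all made up for illustration and are not from any
spec or browser.

```javascript
// Toy model only: a real browser would block the calling thread instead of
// returning false, and none of these names come from the spec.
class StorageMutex {
  constructor() {
    this.owner = null; // id of the task currently holding the lock
  }

  // Stands in for the implicit acquire on each storage access by `taskId`.
  tryAcquire(taskId) {
    if (this.owner === null || this.owner === taskId) {
      this.owner = taskId;
      return true;
    }
    return false; // a real implementation would block here
  }

  // Stands in for the task returning to its event loop, or an explicit release.
  release(taskId) {
    if (this.owner === taskId) this.owner = null;
  }
}

const mutex = new StorageMutex();
const log = [];

// Task-scoped locking: the worker touches storage once and keeps running.
log.push(mutex.tryAcquire("worker")); // true: worker now holds the lock
log.push(mutex.tryAcquire("page"));   // false: page would be stuck waiting

// Call-scoped locking: the lock is dropped as soon as the single call is done.
mutex.release("worker");
log.push(mutex.tryAcquire("page"));   // true: page proceeds
mutex.release("page");

console.log(log); // [ true, false, true ]
```

With call-scoped locking the second acquire never has to wait on a script that
may not return for a long time, which is the property I'm after.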


>
>
> On Fri, 20 Mar 2009, Drew Wilson wrote:
> >
> > One alternative I'd like to propose is to remove access to localStorage
> > for dedicated workers, and give SharedWorkers access to localStorage,
> > but have that storage be partitioned by the worker name (i.e. the worker
> > can access it, but it's not shared with web pages or any other workers
> > and so you don't have any synchronicity issues).
>
> That's an interesting idea, and would be relatively easy to do. Do people
> think it is worth it?


I think there's some additional low-hanging fruit too. We're toying with two
independent axes: lifetime vs accessScope.

  'sessionStorage' has sessionOnlyLifetime and tabSpecificScope
  'localStorage' has persistentLifetime and browserWideScope

In this nomenclature, the new idea could be phrased as...

  'page/workerStorage' has persistentLifetime and page/workerSpecificScope

Other slots in the matrix formed by these two axes could make sense...

  sessionLifetime + page/workerSpecificScope
  sessionLifetime + browserWideScope

persistentLifetime + tabSpecificScope doesn't make much sense since
you get a new set of tabs when starting a new session.
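
For anyone who wants to see the whole matrix at once, the sketch below just
enumerates the two axes and marks which slots already have a name in this
thread. The axis values are the ones used above; the storageMatrix() helper and
the key format are made up for illustration.

```javascript
// Axis values follow the nomenclature used in this message.
const lifetimes = ["session", "persistent"];
const scopes = ["tabSpecific", "page/workerSpecific", "browserWide"];

// Slots that already have a name in this thread.
const named = {
  "session+tabSpecific": "sessionStorage",
  "persistent+browserWide": "localStorage",
  "persistent+page/workerSpecific": "page/workerStorage (proposed)",
};

// Build every lifetime/scope combination, attaching a name where one exists.
function storageMatrix() {
  const rows = [];
  for (const lifetime of lifetimes) {
    for (const scope of scopes) {
      const key = lifetime + "+" + scope;
      rows.push({ lifetime, scope, name: key in named ? named[key] : null });
    }
  }
  return rows;
}

for (const { lifetime, scope, name } of storageMatrix()) {
  const label = name === null ? "(unnamed)" : name;
  console.log(lifetime + "Lifetime + " + scope + "Scope: " + label);
}
```

Two lifetimes by three scopes gives six slots, three of which are named so far.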


>
>
>
> On Fri, 20 Mar 2009, Aaron Boodman wrote:
> >
> > I think the best option is to make access to localstorage asynchronous
> > for workers. This reduces the amount of time a worker can hold the
> > localstore lock so that it shouldn't be a problem for normal pages. It
> > sucks to make such a simple and useful API async though.
>
> I don't think making it async helps here, since the problem isn't that it
> is synchronous, but that workers don't return quickly (by design).
>
>
> On Sat, 21 Mar 2009, Aaron Boodman wrote:
> >
> > Actually, I don't believe that it is required that the callback run
> > asynchronously. All the callback is used for is establishing the lock
> > lifetime explicitly, and we assume that this will usually make the lock
> > lifetime short. So we can block while we wait for it to become
> > available. This is just like the behavior today without workers.
>
> Nothing is to stop someone from just having a long callback, though.
>
>
> On Sat, 21 Mar 2009, Jonas Sicking wrote:
> >
> > As I understand the current API (on main window) to be defined, as soon
> > as someone accesses the .localStorage property, the implementation is
> > supposed to acquire a lock. This lock would be held on to until that
> > script returns to the event loop for that thread.
> >
> > So if javascript in another window, running in another thread or
> > process, tries to access .localStorage for the same origin, the
> > .localStorage getter would try to acquire the same lock and block until
> > the first thread releases the lock.
>
> Right.
>
>
> On Sat, 21 Mar 2009, Jonas Sicking wrote:
> >
> > The problem with synchronously grabbing the lock is that we can only
> > ever have one feature that uses synchronous locks, otherwise we'll risk
> > dead-locks.
>
> Indeed. This is a problem with the current API for localStorage in windows
> as well.
>
> I've made the spec explicitly have a single shared lock for all features
> that need locking (currently just .cookie and .localStorage).
>
>
> On Sun, 22 Mar 2009, Michael Nordman wrote:
> >
> > Given an async api, would it be possible to store values into
> > localStorage at onunload time? I expect that could be a useful time to
> > use this API.
> >
> > function onunload() {
> >   getLocalStorage(function(storage) {
> >     // Will this ever execute?
> >   });
> > }
> >
> > Locking the storage until script completion isn't really necessary in
> > many cases. Maybe we're over engineering this? Suppose immutability
> > across calls was generally not guaranteed by the existing API. And we
> > add an async getLocalStorage(callback) which does provide immutability
> > for the duration of the callback if that is desired.
>
> The problem is that people will walk into race conditions without
> realising it, and they are amongst the hardest problems to debug.
>
>
> On Sun, 22 Mar 2009, Drew Wilson wrote:
> >
> > The problem is that .length is basically useless without some kind of
> > immutability guarantees.
>
> Indeed.
>
>
> On Sun, 22 Mar 2009, Drew Wilson wrote:
> >
> > That's why I'm proposing that the most reasonable implementation is just
> > to have a simple lock like I describe above
>
> This is what I've done.
>
>
> > and then either deny access to localStorage to dedicated workers (shared
> > workers can silo the storage as I described previously), or else just
> > enforce a limit to how long workers can hold the localStorage lock (if
> > they hold it beyond some period, they get terminated just like page
> > script that doesn't re-enter the event loop).
>
> I've removed the localStorage API from workers.
>
> Terminating the script like that would be really hard to debug also --
> especially since it would end up terminating differently on different
> computers (e.g. a desktop might execute the whole initialisation code in
> the time allotted, while slower mobile devices might execute only the first
> part and the worker would be in an unstable state).
>
>
> On Mon, 23 Mar 2009, Jeremy Orlow wrote:
> >
> > One thing that hasn't been considered yet is some sort of optional hint
> > to say "I'm done" in terms of accessing localStorage.  Maybe call it
> > localStorage.checkpoint() or localStorage.commit()?
>
> Since this applies to more than just storage, I've put it on the Navigator
> object. I've called it releaseLock().
>
>
> On Sat, 21 Mar 2009, Jonas Sicking wrote:
> >
> > As a side note, if we do go with this async lock acquiring, we could add
> > an API like:
> >
> > getLockedFeatures(callback, 'localStore', 'cookie');
> >
> > This would asynchronously grab locks on multiple features and only
> > call the callback once all of them have been acquired. This would allow
> > computations across data from multiple locations guaranteed to be in
> > sync. The implementation would be responsible for grabbing the locks in
> > a consistent order to prevent deadlocks.
>
> Why would we want more than one lock? Is the potential performance gain
> worth the complexity?
>
> The problem with going with an async approach is that it means changing
> the API, which is something we can't really do for cookie (and don't
> really want to do for localStorage, since IE8 has shipped it.) We are
> going to need a synchronous locking mechanism anyway.
>
>
> On Mon, 23 Mar 2009, Robert O'Callahan wrote:
> >
> > It has to be resolved in a way that doesn't expose asynchronous cookie
> > or localStorage changes to Web developers. There is abundant evidence
> > that race conditions and synchronization are too hard for developers to
> > deal with. The spec should forbid asynchronously visible changes to
> > cookies or localStorage. In fact, it should probably simply say that all
> > script execution is serializable: always equivalent to some execution
> > you could get with a single-threaded browser that runs all scripts to
> > completion. Allowance could be made for explicit yield points if we need
> > to, e.g. alert().
>
> Generally speaking I have tried to maintain this invariant, but I have
> failed with cookies, and with localStorage in workers.
>
>
> > Some sort of implicit locking with guaranteed deadlock freedom should be
> > workable for parallel browser implementations. For example, partition
> > browser contexts into "related" subsets, where context A is related to
> > context B if a script running in context A can affect the execution of
> > an already-running script in context B. Use one lock per subset, and
> > have a script execution acquire the lock when it first touches
> > localStorage or cookies, and drop the lock when it completes (or
> > yields). Additional optimizations are possible.
>
> I've updated the spec to require the locking mechanism that was in place
> for storage for cookies as well. This still means that one window can
> block all other windows that try to use cookies, though, so I've also
> added navigator.releaseLock() which can be called to explicitly release
> the lock that is put in place.
>
> User agents that share event loops between origins can't actually have any
> more than one lock total. Consider a case where there are three windows
> from three different origins, A, B, and C, where C contains a couple of
> <iframe>s, and where A, B, and C are independent, but C shares an event
> loop with whatever content is in its iframes. (This is the situation
> Chrome and IE are in, as I understand it, with event loops being
> per-window not per-origin, and it may be required because access to the
> frames[] hierarchy is synchronous.) Now, assume A and B have both obtained
> their respective locks, and are busy doing some long script. C is free to
> run more tasks from its event loop, which could include navigating one
> iframe to a page on A and the other iframe to a page on B, meaning
> that the event loop of C is now beholden to two locks. If there is any
> manner in which to synchronously cause another origin to run script, this
> now means that C can attempt to obtain both locks; if we now imagine
> another window just like C that instead obtains the locks in the reverse
> order, we get deadlock.
>
> If it can be shown that it is not ever possible for script in one origin
> to synchronously invoke script in another origin, then I guess we could
> have per-origin locks instead of a single lock.
>
>
> On Sat, 21 Mar 2009, Jonas Sicking wrote:
> >
> > I don't think it will be a big problem. As long as we ensure that all
> > locks are per-origin, that means that an application can only starve
> > itself [using workers]. Something that it has clear incentives not to.
>
> It can starve itself and anyone that it is related to, which is a problem;
> but it would also, I'm sure, lead to pretty awful bugs that authors
> wouldn't understand how to fix. Are we sure we want to go there?
>
> --
> Ian Hickson               U+1047E                )\._.,--....,'``.    fL
> http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
> Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
>

Received on Friday, 27 March 2009 15:29:33 UTC