[whatwg] localStorage + worker processes from Michael Nordman on 2009-03-27 (public-whatwg-archive@w3.org from March 2009)

From: Michael Nordman <michaeln@google.com>
Date: Fri, 27 Mar 2009 15:32:05 -0700
Message-ID: <fa2eab050903271532y79579091h66e1882d292f3ceb@mail.gmail.com>
> sessionLifetime + tabSpecificScope doesn't make much sense since
> you get a new set of tabs when starting a new session

Sorry... make that persistentLifetime + tabScope doesn't make sense.


On Fri, Mar 27, 2009 at 3:29 PM, Michael Nordman <michaeln at google.com> wrote:

>
>
> On Tue, Mar 24, 2009 at 2:11 AM, Ian Hickson <ian at hixie.ch> wrote:
>
>>
>> I've updated the specs as follows:
>>
>>  - removed localStorage from Web Workers for now.
>>
>>  - extended the implicit lock mechanism that we had for storage to also
>>   cover document.cookie, and made the language more explicit about how
>>   it works.
>>
>>  - added navigator.releaseLock().
>>
>>
>> On Fri, 20 Mar 2009, Jeremy Orlow wrote:
>> >
>> > Anyhow, the very first example in the spec (
>> > http://dev.w3.org/html5/workers/#a-background-number-crunching-worker)
>> > shows work that's being done in an infinite loop with postMessage being
>> > called when each prime is found.  If you called localStorage anywhere
>> > within that loop (say, to periodically save all primes found), you would
>> > not be able to safely call window.localStorage in any other worker or
>> > the web page.  This is because the "task that started the script" never
>> > ends. And thus its 'lock' (on other scripts using local storage) will
>> > never be released.
>>
>> I've removed localStorage from the Web Workers spec for now.
>>
>>
>> On Fri, 20 Mar 2009, Jonas Sicking wrote:
>> >
>> > I do think it would be great if workers had access to some type of
>> > structured storage. However I agree that the fact that both the main
>> > thread and workers have synchronous access to the same storage is not
>> > acceptable since that means that we're violating the
>> > shared-nothing-message-passing design that makes workers not have to
>> > deal with locks and other traditional multithread hazards.
>>
>> Agreed. The Database API seems well-suited for this, though.
>
>
> Again... it's not just workers that are affected by this... speaking as
> someone that works on a multi-threaded browser, this is troubling. If it's
> possible to spec features that allow script to poke at the world beyond the
> page boundaries in a fashion that doesn't require locking semantics beyond
> the scope of a single scriptable API call... that would be less troubling.
>
>
>>
>>
>> On Fri, 20 Mar 2009, Drew Wilson wrote:
>> >
>> > One alternative I'd like to propose is to remove access to localStorage
>> > for dedicated workers, and give SharedWorkers access to localStorage,
>> > but have that storage be partitioned by the worker name (i.e. the worker
>> > can access it, but it's not shared with web pages or any other workers
>> > and so you don't have any synchronicity issues).
>>
>> That's an interesting idea, and would be relatively easy to do. Do people
>> think it is worth it?
>
>
> I think there's some additional low-hanging fruit too. We're toying with
> two independent axes: lifetime vs accessScope.
>
>   'sessionStorage' has sessionOnlyLifetime and tabSpecificScope
>   'localStorage' has persistentLifetime and browserWideScope
>
> In this nomenclature, the new idea could be phrased as...
>
>   'page/workerStorage' has persistentLifetime and page/workerSpecificScope
>
> Other slots in the matrix formed by these two axes could make sense...
>
>   sessionLifetime + page/workerSpecificScope
>   sessionLifetime + browserWideScope
>
> sessionLifetime + tabSpecificScope doesn't make much sense since
> you get a new set of tabs when starting a new session
>
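
The two axes above can be written out as a small lookup. This is only an illustrative sketch: the axis labels are shortened forms of the names used in this message, and describeSlot is a made-up helper, not part of any spec or browser API. The "persistent + tab" slot is marked as not meaningful, following the correction at the top of this message.

```javascript
// Illustrative only: the two axes from this message, with shortened labels.
const LIFETIMES = ["session", "persistent"];
const SCOPES = ["tab", "page/worker", "browserWide"];

// Slots that already have a name in this thread.
const NAMED = {
  "session+tab": "sessionStorage",
  "persistent+browserWide": "localStorage",
  "persistent+page/worker": "page/workerStorage (proposed)",
};

// Tabs do not survive the session, so persistent data has no tab to return to.
const NONSENSICAL = new Set(["persistent+tab"]);

// Hypothetical helper: classify one slot of the matrix.
function describeSlot(lifetime, scope) {
  const key = `${lifetime}+${scope}`;
  if (NONSENSICAL.has(key)) return "does not make sense";
  return NAMED[key] || "unnamed, but could make sense";
}

// Print every slot of the matrix.
for (const lifetime of LIFETIMES) {
  for (const scope of SCOPES) {
    console.log(`${lifetime} + ${scope}: ${describeSlot(lifetime, scope)}`);
  }
}
```
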
>
>>
>>
>>
>> On Fri, 20 Mar 2009, Aaron Boodman wrote:
>> >
>> > I think the best option is to make access to localstorage asynchronous
>> > for workers. This reduces the amount of time a worker can hold the
>> > localstore lock so that it shouldn't be a problem for normal pages. It
>> > sucks to make such a simple and useful API async though.
>>
>> I don't think making it async helps here, since the problem isn't that it
>> is synchronous, but that workers don't return quickly (by design).
>>
>>
>> On Sat, 21 Mar 2009, Aaron Boodman wrote:
>> >
>> > Actually, I don't believe that it is required that the callback run
>> > asynchronously. All the callback is used for is establishing the lock
>> > lifetime explicitly, and we assume that this will usually make the lock
>> > lifetime short. So we can block while we wait for it to become
>> > available. This is just like the behavior today without workers.
>>
>> Nothing is to stop someone from just having a long callback, though.
>>
>>
>> On Sat, 21 Mar 2009, Jonas Sicking wrote:
>> >
>> > As I understand the current API (on main window) to be defined, as soon
>> > as someone accesses the .localStorage property, the implementation is
>> > supposed to acquire a lock. This lock would be held on to until that
>> > script returns to the event loop for that thread.
>> >
>> > So if javascript in another window, running in another thread or
>> > process, tries to access .localStorage for the same origin, the
>> > .localStorage getter would try to acquire the same lock and block until
>> > the first thread releases the lock.
>>
>> Right.
>>
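
As a rough model of the behaviour described above: one lock, taken implicitly on first access and dropped when the task returns to its event loop. StorageMutex and its method names are invented for illustration. A real implementation would block the second caller rather than return false, and the explicit release stands in for what the thread calls navigator.releaseLock().

```javascript
// Toy model of the implicit storage lock. Names are hypothetical.
class StorageMutex {
  constructor() {
    this.holder = null; // id of the task currently holding the lock
  }

  // Called when a task first touches localStorage or document.cookie.
  // Returns true if the task now holds the lock, false if it would block.
  tryAcquire(taskId) {
    if (this.holder === null) this.holder = taskId;
    return this.holder === taskId;
  }

  // Called when the task returns to its event loop, or releases explicitly.
  release(taskId) {
    if (this.holder === taskId) this.holder = null;
  }
}

const mutex = new StorageMutex();
const first = mutex.tryAcquire("window-1");   // window-1 takes the lock
const second = mutex.tryAcquire("window-2");  // window-2 would have to wait
mutex.release("window-1");                    // window-1 returns to its event loop
const retry = mutex.tryAcquire("window-2");   // now window-2 can proceed
console.log(first, second, retry);
```

A worker whose task never ends never reaches the release step, which is the starvation problem raised earlier in the thread.
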
>>
>> On Sat, 21 Mar 2009, Jonas Sicking wrote:
>> >
>> > The problem with synchronously grabbing the lock is that we can only
>> > ever have one feature that uses synchronous locks, otherwise we'll risk
>> > dead-locks.
>>
>> Indeed. This is a problem with the current API for localStorage in windows
>> as well.
>>
>> I've made the spec explicitly have a single shared lock for all features
>> that need locking (currently just .cookie and .localStorage).
>>
>>
>> On Sun, 22 Mar 2009, Michael Nordman wrote:
>> >
>> > Given an async api, would it be possible to store values into
>> > localStorage at onunload time? I expect that could be a useful time to
>> > use this API.
>> >
>> > function onunload() {
>> >   getLocalStorage(function(storage) {
>> >     // Will this ever execute?
>> >   });
>> > }
>> >
>> > Locking the storage until script completion isn't really necessary in
>> > many cases. Maybe we're over engineering this? Suppose immutability
>> > across calls was generally not guaranteed by the existing API. And we
>> > add an async getLocalStorage(callback) which does provide immutability
>> > for the duration of the callback if that is desired.
>>
>> The problem is that people will walk into race conditions without
>> realising it, and they are amongst the hardest problems to debug.
>>
>>
>> On Sun, 22 Mar 2009, Drew Wilson wrote:
>> >
>> > The problem is that .length is basically useless without some kind of
>> > immutability guarantees.
>>
>> Indeed.
>>
>>
>> On Sun, 22 Mar 2009, Drew Wilson wrote:
>> >
>> > That's why I'm proposing that the most reasonable implementation is just
>> > to have a simple lock like I describe above
>>
>> This is what I've done.
>>
>>
>> > and then either deny access to localStorage to dedicated workers (shared
>> > workers can silo the storage as I described previously), or else just
>> > enforce a limit to how long workers can hold the localStorage lock (if
>> > they hold it beyond some period, they get terminated just like page
>> > script that doesn't re-enter the event loop).
>>
>> I've removed the localStorage API from workers.
>>
>> Terminating the script like that would be really hard to debug also --
>> especially since it would end up terminating differently on different
>> computers (e.g. a desktop might execute the whole initialisation code in
>> the time allotted, while slower mobile devices might execute only the first
>> part and the worker would be in an unstable state).
>>
>>
>> On Mon, 23 Mar 2009, Jeremy Orlow wrote:
>> >
>> > One thing that hasn't been considered yet is some sort of optional hint
>> > to say "I'm done" in terms of accessing localStorage.  Maybe call it
>> > localStorage.checkpoint() or localStorage.commit()?
>>
>> Since this applies to more than just storage, I've put it on the Navigator
>> object. I've called it releaseLock().
>>
>>
>> On Sat, 21 Mar 2009, Jonas Sicking wrote:
>> >
>> > As a side note, if we do go with this async lock acquiring, we could add
>> > an API like:
>> >
>> > getLockedFeatures(callback, 'localStore', 'cookie');
>> >
>> > This would asynchronously grab locks to multiple features and only
>> > call the callback once all of them have been acquired. This would allow
>> > computations across data from multiple locations guaranteed to be in
>> > sync. The implementation would be responsible for grabbing the locks in
>> > a consistent order to prevent deadlocks.
>>
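
The "consistent order" part of the getLockedFeatures() idea above can be sketched in a few lines. acquisitionOrder is a hypothetical helper, and sorting by feature name is just one possible canonical order; the proposal itself does not say which order an implementation would use.

```javascript
// Hypothetical helper: whatever order the caller lists features in,
// locks are always taken in one canonical (here: alphabetical) order.
function acquisitionOrder(features) {
  return [...new Set(features)].sort();
}

// Two callers naming the same features in opposite orders
// end up taking the locks in the same sequence.
const callerOne = acquisitionOrder(["localStore", "cookie"]);
const callerTwo = acquisitionOrder(["cookie", "localStore"]);
console.log(callerOne, callerTwo);
```
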
>> Why would we want more than one lock? Is the potential performance gain
>> worth the complexity?
>>
>> The problem with going with an async approach is that it means changing
>> the API, which is something we can't really do for cookie (and don't
>> really want to do for localStorage, since IE8 has shipped it.) We are
>> going to need a synchronous locking mechanism anyway.
>>
>>
>> On Mon, 23 Mar 2009, Robert O'Callahan wrote:
>> >
>> > It has to be resolved in a way that doesn't expose asynchronous cookie
>> > or localStorage changes to Web developers. There is abundant evidence
>> > that race conditions and synchronization are too hard for developers to
>> > deal with. The spec should forbid asynchronously visible changes to
>> > cookies or localStorage. In fact, it should probably simply say that all
>> > script execution is serializable: always equivalent to some execution
>> > you could get with a single-threaded browser that runs all scripts to
>> > completion. Allowance could be made for explicit yield points if we need
>> > to, e.g. alert().
>>
>> Generally speaking I have tried to maintain this invariant, but I have
>> failed with cookies, and with localStorage in workers.
>>
>>
>> > Some sort of implicit locking with guaranteed deadlock freedom should be
>> > workable for parallel browser implementations. For example, partition
>> > browser contexts into "related" subsets, where context A is related to
>> > context B if a script running in context A can affect the execution of
>> > an already-running script in context B. Use one lock per subset, and
>> > have a script execution acquire the lock when it first touches
>> > localStorage or cookies, and drop the lock when it completes (or
>> > yields). Additional optimizations are possible.
>>
>> I've updated the spec to require the locking mechanism that was in place
>> for storage for cookies as well. This still means that one window can
>> block all other windows that try to use cookies, though, so I've also
>> added navigator.releaseLock() which can be called to explicitly release
>> the lock that is put in place.
>>
>> User agents that share event loops between origins can't actually have any
>> more than one lock total. Consider a case where there are three windows
>> from three different origins, A, B, and C, where C contains a couple of
>> <iframe>s, and where A, B, and C are independent, but C shares an event
>> loop with whatever content is in its iframes. (This is the situation
>> Chrome and IE are in, as I understand it, with event loops being
>> per-window not per-origin, and it may be required because access to the
>> frames[] hierarchy is synchronous.) Now, assume A and B have both obtained
>> their respective locks, and are busy doing some long script. C is free to
>> run more tasks from its event loop, which could include navigating one
>> iframe to a page on A and the other iframe to a page on B, meaning
>> that the event loop of C is now beholden to two locks. If there is any
>> manner in which to synchronously cause another origin to run script, this
>> now means that C can attempt to obtain both locks; if we now imagine
>> another window just like C that instead obtains the locks in the reverse
>> order, we get deadlock.
>>
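
The reverse-order scenario above is a cycle in a wait-for graph. A minimal sketch, with made-up names (hasDeadlock, the "C1"/"C2" windows, the lock labels) chosen only for illustration:

```javascript
// holds: lock name -> window that currently owns it
// waits: window -> lock name it is blocked on
// Returns true if some chain of waiting windows loops back on itself.
function hasDeadlock(holds, waits) {
  for (const start of Object.keys(waits)) {
    const seen = new Set();
    let current = start;
    while (current !== undefined && waits[current] !== undefined) {
      if (seen.has(current)) return true; // came back to a waiting window
      seen.add(current);
      current = holds[waits[current]];    // owner of the lock being waited on
    }
  }
  return false;
}

// Per-origin locks, two C-like windows taking them in opposite orders.
const perOrigin = hasDeadlock(
  { lockA: "C1", lockB: "C2" },
  { C1: "lockB", C2: "lockA" }
);

// A single shared lock: C2 simply waits for C1 to finish.
const singleLock = hasDeadlock({ shared: "C1" }, { C2: "shared" });

console.log(perOrigin, singleLock);
```

With only one lock there is no second edge to close the cycle, which is the argument for a single shared lock when event loops are shared between origins.
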
>> If it can be shown that it is not ever possible for script in one origin
>> to synchronously invoke script in another origin, then I guess we could
>> have per-origin locks instead of a single lock.
>>
>>
>> On Sat, 21 Mar 2009, Jonas Sicking wrote:
>> >
>> > I don't think it will be a big problem. As long as we ensure that all
>> > locks are per-origin, that means that an application can only starve
>> > itself [using workers]. Something that it has clear incentives not to do.
>>
>> It can starve itself and anyone that it is related to, which is a problem;
>> but it would also, I'm sure, lead to pretty awful bugs that authors
>> wouldn't understand how to fix. Are we sure we want to go there?
>>
>> --
>> Ian Hickson               U+1047E                )\._.,--....,'``.    fL
>> http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
>> Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
>>
>
>
Received on Friday, 27 March 2009 15:32:05 UTC