[whatwg] Structured clone algorithm on LocalStorage

On Wed, Sep 23, 2009 at 10:19 PM, Darin Fisher <darin at chromium.org> wrote:
>
>
> On Wed, Sep 23, 2009 at 8:10 PM, Jonas Sicking <jonas at sicking.cc> wrote:
>>
>> On Wed, Sep 23, 2009 at 3:29 PM, Jeremy Orlow <jorlow at chromium.org> wrote:
>> > On Wed, Sep 23, 2009 at 3:15 PM, Jonas Sicking <jonas at sicking.cc> wrote:
>> >>
>> >> On Wed, Sep 23, 2009 at 2:53 PM, Brett Cannon <brett at python.org> wrote:
>> >> > On Wed, Sep 23, 2009 at 13:35, Jeremy Orlow <jorlow at chromium.org>
>> >> > wrote:
>> >> >> What are the use cases for wanting to store data beyond strings (and
>> >> >> what
>> >> >> can be serialized into strings) in LocalStorage? I can't think of
>> >> >> any
>> >> >> that
>> >> >> outweigh the negatives:
>> >> >> 1) From previous threads, I think it's fair to say that we can all
>> >> >> agree
>> >> >> that LocalStorage is a regrettable API (mainly due to its
>> >> >> synchronous
>> >> >> nature). If so, it seems that making it more powerful and thus more
>> >> >> attractive to developers is just asking for trouble. After all, the
>> >> >> more
>> >> >> people use it, the more lock contention there'll be, and the more
>> >> >> browser UI
>> >> >> jank users will be sure to experience. This will also be worse
>> >> >> because
>> >> >> it'll be easier for developers to store large objects in
>> >> >> LocalStorage.
>> >> >> 2) As far as I can tell, there's nowhere else in the spec where
>> >> >> you
>> >> >> have
>> >> >> to serialize structured clone(able) data to disk. Given that
>> >> >> LocalStorage
>> >> >> is supposed to throw an exception if any ImageData is contained and
>> >> >> since
>> >> >> File and FileData objects are legal, it seems as though making
>> >> >> LocalStorage
>> >> >> handle structured clone data has a fairly high cost to implementors.
>> >> >> Not to
>> >> >> mention that disallowing ImageData in only this one case is not
>> >> >> intuitive.
>> >> >> I think allowing structured clone(able) data in LocalStorage is a
>> >> >> big
>> >> >> mistake. Enough so that, if SessionStorage and LocalStorage can't
>> >> >> diverge
>> >> >> on this issue, it'd be worth taking the power away from
>> >> >> SessionStorage.
>> >> >> J
>> >> >
>> >> > Speaking from experience, I have been using localStorage in my PhD
>> >> > thesis work w/o any real need for structured clones (I would have
>> >> > used
>> >> > Web Database but it isn't widely used yet and I was not sure if it
>> >> > was
>> >> > going to make the cut in the end). All it took to come close to
>> >> > simulating structured clones now was to develop my own compatibility
>> >> > wrapper for localStorage (http://realstorage.googlecode.com for those
>> >> > who care) and add setJSONObject() and getJSONObject() methods on the
>> >> > wrapper. Works w/o issue.
>> >>
>> >> Actually, this seems like a prime reason *to* add structured storage
>> >> support. Obviously string data wasn't enough for you so you had to
>> >> write extra code in order to work around that. If structured clones
>> >> had been natively supported you both would have had to write less
>> >> code, and the resulting algorithms would have been faster. Faster
>> >> since the browser can serialize/parse to/from a binary internal
>> >> format faster than to/from JSON through the JSON serializer/parser.
>> >
>> > Yes, but since LocalStorage is already widely deployed, authors are
>> > stuck
>> > with the structured clone-less version of LocalStorage for a very
>> > long
>> > time. So the only way an app can store anything that can't be JSONified
>> > is
>> > to break backwards compatibility.
>> >
>> >
>> >
>> > On Wed, Sep 23, 2009 at 3:11 PM, Jonas Sicking <jonas at sicking.cc> wrote:
>> >>
>> >> On Wed, Sep 23, 2009 at 1:35 PM, Jeremy Orlow <jorlow at chromium.org>
>> >> wrote:
>> >> > What are the use cases for wanting to store data beyond strings (and
>> >> > what
>> >> > can be serialized into strings) in LocalStorage? I can't think of
>> >> > any
>> >> > that
>> >> > outweigh the negatives:
>> >> > 1) From previous threads, I think it's fair to say that we can all
>> >> > agree
>> >> > that LocalStorage is a regrettable API (mainly due to its synchronous
>> >> > nature). If so, it seems that making it more powerful and thus more
>> >> > attractive to developers is just asking for trouble. After all, the
>> >> > more
>> >> > people use it, the more lock contention there'll be, and the more
>> >> > browser UI
>> >> > jank users will be sure to experience. This will also be worse
>> >> > because
>> >> > it'll be easier for developers to store large objects in
>> >> > LocalStorage.
>> >> > 2) As far as I can tell, there's nowhere else in the spec where you
>> >> > have
>> >> > to serialize structured clone(able) data to disk. Given that
>> >> > LocalStorage
>> >> > is supposed to throw an exception if any ImageData is contained and
>> >> > since
>> >> > File and FileData objects are legal, it seems as though making
>> >> > LocalStorage
>> >> > handle structured clone data has a fairly high cost to implementors.
>> >> > Not to
>> >> > mention that disallowing ImageData in only this one case is not
>> >> > intuitive.
>> >> > I think allowing structured clone(able) data in LocalStorage is a big
>> >> > mistake. Enough so that, if SessionStorage and LocalStorage can't
>> >> > diverge
>> >> > on this issue, it'd be worth taking the power away from
>> >> > SessionStorage.
>> >>
>> >> Despite localStorage's unfortunate lock contention problem, it's
>> >> become quite a popular API. It's also very successful in terms of
>> >> browser deployment since it's available in at least the latest versions of
>> >> IE, Safari, Firefox, and Chrome. Don't know about support in Opera?
>> >
>> > The more popular it becomes, the more it's going to hurt UA developers,
>> > web
>> > developers, and users. I don't see why this is an argument for making
>> > it
>> > more powerful.
>>
>> How will it hurt UA developers? I think we're stuck forever to
>> implement the locking mechanism. Adding more datatypes to the API
>> doesn't mean that we'll have to implement it more.
>
>
> multi-core is the future. what's the opposite of fine-grained locking?
> it's not good ;-)
> the implicit locking mechanism as spec'd is super lame. implicitly
> unlocking under
> mysterious-to-the-developer circumstances! how can that be a good thing?
> storage.setItem("y",
> function_involving_implicit_unlocking(storage.getItem("x")));

I totally agree on all points. The current API has big imperfections.
However I haven't seen any workable counter proposals so far, and I
honestly don't believe there are any as long as our goals are:

* Don't break existing users of the current implementations.
* Don't expose race conditions to the web.
* Don't rely on authors getting explicit locking mechanisms right.

But, as imperfect as the current API is, I think the following is a
decent way forward:

* Allow pages that want the convenience of localStorage to use it. For
multi-process browsers this will mean poor UI *for pages that use
localStorage*. Especially when said pages hold on to localStorage for
a long time.
* Add alternative APIs that don't suffer from the same problems. More below.

>> > In addition, this argument assumes that Microsoft (and other UAs) will
>> > implement the structured clone version of LocalStorage. Has anyone (or
>> > can
>> > anyone) from Microsoft comment on this?
>>
>> Given that I've never heard Microsoft commit to a web standard, ever, I
>> doubt that we'll hear anything here. Or that the lack of hearing
>> anything means we can draw any conclusions.
>>
>> > This is not a small feature to add. Yes, it's smaller than creating a
>> > new
>> > storage mechanism (that everyone is willing to adopt), but I still think
>> > that's what we should be looking at. Rather than polishing a turd.
>>
>> I do think that localStorage is a decent API that developers will want
>> to, and should, use. I think we should look into adding an async accessor to
>> get a storage object so that people can use a localStorage-like API
>> while avoiding risks of blocking. This would also allow sharing data
>> between worker threads and the main window.
>
> i think the async callback to get a storage object is an improvement, but
> i'm not sure that it addresses all of the problems. for example, if a
> worker
> wants to read values from storage, compute, and then put a value into
> storage, it would probably do all of this from the storage callback. that
> would result in holding the lock for a long time, which would lock out any
> other threads, including non-worker threads.
> the problem here is that localStorage is a pile of global variables. we are
> trying to give people global variables without giving them tools to
> synchronize
> access to them. the claim i've heard is that developers are not savvy enough
> to use those tools properly. i agree that developers tend to use tools
> without
> fully understanding them. ok, but then why are we giving them global
> variables?
> there has to be a better answer.

I actually described a potential solution in the thread on worker storage.

The problem you describe is a worker holding on to the storage for a
very long (indefinite) time, thereby locking out other threads/windows
from accessing the same storage area. This seems inevitable if we want
to prevent race conditions while at the same time not forcing the
complexities of locks onto web developers. The WebDatabase API suffers
from exactly the same problem.

However, we can lessen the problem. By adding multiple storage areas,
we can allow a worker to use one storage area, while allowing other
parties to simultaneously use other storage areas. This way, if a
worker and a window aren't sharing data at all, they never get in the
way of each other.

So a very simplistic design would be something like the following:

getStorageArea(name, callback)

when called will asynchronously call the callback parameter once the
storage area named by the first parameter becomes available. The
callback receives the storage area as an argument. We would also have
the function

getMultipleStorageAreas(names, callback)

Same as above, but names is an array of strings indicating multiple
storage areas that need to be acquired before the callback is called.
The callback receives all the areas in an array as an argument. This
function allows transferring data between multiple storage areas
without risking racing.

There are several problems with this: the names are sort of crappy,
and getting storage areas as an array isn't very friendly. However,
you get the basic idea.
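
To make the two functions above a little more concrete, here is a rough
single-process sketch in plain JavaScript. It is an illustration only, not
a proposed implementation: the in-memory Map standing in for a storage
area, the helper areaFor(), and the rule that an area stays held until the
callback's returned promise settles are all assumptions made for the
sketch. A real UA would have to coordinate across processes and persist
the data.

```javascript
// Hypothetical sketch: these functions are not part of any spec or browser.
// Each named area keeps a "tail" promise that settles when its current and
// queued holders are done, so requests for the same area run in order.
const areas = new Map(); // name -> { data: Map, tail: Promise }

function areaFor(name) {
  if (!areas.has(name)) {
    areas.set(name, { data: new Map(), tail: Promise.resolve() });
  }
  return areas.get(name);
}

function getMultipleStorageAreas(names, callback) {
  const wanted = names.map(areaFor);
  // Wait until every earlier holder of every requested area has finished.
  const ready = Promise.all(wanted.map((a) => a.tail));
  // Assumption: the areas stay held until the callback returns or, if it
  // returns a promise, until that promise settles.
  const released = ready
    .then(() => callback(wanted.map((a) => a.data)))
    .catch((err) => console.error("storage callback failed:", err));
  // Later requests for any of these areas queue up behind this one.
  for (const a of wanted) a.tail = released;
}

function getStorageArea(name, callback) {
  getMultipleStorageAreas([name], ([area]) => callback(area));
}

// Usage: the second request needs "inbox", so it waits for the first one.
getStorageArea("inbox", (inbox) => inbox.set("msg1", "hello"));
getMultipleStorageAreas(["inbox", "archive"], ([inbox, archive]) => {
  archive.set("msg1", inbox.get("msg1"));
  inbox.delete("msg1");
});
```

Because every request registers itself on all of its areas at the moment
it is made, requests are served in request order and two requests cannot
end up waiting on each other. Requests that name disjoint areas never
block one another, which is the point of having multiple areas.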

We don't even need to use Storage objects for this. In fact, I hope
Mozilla will in the not too distant future come up with an alternative
proposal to the WebDatabase SQL API. Something like this might fit
into such a proposal as I think that'll have multiple separate storage
areas anyway.

/ Jonas

Received on Thursday, 24 September 2009 00:20:11 UTC