[whatwg] Structured clone algorithm on LocalStorage

On Thu, Sep 24, 2009 at 1:17 AM, Darin Fisher <darin at chromium.org> wrote:

> On Thu, Sep 24, 2009 at 12:20 AM, Jonas Sicking <jonas at sicking.cc> wrote:
>
>> On Wed, Sep 23, 2009 at 10:19 PM, Darin Fisher <darin at chromium.org>
>> wrote:
>> >
>> >
>> > On Wed, Sep 23, 2009 at 8:10 PM, Jonas Sicking <jonas at sicking.cc>
>> wrote:
>> >>
>> >> On Wed, Sep 23, 2009 at 3:29 PM, Jeremy Orlow <jorlow at chromium.org>
>> wrote:
>> >> > On Wed, Sep 23, 2009 at 3:15 PM, Jonas Sicking <jonas at sicking.cc>
>> wrote:
>> >> >>
>> >> >> On Wed, Sep 23, 2009 at 2:53 PM, Brett Cannon <brett at python.org>
>> wrote:
>> >> >> > On Wed, Sep 23, 2009 at 13:35, Jeremy Orlow <jorlow at chromium.org>
>> >> >> > wrote:
>> >> >> >> What are the use cases for wanting to store data beyond strings
>> (and
>> >> >> >> what
>> >> >> >> can be serialized into strings) in LocalStorage?  I can't think
>> of
>> >> >> >> any
>> >> >> >> that
>> >> >> >> outweigh the negatives:
>> >> >> >> 1)  From previous threads, I think it's fair to say that we can
>> all
>> >> >> >> agree
>> >> >> >> that LocalStorage is a regrettable API (mainly due to its
>> >> >> >> synchronous
>> >> >> >> nature).  If so, it seems that making it more powerful and thus
>> more
>> >> >> >> attractive to developers is just asking for trouble.  After all,
>> the
>> >> >> >> more
>> >> >> >> people use it, the more lock contention there'll be, and the more
>> >> >> >> browser UI
>> >> >> >> jank users will be sure to experience.  This will also be worse
>> >> >> >> because
>> >> >> >> it'll be easier for developers to store large objects in
>> >> >> >> LocalStorage.
>> >> >> >> 2)  As far as I can tell, there's nowhere else in the spec where
>> >> >> >> you
>> >> >> >> have
>> >> >> >> to serialize structured clone(able) data to disk.  Given that
>> >> >> >> LocalStorage
>> >> >> >> is supposed to throw an exception if any ImageData is contained
>> and
>> >> >> >> since
>> >> >> >> File and FileData objects are legal, it seems as though making
>> >> >> >> LocalStorage
>> >> >> >> handle structured clone data has a fairly high cost to
>> implementors.
>> >> >> >>  Not to
>> >> >> >> mention that disallowing ImageData in only this one case is not
>> >> >> >> intuitive.
>> >> >> >> I think allowing structured clone(able) data in LocalStorage is a
>> >> >> >> big
>> >> >> >> mistake.  Enough so that, if SessionStorage and LocalStorage
>> can't
>> >> >> >> diverge
>> >> >> >> on this issue, it'd be worth taking the power away from
>> >> >> >> SessionStorage.
>> >> >> >> J
>> >> >> >
>> >> >> > Speaking from experience, I have been using localStorage in my PhD
>> >> >> > thesis work w/o any real need for structured clones (I would have
>> >> >> > used
>> >> >> > Web Database but it isn't widely used yet and I was not sure if it
>> >> >> > was
>> >> >> > going to make the cut in the end). All it took to come close to
>> >> >> > simulating structured clones now was to develop my own
>> compatibility
>> >> >> > wrapper for localStorage (http://realstorage.googlecode.com for
>> those
>> >> >> > who care) and add setJSONObject() and getJSONObject() methods on
>> the
>> >> >> > wrapper. Works w/o issue.
>> >> >>
>> >> >> Actually, this seems like a prime reason *to* add structured storage
>> >> >> support. Obviously string data wasn't enough for you so you had to
>> >> >> write extra code in order to work around that. If structured clones
>> >> >> had been natively supported you both would have had to write less
>> >> >> code, and the resulting algorithms would have been faster. Faster
>> >> >> since the browser can serialize/parse to/from a binary internal
>> >> >> format faster than to/from JSON through the JSON serializer/parser.
>> >> >
>> >> > Yes, but since LocalStorage is already widely deployed, authors are
>> >> > stuck
>> >> > with the structured clone-less version of LocalStorage for a very
>> >> > long
>> >> > time.  So the only way an app can store anything that can't be
>> JSONified
>> >> > is
>> >> > to break backwards compatibility.
>> >> >
>> >> >
>> >> >
>> >> > On Wed, Sep 23, 2009 at 3:11 PM, Jonas Sicking <jonas at sicking.cc>
>> >> > wrote:
>> >> >>
>> >> >> On Wed, Sep 23, 2009 at 1:35 PM, Jeremy Orlow <jorlow at chromium.org>
>> >> >> wrote:
>> >> >> > What are the use cases for wanting to store data beyond strings
>> (and
>> >> >> > what
>> >> >> > can be serialized into strings) in LocalStorage?  I can't think of
>> >> >> > any
>> >> >> > that
>> >> >> > outweigh the negatives:
>> >> >> > 1)  From previous threads, I think it's fair to say that we can
>> all
>> >> >> > agree
>> >> >> > that LocalStorage is a regrettable API (mainly due to its
>> synchronous
>> >> >> > nature).  If so, it seems that making it more powerful and thus
>> more
>> >> >> > attractive to developers is just asking for trouble.  After all,
>> the
>> >> >> > more
>> >> >> > people use it, the more lock contention there'll be, and the more
>> >> >> > browser UI
>> >> >> > jank users will be sure to experience.  This will also be worse
>> >> >> > because
>> >> >> > it'll be easier for developers to store large objects in
>> >> >> > LocalStorage.
>> >> >> > 2)  As far as I can tell, there's nowhere else in the spec where
>> you
>> >> >> > have
>> >> >> > to serialize structured clone(able) data to disk.  Given that
>> >> >> > LocalStorage
>> >> >> > is supposed to throw an exception if any ImageData is contained
>> and
>> >> >> > since
>> >> >> > File and FileData objects are legal, it seems as though making
>> >> >> > LocalStorage
>> >> >> > handle structured clone data has a fairly high cost to
>> implementors.
>> >> >> >  Not to
>> >> >> > mention that disallowing ImageData in only this one case is not
>> >> >> > intuitive.
>> >> >> > I think allowing structured clone(able) data in LocalStorage is a
>> big
>> >> >> > mistake.  Enough so that, if SessionStorage and LocalStorage can't
>> >> >> > diverge
>> >> >> > on this issue, it'd be worth taking the power away from
>> >> >> > SessionStorage.
>> >> >>
>> >> >> Despite localStorage's unfortunate lock contention problem, it's
>> >> >> become quite a popular API. It's also very successful in terms of
>> >> >> browser deployment since it's available in at least the latest versions
>> of
>> >> >> IE, Safari, Firefox, and Chrome. Don't know about support in Opera?
>> >> >
>> >> > The more popular it becomes, the more it's going to hurt UA
>> developers,
>> >> > web
>> >> > developers, and users.  I don't see why this is an argument for
>> making
>> >> > it
>> >> > more powerful.
>> >>
>> >> How will it hurt UA developers? I think we're stuck forever to
>> >> implement the locking mechanism. Adding more datatypes to the API
>> >> doesn't mean that we'll have to implement it more.
>> >
>> >
>> > multi-core is the future.  what's the opposite of fine-grained locking?
>> >  it's not good ;-)
>> > the implicit locking mechanism as spec'd is super lame.  implicitly
>> > unlocking under
>> > mysterious-to-the-developer circumstances!  how can that be a good
>> thing?
>> > storage.setItem("y",
>> > function_involving_implicit_unlocking(storage.getItem("x")));
>>
>> I totally agree on all points. The current API has big imperfections.
>> However I haven't seen any workable counter proposals so far, and I
>> honestly don't believe there are any as long as our goals are:
>>
>> * Don't break existing users of the current implementations.
>> * Don't expose race conditions to the web.
>> * Don't rely on authors getting explicit locking mechanisms right.
>>
>>
> The current API exposes race conditions to the web.  The implicit
> dropping of the storage lock is that.  In Chrome, we'll have to drop
> an existing lock whenever a new lock is acquired.  That can happen
> due to a variety of really odd cases (usually related to nested loops
> or nested JS execution), which will be difficult for developers to
> predict, especially if they are relying on third-party JS libraries.
>
> This issue seems to be discounted for reasons I do not understand.
>
>
>
>
>> But, as imperfect as the current API is, I think the following is a
>> decent way forward:
>>
>> * Allow pages that want the convenience of localStorage to use it. For
>> multi-process browsers this will mean poor UI *for pages that use
>> localStorage*. Especially when said pages hold on to localStorage for
>> a long time.
>> * Add alternative APIs that don't suffer from the same problems. More
>> below.
>>
>> >> > In addition, this argument assumes that Microsoft (and other UAs)
>> will
>> >> > implement the structured clone version of LocalStorage.  Has anyone
>> (or
>> >> > can
>> >> > anyone) from Microsoft comment on this?
>> >>
>> >> Given that I've never heard Microsoft commit to a web standard, ever, I
>> >> doubt that we'll hear anything here. Or that the lack of hearing
>> >> anything means we can draw any conclusions.
>> >>
>> >> > This is not a small feature to add.  Yes, it's smaller than creating
>> a
>> >> > new
>> >> > storage mechanism (that everyone is willing to adopt), but I still
>> think
>> >> > that's what we should be looking at.  Rather than polishing a turd.
>> >>
>> >> I do think that localStorage is a decent API that developers will want
>> >> to, and should, use. I think it's worth looking into adding an async
>> >> accessor to get a storage object so that people can use a localStorage-like API
>> >> while avoiding risks of blocking. This would also allow sharing data
>> >> between worker threads and the main window.
>> >
>> > i think the async callback to get a storage object is an improvement,
>> but
>> > i'm not sure that it addresses all of the problems.  for example, if a
>> > worker
>> > wants to read values from storage, compute, and then put a value into
>> > storage, it would probably do all of this from the storage callback.
>>  that
>> > would result in holding the lock for a long time, which would lock out
>> any
>> > other threads, including non-worker threads.
>> > the problem here is that localStorage is a pile of global variables.  we
>> are
>> > trying to give people global variables without giving them tools to
>> > synchronize
>> > access to them.  the claim i've heard is that developers are not savvy
>> enough
>> > to use those tools properly.  i agree that developers tend to use tools
>> > without
>> > fully understanding them.  ok, but then why are we giving them global
>> > variables?
>> > there has to be a better answer.
>>
>> I actually described a potential solution in the thread on worker
>> storage.
>>
>> The problem you describe is a worker holding on to the storage for a
>> very long (indefinite) time, thereby locking out other threads/windows
>> from accessing the same storage area. This seems inevitable if we want
>> to prevent race conditions while at the same time not forcing the
>> complexities of locks onto web developers. The WebDatabase API suffers
>> from exactly the same problem.
>>
>
> Hmm... are you saying that from the SQLStatementCallback used to read
> some data out of the database, you might compute on that data, and then
> issue an executeSql call to write a computed result, and that in this
> scenario,
> the fact that it is the same transaction means that other threads are
> locked
> out of accessing the same database?  I hadn't considered chaining
> executeSql
> calls like this to keep the transaction alive.  Hmm...
>
>
>
>>
>> However, we can lessen the problem. By adding multiple storage areas,
>> we can allow a worker to use one storage area, while allowing other
>> parties to simultaneously use other storage areas. This way, if a
>> worker and a window aren't sharing data at all, they never get in the
>> way of each other.
>>
>> So a very simplistic design would be something like the following:
>>
>> getStorageArea(name, callback)
>>
>> when called will asynchronously call the callback parameter once the
>> storage area named by the first parameter becomes available. The
>> callback receives the storage area as an argument. We would also have
>> the function
>>
>> getMultipleStorageAreas(names, callback)
>>
>> Same as above, but names is an array of strings indicating multiple
>> storage areas that need to be acquired before the callback is called.
>> The callback receives all the areas in an array as an argument. This
>> function allows transferring data between multiple storage areas
>> without risking racing.
>>
>> There are several problems with this, such as the names being sort of
>> crappy, and that getting storage areas as an array isn't very friendly.
>> However you get the basic idea.
>>
>> We don't even need to use Storage objects for this. In fact, I hope
>> Mozilla will in a not too distant future come up with an alternative
>> proposal to the WebDatabase SQL API. Something like this might fit
>> into such a proposal as I think that'll have multiple separate storage
>> areas anyway.
>>
>> / Jonas
>>
>
>
> Maybe we should just invent a similar transaction method for name/value
> storage?  Wouldn't that be better than inventing a new idiom?  Ideally,
> we'd also make reads and writes on storage be asynchronous.  The
> transaction would then be usable to hold the lock across multiple
> asynchronous reads and writes.  Since local storage is backed by disk,
> it seems like a more ideal local storage API would not require synchronous
> filesystem access.
>
> -Darin
>
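To make the transaction idea quoted above a bit more concrete, here is a rough sketch of what a transaction over name/value storage with asynchronous reads and writes might look like. Everything in it is hypothetical: AsyncStore, Tx, transaction(), increment() and delay() are names made up for illustration, an in-memory Map stands in for the disk, and there is no rollback. It only shows the lock being held across several asynchronous reads and writes.

```typescript
// Hypothetical sketch: none of these names come from any spec or browser.
const delay = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Handle passed to a transaction body. Reads and writes are asynchronous,
// with a short timer standing in for real disk access.
class Tx {
  constructor(private readonly data: Map<string, string>) {}

  async get(key: string): Promise<string | undefined> {
    await delay(1);
    return this.data.get(key);
  }

  async set(key: string, value: string): Promise<void> {
    await delay(1);
    this.data.set(key, value);
  }
}

class AsyncStore {
  private readonly data = new Map<string, string>();
  // Settles when the most recently queued transaction has finished.
  private tail: Promise<void> = Promise.resolve();

  // Runs `body` once every earlier transaction is done. The "lock" is held
  // until the promise returned by `body` settles.
  transaction<T>(body: (tx: Tx) => Promise<T>): Promise<T> {
    const run = this.tail.then(() => body(new Tx(this.data)));
    // A failed transaction must not block the ones queued behind it.
    this.tail = run.then(() => undefined, () => undefined);
    return run;
  }
}

// Read-modify-write that is safe against interleaving because the read and
// the write happen inside one transaction.
function increment(store: AsyncStore, key: string): Promise<number> {
  return store.transaction(async (tx) => {
    const next = Number((await tx.get(key)) ?? "0") + 1;
    await tx.set(key, String(next));
    return next;
  });
}
```

With this shape, several increment() calls issued back to back each see the previous one's write, even though every get and set yields to the event loop in between. The cost is the one already raised in the thread: a long-running body keeps everyone else waiting.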


I forgot to say that I like the name proposal.  It gives domains a way to
carve up their storage, avoiding the need to create dummy hostnames.

-Darin
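
For reference, a minimal sketch of how the named-area proposal quoted above (getStorageArea and getMultipleStorageAreas) could behave. The two method names come from the proposal; everything else is an assumption made for illustration: AreaRegistry and Area are invented names, a plain Map stands in for a Storage object, and the callback is allowed to return a promise so that an area stays held until that promise settles.

```typescript
// Hypothetical sketch: AreaRegistry and Area are illustrative names only.
type Area = Map<string, string>;
type AreaCallback = (areas: Area[]) => void | Promise<void>;

class AreaRegistry {
  private readonly areas = new Map<string, Area>();
  // Per area: a promise that settles when its last queued holder is done.
  private readonly tails = new Map<string, Promise<void>>();

  private area(name: string): Area {
    let area = this.areas.get(name);
    if (area === undefined) {
      area = new Map<string, string>();
      this.areas.set(name, area);
    }
    return area;
  }

  // Calls `callback` asynchronously once every named area is free. All areas
  // are queued in one synchronous step, so two overlapping requests cannot
  // end up waiting on each other.
  getMultipleStorageAreas(names: string[], callback: AreaCallback): Promise<void> {
    const unique = Array.from(new Set(names));
    const previous = unique.map((n) => this.tails.get(n) ?? Promise.resolve());
    const done = Promise.all(previous).then(async () => {
      await callback(names.map((n) => this.area(n)));
    });
    // Later holders wait for this one to finish, whether or not it failed.
    const settled = done.then(() => undefined, () => undefined);
    for (const n of unique) this.tails.set(n, settled);
    return done;
  }

  getStorageArea(
    name: string,
    callback: (area: Area) => void | Promise<void>,
  ): Promise<void> {
    return this.getMultipleStorageAreas([name], ([area]) => callback(area));
  }
}

// Usage: move a value between two areas while holding both.
const registry = new AreaRegistry();
registry.getStorageArea("drafts", (drafts) => {
  drafts.set("note", "hello");
});
registry.getMultipleStorageAreas(["drafts", "sent"], ([drafts, sent]) => {
  const note = drafts.get("note");
  if (note !== undefined) {
    sent.set("note", note);
    drafts.delete("note");
  }
});
```

Callers that name disjoint areas never wait on each other, which is the point of carving storage up by name. Callers that share an area are served in the order they asked.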

Received on Thursday, 24 September 2009 01:26:34 UTC