[whatwg] localStorage feedback from Ian Hickson on 2009-10-13 (public-whatwg-archive@w3.org from October 2009)

From: Ian Hickson <ian@hixie.ch>
Date: Tue, 13 Oct 2009 02:07:33 +0000 (UTC)
Message-ID: <Pine.LNX.4.62.0910121154420.25383@hixie.dreamhostps.com>
On Thu, 17 Sep 2009, Jeremy Orlow wrote:
> On Thu, Sep 17, 2009 at 1:32 AM, Ian Hickson <ian at hixie.ch> wrote:
> >
> > I think we should be very careful before introducing a fourth storage 
> > mechanism to make sure that whatever we introduce really is something 
> > that's going to be very useful and really solve problems. I'd really 
> > rather not rush into adding yet another mechanism at this point.
> 
> Sure.  But what about the other idea Robert and Drew had (in the workers +
> local storage thread) about just having a WorkerLocalStorage mechanism?

That's a fourth storage mechanism, so my comments above apply.


On Wed, 23 Sep 2009, Brett Cannon wrote:
>
> Before the move to structured clones one could tell if a key was set by 
> calling getItem() and seeing if it returned null (had to use === as 
> someone could have called setItem() w/ null, but that would be coerced 
> to a string for storage). But with the latest draft's switch to 
> structured clones that test no longer clearly differentiates between 
> whether the value returned by getItem() signifies that the key was not 
> set, or the key was set with the value null.

I believe you can test if a key is in the storage area using:

   if (key in storage) { ... }

For example:

   if ('document' in window.localStorage) { ... }
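
A minimal sketch of why the `in` test separates the two cases that getItem() cannot. A plain object stands in for the storage area so the snippet runs outside a browser; that stand-in and the helper name describeKey are assumptions for illustration, not part of any spec.

```javascript
// Stand-in for a storage area (assumption). Object.create(null) is used
// so that no inherited property names show up in the `in` test.
var storage = Object.create(null);
storage['explicit-null'] = null;   // key is set, value is null

// Hypothetical helper: reports which of the three cases a key is in.
function describeKey(area, key) {
  if (!(key in area)) return 'not set';
  return area[key] === null ? 'set to null' : 'set to a value';
}

var results = [
  describeKey(storage, 'explicit-null'),
  describeKey(storage, 'missing')
];
console.log(results.join(', '));
```

One caveat when applying this to a real Storage object: the `in` operator also reports inherited members, so names such as 'getItem' or 'length' will test as present even if no such key was stored.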


> And since I just subscribed to the mailing list, I was wondering if the 
> whole workers/localStorage discussion ended or not, as I can provide a 
> (potentially minor) real-world use-case for sharing access between the 
> page and worker if people want to hear it (in a new email of course).

I think everyone agrees that we need a storage mechanism in workers; the 
question is what it should be. That's basically the same as the question 
of what should happen with the Web Database spec -- I don't think we would 
want to end up with multiple storage systems in workers. The answer to 
this question depends on the result of this debate in the Web Apps WG.


On Wed, 23 Sep 2009, Jeremy Orlow wrote:
>
> What are the use cases for wanting to store data beyond strings (and 
> what can be serialized into strings) in LocalStorage?  I can't think of 
> any that outweigh the negatives:
> 
> 1)  From previous threads, I think it's fair to say that we can all 
> agree that LocalStorage is a regrettable API (mainly due to its 
> synchronous nature).  If so, it seems that making it more powerful and 
> thus more attractive to developers is just asking for trouble.  After 
> all, the more people use it, the more lock contention there'll be, and 
> the more browser UI jank users will be sure to experience.  This will 
> also be worse because it'll be easier for developers to store large 
> objects in LocalStorage.
> 
> 2)  As far as I can tell, there's nowhere else in the spec where you 
> have to serialize structured clone(able) data to disk.  Given that 
> LocalStorage is supposed to throw an exception if any ImageData is 
> contained and since File and FileData objects are legal, it seems as 
> though making LocalStorage handle structured clone data has a fairly 
> high cost to implementors.  Not to mention that disallowing ImageData in 
> only this one case is not intuitive.
> 
> I think allowing structured clone(able) data in LocalStorage is a big 
> mistake.  Enough so that, if SessionStorage and LocalStorage can't 
> diverge on this issue, it'd be worth taking the power away from 
> SessionStorage.

The main use case is storing File objects when offline for later upload. I 
think that far outweighs the negatives you list above. We need this, and 
there's no other storage mechanism that everyone agrees is good enough.


> the problem here is that localStorage is a pile of global variables.  
> we are trying to give people global variables without giving them tools 
> to synchronize access to them.  the claim i've heard is that developers 
> are not savvy enough to use those tools properly.  i agree that 
> developers tend to use tools without fully understanding them.  ok, but 
> then why are we giving them global variables?

The global variables have implicit locks such that you can build the tools 
for explicit locking on top of them:

   // run this first, in one script block
   var id = (parseInt(localStorage['last-id'], 10) || 0) + 1; // key may be unset or a string
   localStorage['last-id'] = id;
   localStorage['email-ready-' + id] = "0"; // "begin"

   // these can run each in separate script blocks as desired
   localStorage['email-subject-' + id] = subject;
   localStorage['email-from-' + id] = from;
   localStorage['email-to-' + id] = to;
   localStorage['email-body-' + id] = body;

   // run this last
   localStorage['email-ready-' + id] = "1"; // "commit"
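
A hedged sketch of how a reader might consume records written with the steps above: a record is treated as visible only once its ready flag is "1". The `store` object is a plain-object stand-in for localStorage, and beginEmail, commitEmail and readCommittedEmails are hypothetical helpers; none of them come from the spec.

```javascript
// Plain-object stand-in for localStorage (assumption, so this runs anywhere).
var store = {};

// Writer side, following the begin / fill in / commit steps.
function beginEmail() {
  var id = (parseInt(store['last-id'], 10) || 0) + 1;
  store['last-id'] = String(id);
  store['email-ready-' + id] = '0'; // "begin"
  return id;
}
function commitEmail(id) {
  store['email-ready-' + id] = '1'; // "commit"
}

// Hypothetical reader: skips records that are still being written.
function readCommittedEmails() {
  var last = parseInt(store['last-id'], 10) || 0;
  var emails = [];
  for (var id = 1; id <= last; id++) {
    if (store['email-ready-' + id] !== '1') continue;
    emails.push({ id: id, subject: store['email-subject-' + id] });
  }
  return emails;
}

var first = beginEmail();
store['email-subject-' + first] = 'hello';
commitEmail(first);

var second = beginEmail();          // begun but never committed
store['email-subject-' + second] = 'draft';

var committed = readCommittedEmails();
console.log(committed.length, committed[0].subject);
```

Only the first step (allocating the id and writing the "0" flag) has to happen within a single script block; the reader never looks at a record's fields until the flag has flipped.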


On Thu, 24 Sep 2009, Darin Fisher wrote:
>
> The current API exposes race conditions to the web.  The implicit 
> dropping of the storage lock is that.  In Chrome, we'll have to drop an 
> existing lock whenever a new lock is acquired.  That can happen due to a 
> variety of really odd cases (usually related to nested loops or nested 
> JS execution), which will be difficult for developers to predict, 
> especially if they are relying on third-party JS libraries.
> 
> This issue seems to be discounted for reasons I do not understand.

You can only lose the lock in very specific conditions. Those conditions 
are rarely going to interact with code that actually does storage work in 
a way that relies on the lock:

 - changing document.domain
 - history.back(), .forward(), .go(n)
 - invoking a plugin
 - alert(), confirm(), prompt(), print()
 - showModalDialog()
 - yieldForStorageUpdates()

I discussed this in more detail here:

   http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-September/023059.html


On Tue, 8 Sep 2009, Chris Jones wrote:
> 
> Can those in the first camp explain why "mutex" semantics is better than 
> "transaction" semantics?  And why it's desirable to have one DB spec 
> specify "transaction" semantics (Web Database) and a second specify 
> "mutex" semantics (localStorage)?

I don't think it's desirable. It's just what we have, through an accident 
of history.


Where we're at: localStorage can't really change. It is what it is.

We have a better proposal, Web Database, but not everybody wants to 
implement it.

To move forward, I would recommend that someone come up with a storage 
proposal with the following characteristics:

 * All major browser vendors are willing to implement it.
 * Compatible with workers.
 * Doesn't have any race conditions.
 * Doesn't involve a cross-process mutex that blocks interaction.
 * Stores structured data.
 * Can be queried in arbitrary ways.
 * Doesn't expose authors to locking primitives.

Then we can replace Web Database with it and we can move on.

I suggest that the right venue for this discussion would be the W3C Web 
Apps group, at public-webapps at w3.org.


On Wed, 9 Sep 2009, Darin Fisher wrote:
>
> What about navigating an iframe to a reference fragment, which could 
> trigger a scroll event?  The scrolling has to be done synchronously for 
> compat with the web.

You can only do that with same-domain pages, as far as I can tell.

(Does that really have to be synchronous? Right now we don't define the 
'scroll' event anywhere. What are the semantics it needs?)


On Mon, 31 Aug 2009, James Graham wrote:
> Quoting Ian Hickson <ian at hixie.ch>:
> > 
> > We can't treat cookies and persistent storage differently, because 
> > otherwise we'll expose users to cookie resurrection attacks. 
> > Maintaining the user's expectations of privacy is critical.
> 
> I think the paragraph under "treating persistent storage as cookies" 
> should simply be removed. The remainder of that section already does an 
> adequate job of explaining the privacy implications of persistent 
> storage. The UI should be entirely at the discretion of the browser 
> vendor since it involves a variety of tradeoffs, with the optimum 
> solution depending on the anticipated user base of the browser. Placing 
> spec requirements simply limits the abilities of browser vendors to find 
> innovative solutions to the problem. In addition, since there is no 
> interoperability requirement here, using RFC 2119 language seems 
> inappropriate; especially since the justification given is rather weak 
> ("this might encourage users?") and not supported by any evidence.

I think it's important to include this paragraph in a discussion of the 
privacy implications of these APIs. I feel like it would be irresponsible 
of me to not include this text, given how important this actually is.


> As to what browser vendors should actually _do_, it seems to me that the 
> "user's expectations of privacy" is actually an illusion in this case; 
> all the bad stuff that can be done with persistent storage can already 
> be done using a variety of techniques. Trying to fix up this one case 
> seems like closing the stable door after the horse has bolted. Therefore 
> the "delete local storage when you delete cookies" model seems flawed, 
> particularly as it can lead to the type of problem that Jens described 
> above.

Cookie resurrection has been a real concern on the Web. I don't think it's 
an illusion.


> On a slightly different topic, it is unclear what the relationship 
> between the statement in section 4.3 "User agents should expire data 
> from the local storage areas only for security reasons or when requested 
> to do so by the user" and the statement in section 6.1 "User agents may 
> automatically delete stored data after a period of time." is supposed to 
> be. Does the latter count as a security reason?

I've edited the latter text to indicate that the expiration should only be 
done at user option.


On Fri, 2 Oct 2009, Jeremy Orlow wrote:
> 
> Since my original post, I've continued thinking about LocalStorage, 
> structured clones, etc...and the more I've thought about it, the more 
> convinced I am that adding such support is a big mistake.  One way to 
> think about it is as follows:
> 
> 1)  We've all pretty much agreed that localStorage's synchronous design 
> was a mistake that we should be careful to not repeat.
>
> 2)  I think we can all agree that storing structured clone data makes 
> LocalStorage more powerful and useful to developers.
>
> 3)  And I think we can all agree that developers like to use more 
> powerful APIs.  Especially when the API is easy to use and understand 
> (as LocalStorage is).
>
> 4)  Lock contention becomes worse as the frequency of acquires and/or 
> the duration the lock is held increases.
> 
> Although there might be some subtleties about the statements I made that 
> people could argue with, I think all these statements are pretty 
> fundamentally true.  Assuming so, it's not a stretch to see that 2 and 3 
> imply that adding structured clones to local storage will lead to more 
> use of local storage.

I don't see why it would add significantly more use. Once a site is using 
localStorage, whether it has structured storage natively or not, they're 
going to store structured data in it -- e.g. using JSON, as some people 
have said they already are -- and so I don't think that this effectively 
increases the usage. It just makes it simpler for those who do use it.
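
The JSON approach mentioned above can be sketched roughly as follows. The `area` object is a stand-in for a string-only storage area, and saveObject and loadObject are hypothetical helper names; this is an illustration under those assumptions, not how any particular site does it.

```javascript
// Stand-in for a string-only storage area (assumption): values are
// coerced to strings on write, and missing keys read back as null.
var area = {
  data: {},
  setItem: function (key, value) { this.data[key] = String(value); },
  getItem: function (key) {
    return Object.prototype.hasOwnProperty.call(this.data, key)
      ? this.data[key]
      : null;
  }
};

// Hypothetical helpers layering JSON over the string-only API.
function saveObject(key, value) {
  area.setItem(key, JSON.stringify(value));
}
function loadObject(key) {
  var raw = area.getItem(key);
  return raw === null ? undefined : JSON.parse(raw);
}

saveObject('draft', { to: ['a@example.com'], attempts: 2, sent: null });
var draft = loadObject('draft');
console.log(draft.to[0], draft.attempts, draft.sent);
```

This covers plain data, which is the case sites already handle; it does not cover File objects, which is the use case given earlier for native structured storage.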


> If use increases, then 4 implies that the storage lock is going to 
> become a bigger problem over time.  Since we can all agree that the 
> synchronous design of local storage is already a problem that we wish we 
> had avoided, I just can't understand why we're happy to make it a bigger 
> problem.
> 
> Does anyone have an argument against this?

I don't think it makes it a significantly bigger problem.


> Anyone who's going to use LocalStorage in the near to medium future will 
> need to handle the case of LocalStorage only handling strings.  Because 
> structured clones support a superset of what can be serialized within a 
> script, there's no way for libraries to build a transparent 
> compatibility abstraction.  Thus, for some time, developers will either 
> need to only use data that can be serialized (thus making structured 
> clones only a performance optimization) or developers will need to cut 
> off browsers that don't support structured clones.
> 
> Assuming that, we're basically saying that structured clones is a 
> feature for the long term use and health of LocalStorage.  Now I know 
> that we can't just get rid of LocalStorage and coming up with viable 
> alternatives will take some time, but do we really believe that we can't 
> agree on and develop a better alternative in the mean time?
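
As a small illustration of the "superset" point in the quoted text, the sketch below shows values that do not survive a JSON round trip through a string-only store. It uses only standard JSON behaviour; the variable names are arbitrary, and the comparison with structured cloning (which copies a Date as a Date) is stated in the comments rather than executed.

```javascript
// What a script-level serializer (JSON here) keeps and loses.
// Structured cloning would copy the Date as a Date object; JSON cannot.
var original = { when: new Date(0), note: undefined, count: 3 };
var roundTripped = JSON.parse(JSON.stringify(original));

var report = {
  dateSurvived: roundTripped.when instanceof Date, // false: it is now a string
  whenValue: roundTripped.when,                    // ISO 8601 text of the date
  noteKept: 'note' in roundTripped,                // false: undefined is dropped
  countKept: roundTripped.count === 3              // true: plain numbers are fine
};
console.log(report);
```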

I think it makes sense to allow Files to be stored today.

However, I'm all for a better API. So if you can get people to agree to a 
better API before anyone ships this one and before pages start depending 
on it, then maybe we can remove the structured storage feature from 
localStorage.


> I'm fine with SessionStorage supporting structured clones.  I just don't 
> think we should make LocalStorage any more powerful.  In fact, at this 
> point, I think we should redirect all the time and effort we're putting 
> into making LocalStorage better (including solving lock contention 
> issues) and instead put it into creating a new API that solves these 
> problems and that all the browser vendors can get behind.  (If you have 
> ideas on how I can get this ball rolling, I'd love to hear them!)

I agree.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Monday, 12 October 2009 19:07:33 UTC