RE: DOM Storage feedback from Ian Hickson on 2008-12-24 (public-html@w3.org from December 2008)

From: Ian Hickson <ian@hixie.ch>
Date: Wed, 24 Dec 2008 12:16:46 +0000 (UTC)
To: Zhenbin Xu <Zhenbin.Xu@microsoft.com>
Cc: Ian Hickson <IMCEAMAILTO-ian+40hixie+2Ech@windows.microsoft.com>, "public-html-comments@w3.org" <public-html-comments@w3.org>, "public-html@w3.org" <public-html@w3.org>, Sunava Dutta <sunavad@windows.microsoft.com>, IE8 Core AJAX SWAT Team <ieajax@microsoft.com>
Message-ID: <Pine.LNX.4.62.0812241159530.24109@hixie.dreamhostps.com>
On Fri, 20 Jun 2008, Zhenbin Xu wrote:
> >
> > On Thu, 19 Jun 2008, Zhenbin Xu wrote:
> > > >
> > > > The idea is that the storage is asynchronous (and under the 
> > > > control of the UA), but that the API appears synchronous. That is, 
> > > > as soon as one script sets a storage item to a particular value, 
> > > > the setter will return with no latency, and all attempts from any 
> > > > frames to read that same storage item will give the new value 
> > > > back. It doesn't have to be stored to disk immediately, however. 
> > > > Since this is a simple name/value text-only API, it is trivially 
> > > > cached in memory.
> > >
> > > Interesting - this is precisely what we meant by async model - 
> > > storage is asynchronous but the API appears synchronous.  Either the 
> > > spec has changed or we may have read the spec incorrectly? 
> > > Previously the spec stated that the setItem operation has to 
> > > guarantee atomic operation, which we read it to mean guaranteed to 
> > > be on persist store (disk or cloud).  If this is not the case, then 
> > > we are in agreement here. Since we thought we differed from the 
> > > spec, we added the event to compensate for it.  We don't feel it is 
> > > important event but it may be useful.
> >
> > The spec does say things have to be atomic from the perspective of the 
> > scripts, but doesn't attempt to say anything about how that should be 
> > implemented. Would you like this made more explicit?
> 
> More explicit would be nice - especially the synchronous API but 
> asynchronous storage part.

Done.


> > > > Also, if the main use case is caching data, then it is reasonably 
> > > > easy to use the simple API to provide a transaction-like 
> > > > mechanism:
> > > >
> > > >    // run this first, in one script block
> > > >    var id = localStorage['last-id'] + 1;
> > > >    localStorage['last-id'] = id;
> > > >    localStorage['email-ready-' + id] = "0"; // "begin"
> > > >
> > > >    // these can run each in separate script blocks as desired
> > > >    localStorage['email-subject-' + id] = subject;
> > > >    localStorage['email-from-' + id] = from;
> > > >    localStorage['email-to-' + id] = to;
> > > >    localStorage['email-body-' + id] = body;
> > > >
> > > >    // run this last
> > > >    localStorage['email-ready-' + id] = "1"; // "commit"
> > > >
> > > > An author could easily wrap this up providing custom begin() and 
> > > > commit() functions if desired. I don't see what having them in the 
> > > > API gains us, really, other than extra complexity for UAs.
> > >
> > > Here is the concern:
> > >
> > >      localStorage['email-subject-' + id] = getSubject();
> > >      localStorage['email-from-' + id] = resolveEmailias(from);
> > >      localStorage['email-to-' + id] = resolveEmailalis(to);
> > >      localStorage['email-body-' + id] = getBody();
> > >
> > > If resolveEmailAlias throws an exception, we would have partially 
> > > persisted states. Yes you can work around it by assigning results to 
> > > temporary variables but it is simpler to provide begin() and 
> > > commit() to the page author, especially considering there can be 
> > > complex logic between states. E.g
> > >
> > >      localStorage.begin();
> > >      Component1FromDeveloper1();  // localStorage operation inside
> > >      Component2FromDeveloper2();  // localStorage operation inside
> > >      localStorage.commit();
> > >
> > > Having begin/commit would allow better distributed, componentized 
> > > development.
> >
> > But the problem is that the above would fail if the component from 
> > developer 2 uses .begin() or .commit(),
> 
> This issue is commonly solved by strict coding guideline. It is the 
> integrator's job to ensure state integrity.  Developers should not be 
> handing begin/commit etc. inside their component.

Well it's similarly the developer's job not to throw unexpected 
exceptions, or to catch exceptions if they are expected. Why should coding 
guidelines help with one problem but not the other?


> > or if an exception was raised and caught elsewhere (since the 
> > .commit() would be missed and now everything would be covered by the 
> > "transaction"), etc. Basically having just a single transaction state 
> > with two separate methods and nothing based on an object's lifetime or 
> > anything to guarantee consistency the API becomes unusable in any 
> > number of ways.
> 
> The UA should make sure transaction is rolled back in case of exception.

How do you distinguish the case of an exception that was caught at a level 
beyond the level dealing with the commits and an exception caught by the 
level dealing with the commits that plans to rollback or commit as part of 
a separate execution path?


> Alternatively, it would seem fitting to have a rollback() API here
> 
> try
> {
>         localStorage.begin();
>         ....
>         localStorage.commit();
> }
> catch
> {
>         localStorage.rollback();
> }

This seems to be equivalent to the code sample above that doesn't require 
any UA support. It is as fragile, and adds no more power.


> > Right now you can avoid all these problems by just doing:
> >
> >    var success = false;
> >    try {
> >      var id = localStorage['last-id'] + 1;
> >      localStorage['last-id'] = id;
> >      localStorage['email-ready-' + id] = "0"; // "begin"
> >
> >      localStorage['email-subject-' + id] = getSubject();
> >      localStorage['email-from-' + id] = resolveEmailAlias(from);
> >      localStorage['email-to-' + id] = resolveEmailAlias(to);
> >      localStorage['email-body-' + id] = getBody();
> >
> >      success = true;
> >    } catch (e) {
> >      // report or resolve e if necessary
> >    }
> 
> Not easy for script recover from exception. Either some garbage are left 
> in localStorage, or a more complex scheme is needed.

No complex scheme, I in fact included the code it would take in the 
example:

> >    if (success) {
> >      localStorage['email-ready-' + id] = "1"; // "commit"
> >    } else {
> >      localStorage.removeItem('email-subject-' + id);
> >      localStorage.removeItem('email-from-' + id);
> >      localStorage.removeItem('email-to-' + id);
> >      localStorage.removeItem('email-body-' + id);
> >      localStorage.removeItem('email-ready-' + id); // "cancel"
> >    }
> >
> > This can easily be wrapped into a utility method that takes a single 
> > method as its only argument:
> >
> >    function transactionAdd(body) {
> >      var success = false;
> >      var keys = [];
> >      try {
> >        var id = localStorage['last-id'] + 1;
> 
> If the id is persisting on disk and ever increasing, there is a danger 
> of overflow. So a more elaborated scheme may be needed here.

At one transaction _per second_, if we assume the range of numbers is just 
zero to a two billion, it would take 63 years to overflow. This doesn't 
seem like a real problem.


> >        localStorage['last-id'] = id;
> >        localStorage['block-' + id] = "0"; // "begin"
> >
> >        body(function (name, value) {
> >          localStorage['block-' + id + '-' + name] = value;
> >          keys.push(name);
> >        });
> >
> >        success = true;
> >      } catch (e) {
> >        // report or resolve e if necessary
> >      }
> 
> It is difficult to do any meaningful error handling inside an utility 
> function since it lacks context. Better leave it to the callers who have 
> more information about the situation.

You can use callbacks, or rethrow exceptions, to do all this. You can in 
fact exactly implement the three methods you are proposing using this 
mechanism, if you prefer that model.


> >      if (success) {
> >        localStorage['block-' + id] = "1"; // "commit"
> >      } else {
> >        for (var i in keys)
> >          localStorage.removeItem('block-' + id + '-' + keys[i]);
> >        localStorage.removeItem('block-' + id); // "cancel"
> >      }
> >      return success;
> >    }
> >
> > You would use this like this:
> >
> >    transactionAdd(function(add) {
> >      add('subject', getSubject());
> >      add('from', resolveEmailAlias(from));
> >      add('to', resolveEmailAlias(to));
> >      add('body', getBody());
> >    });
> >
> 
> How would application retrieve particular item later given that it has 
> no knowledge of what block id is?

Just return the id, or pass the id to the callback.


> If a wrapper function is needed all the time, why not make it as part of 
> the platform?

It's not needed all the time. In fact I'd guess that it's hardly ever 
needed at all.


In short I don't really see the need for an explicit .begin() or .commit() 
feature in the API.

(Also, I'd like to reiterate my earlier request that the non-standard 
event and methods that you implemented in IE8 either be removed or be 
clearly prefixed with "ms" or some such to indicate that they are 
proprietary extensions -- we don't want to confuse authors on the platform 
who might test in one browser and find things don't work in another 
browser. Thanks!)

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Wednesday, 24 December 2008 12:17:23 UTC