Re: [IndexedDB] Current editor's draft

On Tue, Jul 6, 2010 at 6:31 PM, Nikunj Mehta <nikunj@o-micron.com> wrote:
> On Wed, Jul 7, 2010 at 5:57 AM, Jonas Sicking <jonas@sicking.cc> wrote:
>>
>> On Tue, Jul 6, 2010 at 9:36 AM, Nikunj Mehta <nikunj@o-micron.com> wrote:
>> > Hi folks,
>> >
>> > There are several unimplemented proposals on strengthening and
>> > expanding IndexedDB. The reason I have not implemented them yet is
>> > because I am not convinced they are necessary in toto. Here's my
>> > attempt at explaining why. I apologize in advance for not responding
>> > to individual proposals due to personal time constraints. I will
>> > however respond in detail on individual bug reports, e.g., as I did
>> > with 9975.
>> >
>> > I used the current editor's draft asynchronous API to understand
>> > where some programming difficulties remain. Based on this
>> > attempt, I find several areas to strengthen, the most prominent of
>> > which is how we use transactions. Another is to add the concept of a
>> > catalog as a special kind of object store.
>>
>> Hi Nikunj,
>>
>> Thanks for replying! I'm very interested in getting this stuff sorted
>> out pretty quickly as almost all other proposals in one way or another
>> are affected by how this stuff develops.
>>
>> > Here are the main areas I propose to address in the editor's spec:
>> >
>> > 1. It is time to separate the dynamic and static scope transaction
>> > creation so that they are asynchronous and synchronous respectively.
>>
>> I don't really understand what this means. What are dynamic and static
>> scope transaction creation? Can you elaborate?
>
> This is the difference in the API in my email between openTransaction and
> transaction. Dynamic and static scope have been defined in the spec for a
> long time.

Ah, I think I'm following you now. I'm actually not sure that we
should have dynamic scope in the spec at all; I know Jeremy has
expressed similar concerns. However, if we are going to have dynamic
scope, I agree it is a good idea to have separate APIs for starting
dynamic-scope and static-scope transactions.

>> > 2. Provide a catalog object that can be used to atomically add/remove
>> > object stores and indexes as well as modify version.
>>
>> It seems to me that a catalog object doesn't really provide any
>> functionality over the proposal in bug 10052? The advantage that I see
>> with the syntax proposal in bug 10052 is that it is simpler.
>>
>> http://www.w3.org/Bugs/Public/show_bug.cgi?id=10052
>>
>> Can you elaborate on what the advantages are of catalog objects?
>
> To begin with, 10052 shuts down the "users" of the database completely when
> only one is changing its structure, i.e., adding or removing an object
> store.

This is not the case. Check the steps defined for setVersion in [1].
At no point are databases shut down automatically. Only once all
existing database connections are manually closed, either by calls to
IDBDatabase.close() or by the user leaving the page, is the 'success'
event from setVersion fired.

[1] http://www.w3.org/Bugs/Public/show_bug.cgi?id=10052#c0

> How can we make it less draconian?

The 'versionchange' event allows pages that are currently using the
database to handle the change. The page can inspect the new version
number supplied by the 'versionchange' event, and if it knows that it
is compatible with a given upgrade, all it needs to do is to call
db.close() and then immediately reopen the database using
indexedDB.open(). The open call won't complete until the upgrade is
finished.
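
For example, a page that can live with the new schema could do
something like this (a rough sketch; I'm assuming an onversionchange
handler on IDBDatabase and a version property on the event, based on
my reading of [1]):

db.onversionchange = function(event) {
  // Another page has called setVersion. If we can handle the new
  // version, release our connection so the upgrade can proceed.
  if (event.version == "2") {
    db.close();
    var request = indexedDB.open("parts", "Part database");
    request.onsuccess = function(e) {
      // Doesn't fire until the upgrade has finished.
      db = e.result;
    };
  }
};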

> Secondly, I don't see how that
> approach can produce atomic changes to the database.

When the transaction defined in step 4 of setVersion in [1] is
created, only one IDBDatabase connection to the database is open. As
long as that transaction is running, no requests returned from
IDBFactory.open will receive a 'success' event. Only once the
transaction is committed, or aborted, will those requests succeed.
This guarantees that no other IDBDatabase object can see a partial
update.

Further, only once the transaction created by setVersion is committed
are the requested objectStores and indexes created/removed. This
guarantees that the database is never left with a partial update.

That means that the changes are atomic, right?
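
In code, the guarantee described above looks like this (just a sketch
of the behavior, not new API):

indexedDB.open("parts", "Part database").onsuccess = function(event) {
  // If another page is in the middle of its setVersion transaction,
  // this handler doesn't fire until that transaction has committed or
  // aborted, so this connection never sees a half-applied schema.
  var db = event.result;
};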

> Thirdly, we shouldn't
> need to change version in order to perform database changes.

First off, note that if the upgrade is compatible, you can just pass
the existing database version to setVersion. So no version *change* is
actually needed.

Second, I don't think there is much difference between

var txn = db.transaction();
db.openCatalog(txn).onsuccess = ...

vs

db.setVersion("5").onsuccess = ...

I don't see that telling people that they have to use the former is a big win.


The problem that I see with the catalog proposal, if I understand it
correctly, is that a page that has an IDBDatabase object open always
has to be prepared for calls to openObjectStore/openTransaction
failing. I.e. the page can never know that another page wasn't opened
which at some point created a catalog and removed an objectStore. This
forces pages, on every single call, either to check that the version
is still the same, or to verify that each and every call to
openObjectStore/openTransaction succeeds. This seems very error-prone
to me.

Looking at your example, it also seems like it contains a race
condition. There is a risk that when someone opens a database, the
first transaction, which uses a catalog to create the necessary
objectStores and indexes, is committed, but the second transaction,
which populates the objectStores with data, has not yet started.
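
In code, the race would look roughly like this (the catalog's
createObjectStore method is my guess at its API, for illustration
only):

// Transaction 1: create the schema through the catalog.
var txn1 = db.transaction();
db.openCatalog(txn1).onsuccess = function(event) {
  var catalog = event.result;
  catalog.createObjectStore("part", "number", false);
  // txn1 can commit here...
};

// Transaction 2: populate the new store.
var txn2 = db.transaction();
db.openObjectStore("part", txn2).onsuccess = function(event) {
  event.result.add({number: 1, name: "Nut"});
};
// ...before txn2 has started, so another page that opens the database
// in between sees the objectStore but none of the data.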

> Finally, I am
> not sure why you consider the syntax proposal simpler. Note that I am not
> averse to the version change event notification.

Compare that to how your code would look with the proposals in bugs
9975 and 10052:

var db;
var dbRequest = indexedDB.open("parts", "Part database");
dbRequest.onsuccess = function(event) {
  db = event.result;
  if (db.version != "1") {
    var versionRequest = db.setVersion("1");
    versionRequest.ontimeout = function(event) {
      throw new Error("timeout while waiting for other DBs to close");
    };
    versionRequest.onsuccess = function(event) {
      event.transaction.onerror = function(event) {
        throw new Error("Failed to set up database due to error: " +
                        event.message);
      };
      db.setVersion("1");
      part = db.createObjectStore("part", "number", false);
      supplier = db.createObjectStore("supplier", "number", false);
      shipment = db.createObjectStore("shipment", id);
      part.createIndex("partName", "name");
      part.createIndex("partColor", "color");
      part.createIndex("partCity", "city");
      supplier.createIndex("supplierStatus", "status");
      supplier.createIndex("supplierCity", "city");
      shipment.createIndex("partSupplier", "[part, supplier]");
      var data = [{ store: part, values: parts },
                  { store: supplier, values: suppliers },
                  { store: shipment, values: shipments }];
      for (var storeIndex = 0; storeIndex < data.length; ++storeIndex) {
        var task = data[storeIndex];
        for (var valueIndex = 0; valueIndex < task.values.length; ++valueIndex) {
          // Ignoring the possibility that there may be errors while adding.
          task.store.add(task.values[valueIndex]);
        }
      }
    };
  }
};

dbRequest.onerror = function(event) {
  if (event.code === IDBDatabaseException.UNKNOWN_ERR)
    throw new Error("Could not open the database");
};

There are a few advantages. First of all, all the operations can
easily be done in a single transaction, including populating the
objectStores. Second, there is no need for a separate catalog object.
Third, the index creation functions can live directly on the
objectStore interface. Fourth, there is less nesting of asynchronous
callbacks.

>> > ------- IDL --------
>> >
>> > [NoInterfaceObject]
>> > interface IDBDatabase {
>> >  readonly attribute DOMString     name;
>> >  readonly attribute DOMString     description;
>> >  readonly attribute DOMStringList objectStores;
>> >  /*
>> >  Open an object store in the specified transaction. The transaction can
>> > be
>> >  dynamic scoped, or the requested object store must be in the static
>> > scope.
>> >  Returns IDBRequest whose success event of IDBTransactionEvent type
>> > contains
>> >  a result with IDBObjectStore and transaction is an
>> > IDBTransactionRequest.
>> >  */
>> >  IDBRequest openObjectStore(in DOMString name, in IDBTransaction txn,
>> >  in optional unsigned short mode /* defaults to READ_WRITE */);
>>
>> I don't understand the advantage of this proposal over mozillas
>> proposal.
>
> The above proposal allows opening multiple object stores in the same
> transaction in dynamic scope, even without having explicitly identified each
> one of them at the time of creating the transaction.

We don't mention dynamic transactions much in our proposal, mostly
because we're not convinced that it's something that we want to
implement. However, if we really want to keep them, I suggest that we
create a separate transaction interface, like:

interface IDBDynamicTransaction {
  readonly attribute IDBDatabase db;

  // Asynchronously attempts to lock more objectStores.
  IDBRequest openObjectStore(in DOMString name, in unsigned short mode);

  void abort();
  attribute Function oncomplete;
  attribute Function onabort;
  attribute Function ontimeout;
};

interface IDBDatabase {
  ...
  IDBDynamicTransaction openDynamicTransaction();
  ...
};

This interface allows asynchronously requesting that more objectStores
be locked. The author must be prepared, on every call to
openObjectStore, for the request to fail, for example due to
deadlocks.
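
Usage would look something like this (a sketch against the
hypothetical interface above; the mode constant is just illustrative):

var txn = db.openDynamicTransaction();
var req = txn.openObjectStore("part", READ_ONLY);
req.onsuccess = function(event) {
  var part = event.result;
  part.get(12).onsuccess = ...;
};
req.onerror = function(event) {
  // The lock could not be acquired, for example because granting it
  // would deadlock against another dynamic transaction. Every caller
  // has to handle this case.
};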

But as previously stated, I think this adds too much complexity and
too much raciness to the API, so I'd prefer not to add it.

> Secondly, where static
> scope is desired, in order to run to completion without being aborted due to
> unavailable objects, this proposal ensures that locks are obtained at the
> time of creating the transaction.
>
> My example illustrates this approach. It uses both dynamic and static
> scoped transactions. The dynamic scope transactions are synchronously
> created since no locks have to be obtained when creating the transaction.

There might be some misunderstanding about how our proposal works. The
idea is that while you can synchronously create an IDBTransaction
object, no locks are acquired synchronously. Instead, the object
asynchronously requests the required locks to be acquired. It also
remembers all requests being placed against it. Not until all required
locks have been acquired will it create a database transaction and
start making requests into the database and firing 'success' events
against the previously returned IDBRequest objects.

This makes it possible to only have one level of asynchronous
callbacks exposed to the author. The implementation takes care of the
multiple asynchronous actions required in order to honor the author's
request.
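
In other words, the typical pattern with our proposal is just (a
sketch; constant and method names approximately as in [2]):

// The transaction object is returned synchronously; no locks are held
// yet, and the objectStore() call doesn't block either.
var trans = db.transaction(["part", "supplier"], IDBTransaction.READ_ONLY);
var request = trans.objectStore("part").get(12);
request.onsuccess = function(event) {
  // Fires only once the locks have been acquired and the get has
  // completed, so the author only sees one level of callbacks.
  var part = event.result;
};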

This significantly simplifies the API, as demonstrated in [2] vs [3].

[2] http://docs.google.com/document/pub?id=1I__XnwvvSwyjvxi-FAAE0ecnUDhk5DF7L2GI6O31o18
[3] http://docs.google.com/document/pub?id=1KKMAg_oHLeBvFUWND5km6FJtKi4jWxwKR0paKfZc8vU

>> One of our main points was to make getting objectStore
>> objects a synchronous operation as to avoid having to nest multiple
>> levels of asynchronous calls. Compare
>>
>> var req = db.openObjectStore("foo", trans);
>> req.onerror = errorHandler;
>> req.onsuccess = function(e) {
>>  var fooStore = e.result;
>>  var req = fooStore.get(12);
>>  req.onerror = errorHandler;
>>  req.onsuccess = resultHandler;
>> }
>>
>> to
>>
>> var fooStore = db.openObjectStore("foo", trans);
>> var req = fooStore.get(12);
>> req.onerror = errorHandler;
>> req.onsuccess = resultHandler;
>>
>>
>> I also don't understand the advantage of having the transaction as an
>> argument to openObjectStore rather than having openObjectStore live on
>> transaction. Compare
>>
>> db.openObjectStore("foo", trans);
>>
>> to
>>
>> trans.openObjectStore("foo");
>>
>> I also don't understand the meaning of specifying a mode when a
>> objectStore is opened, rather than specifying the mode when the
>> transaction is created.
>
> Have you reviewed the examples? Different object stores in a transaction are
> used in different modes, and that requires us to identify the mode when
> opening the object store. This also increases concurrency. This is
> particularly useful for dynamic transactions.

I'm following you better now. I do see how this can work for dynamic
transactions where locks are not acquired upon creation of the
transaction. But I don't see how this makes sense for static
transactions. And it indeed seems like you are not using this feature
for static transactions.

If it is the case that specifying a mode when opening an objectStore
only makes sense on dynamic transactions, then I think we should only
expose that argument on dynamic transactions.

Now that I understand your proposal better, I don't understand how
IDBTransaction.objectStore works for dynamically scoped transactions
in your proposal. It seems to require synchronously grabbing a lock,
which I thought we wanted to avoid at all costs.

>> Unless we're planning on making all
>> transactions dynamic (I hope not), locks have to be grabbed when the
>> transaction is created, right? If a transaction is holding a READ_ONLY
>> lock for a given objectStore, then attempting to open that objectStore
>> as READ_WRITE should obviously fail. Conversely, if a transaction
>> is holding a READ_WRITE lock for a given objectStore, then opening
>> that objectStore as READ_ONLY doesn't seem to have any benefit over
>> opening it as READ_WRITE. In short, I can't see any situation when
>> you'd want to open an objectStore in a different mode than what was
>> used when the transaction was created.
>>
>> Finally, I would strongly prefer to have READ_ONLY be the default
>> transaction type if none is specified by the author. It is better to
>> default to what results in higher performance, and have people notice
>> when they need to switch to the slower mode. This is because people
>> will very quickly notice if they use READ_ONLY when they need to use
>> READ_WRITE, since the code will never work. However if people use
>> READ_WRITE when all they need is READ_ONLY, then the only effect is
>> likely to be an application that runs somewhat slower, which they
>> are unlikely to detect.
>
> This approach is also likely to cause exceptions upon put, remove, and add.
> I would prefer to not cause exceptions as the default behavior.

If we use READ_WRITE as the default behavior, then it's extremely
likely that people will use the wrong lock type and not realize it.
The downside will be that sites will run less concurrently, and thus
slower, than they could. Another downside is that authors will have to
specify the lock type more often to get optimal performance, if we
think that READ_ONLY is the more common case.

If we use READ_ONLY as the default behavior, then it's extremely
likely that people will use the wrong lock type, notice that their
code isn't working, and fix it. The downside is that people will have
to fix their bugs. Another downside is that authors will have to
specify the lock type more often if we think that READ_WRITE is the
more common case.

To me the downsides of using READ_WRITE as a default are much worse
than the downsides of using READ_ONLY.
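
To make that concrete (a sketch; whether the failure surfaces as an
exception or as an error event is a separate question):

// With READ_ONLY as the default, forgetting the mode fails fast:
var trans = db.transaction(["part"]);       // READ_ONLY by default
trans.objectStore("part").add({number: 7}); // fails immediately, so the
                                            // author notices and asks
                                            // for READ_WRITE

// With READ_WRITE as the default, the same omission silently works,
// but every such transaction serializes against other writers and
// nothing ever tells the author.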

>> >  /*
>> >  Open the database catalog in the specified transaction for exclusive
>> > access.
>> >  Returns IDBRequest whose success event of IDBTransactionEvent type
>> > contains
>> >  a result with IDBCatalog and transaction is an IDBTransactionRequest.
>> >  */
>> >  IDBRequest openCatalog(in IDBTransaction txn);
>>
>> See above regarding catalog objects.
>>
>> >  /*
>> >  Create a new static scoped transaction asynchronously.
>> >  Returns IDBRequest whose success event of IDBSuccessEvent type contains
>> > a
>> >  result with IDBTransactionRequest.
>> >  */
>> >  IDBRequest openTransaction (in optional DOMStringList storeNames /*
>> > defaults to mean all object stores */,
>> >  in optional unsigned short mode /* defaults to READ_WRITE */,
>> >  in optional IDBTransaction parent /* defaults to null */,
>> >  in optional unsigned long timeout /* defaults to no timeout*/);
>> >  /*
>> >  Create a new dynamic scoped transaction. This returns a transaction
>> > handle
>> >  synchronously.
>> >  */
>> >  IDBTransaction transaction (in optional IDBTransaction parent /*
>> > defaults to null */,
>> >  in optional unsigned long timeout /* defaults to no timeout*/);
>>
>> I don't understand this. When would you ever want to use the
>> asynchronous version?
>
> See my examples.

I understand now.

>> Why can't you specify a list of objectStores in the synchronous version?
>
> Because I don't see the benefit of doing that if you can't obtain
> locks, which synchronous behavior cannot do.

See my example above for how our proposal allows you to specify which
objectStores should be locked while still allowing the locks to be
acquired asynchronously.

>> Do we really need nested transactions? I do see that you added them in
>> latest revision of the editor drafts, but it seems like a fairly
>> advanced feature that would be nice to leave for a v2. If everyone
>> else wants them, I can check how much work it would be for us to
>> implement them, but my vote is to leave it for a later version.
>
> I added them because they are useful in advanced scenarios. I am fine with
> leaving them to a later date.

I would prefer to leave them out for now.

/ Jonas

Received on Wednesday, 7 July 2010 07:27:56 UTC