Re: [IndexedDB] Current editor's draft from Jonas Sicking on 2010-07-09 (public-webapps@w3.org from July to September 2010)

From: Jonas Sicking <jonas@sicking.cc>
Date: Fri, 9 Jul 2010 16:09:18 -0700
To: Nikunj Mehta <nikunj@o-micron.com>
Cc: public-webapps <public-webapps@w3.org>
Message-ID: <AANLkTim0XuYWeOAtnzOhOXhWVa1ohLsrWl747usinBpN@mail.gmail.com>
On Fri, Jul 9, 2010 at 11:27 AM, Nikunj Mehta <nikunj@o-micron.com> wrote:
>
> On Jul 7, 2010, at 12:57 PM, Jonas Sicking wrote:
>
>>>>> 2. Provide a catalog object that can be used to atomically add/remove
>>>>> object stores and indexes as well as modify version.
>>>>
>>>> It seems to me that a catalog object doesn't really provide any
>>>> functionality over the proposal in bug 10052? The advantage that I see
>>>> with the syntax proposal in bug 10052 is that it is simpler.
>>>>
>>>> http://www.w3.org/Bugs/Public/show_bug.cgi?id=10052
>>>>
>>>> Can you elaborate on what the advantages are of catalog objects?
>>>
>>> To begin with, 10052 shuts down the "users" of the database completely when
>>> only one is changing its structure, i.e., adding or removing an object
>>> store.
>>
>> This is not the case. Check the steps defined for setVersion in [1].
>> At no point are databases shut down automatically. Only once all
>> existing database connections are manually closed, either by calls to
>> IDBDatabase.close() or by the user leaving the page, is the 'success'
>> event from setVersion fired.
>
> Can you justify why one should be forced to stop using the database when someone else is adding an object store or an index? This is what I meant by draconian.

In the thread "[IndexedDB] Atomic schema changes" the conclusion was
that while it's possible to create API that allows compatible schema
changes without calling setVersion, it wasn't worth the added API
complexity. It's something that can wait for v2. Are calls to add
indexes and objectStores without further incompatible changes common
enough that we need to optimize the case when users have multiple tabs
open? Remember that calling setVersion really is no overhead if only
one copy of the application is running.

>> [1] http://www.w3.org/Bugs/Public/show_bug.cgi?id=10052#c0
>>
>>> How can we make it less draconian?
>>
>> The 'versionchange' event allows pages that are currently using the
>> database to handle the change. The page can inspect the new version
>> number supplied by the 'versionchange' event, and if it knows that it
>> is compatible with a given upgrade, all it needs to do is to call
>> db.close() and then immediately reopen the database using
>> indexedDB.open(). The open call won't complete until the upgrade is
>> finished.
>>
>>> Secondly, I don't see how that
>>> approach can produce atomic changes to the database.
>>
>> When the transaction created in step 4 of setVersion defined in [1] is
>> created, only one IDBDatabase object to the database is open. As long
>> as that transaction is running, no requests returned from
>> IDBFactory.open will receive a 'success' event. Only once the
>> transaction is committed, or aborted, will those requests succeed.
>> This guarantees that no other IDBDatabase object can see a partial
>> update.
>>
>> Further, only once the transaction created by setVersion is committed,
>> are the requested objectStores and indexes created/removed. This
>> guarantees that the database is never left with a partial update.
>>
>> That means that the changes are atomic, right?
>
> Atomic is not the same is isolated. Merely the fact that no other use of the database was being made when you are changing its structure doesn't mean that you will get all of the changes or none. What happens, for example, if the browser crashes in the middle of the versionRequest.onsuccess handler?

Yeah, I forgot one critical piece of information. Like with any other
transaction. If the transaction isn't successfully committed, it is
fully rolled back and no changes whatsoever are made to the database.

>>> Thirdly, we shouldn't
>>> need to change version in order to perform database changes.
>>
>> First off, note that if the upgrade is compatible, you can just pass
>> the existing database version to setVersion. So no version *change* is
>> actually needed.
>>
>> Second, I don't think there is much difference between
>>
>> var txn = db.transaction();
>> db.openCatalog(txn).onsuccess = ...
>>
>> vs
>>
>> db.setVersion("5").onsuccess = ...
>>
>> I don't see that telling people that they have to use the former is a big win.
>>
>>
>> The problem that I see with the catalog proposal, if I understand it
>> correctly, is that it means that a page that has a IDBDatabase object
>> open has to always be prepared for calls to
>> openObjectStore/openTransaction failing. I.e. the page can't ever know
>> that another page was opened which at any point created a catalog and
>> removed an objectStore. This forces pages to at every single call
>> either check that the version is still the same, or that each and
>> every call to openObjectStore/openTransaction succeeds. This seems
>> very error prone to me.
>
> We could easily add a condition check to removing an object store so that there are no open transactions holding a lock on that object store. This would prevent any errors of the kind you describe.

I don't see that that would help. You're still in the situation that
every time you're starting a transaction you have to check that all
the objectStores you were expecting to be there still are there. Am I
missing something?

>> Looking at your example, it also seems like it contains a race
>> condition. There is a risk that when someone opens a database, the
>> first transaction, which uses a catalog to create the necessary
>> objectStores and indexes, is committed, but the second transaction,
>> which populates the objectStores with data, has not yet started.
>
> I purposely wrote my example to allow database to be populated separately from the creation of the database. There is no reason why the two couldn't be done in the same transaction, though.

I see.

>>> Finally, I am
>>> not sure why you consider the syntax proposal simpler. Note that I am not
>>> averse to the version change event notification.
>>
>> Compare to how your code would look like with the proposals in bugs
>> 9975 and 10052:
>>
>> var db;
>> var dbRequest = indexedDB.open("parts", 'Part database');
>> dbRequest.onsuccess = function(event) {
>> db = event.result;
>> if (db.version != "1") {
>>   versionRequest = db.setVersion("1");
>>   versionRequest.ontimeout = function(event) {
>>     throw new Error("timeout while waiting for other DBs to close");
>>   };
>>   versionRequest.onsuccess = function(event) {
>>     event.transaction.onerror = function(event) {
>>       throw new Error("Failed to set up database due to error: " +
>> event.message);
>>     };
>>     db.setVersion("1");
>>     part = db.createObjectStore("part", "number", false);
>>     supplier = db.createObjectStore("supplier", "number", false);
>>     shipment = db.createObjectStore("shipment", id);
>>     part.createIndex("partName", "name");
>>     part.createIndex("partColor", "color");
>>     part.createIndex("partCity", "city");
>>     supplier.createIndex("supplierStatus", "status");
>>     supplier.createIndex("supplierCity", "city");
>>     shipment.createIndex("partSupplier", "[part, supplier]");
>>     var data = [{
>>       store: part, values: parts}, {
>>       store: supplier, values: suppliers}, {
>>       store: shipment, values: shipments}];
>>     for (var storeIndex = 0; storeIndex < data.length; ++storeIndex) {
>>       var task = data[storeIndex];
>>       for (var valueIndex = 0; valueIndex < task.values.length;
>> ++valueIndex) {
>>         task.store.add(task.values[index]); // Ignoring the
>> possibility that there may be errors while adding
>>       }
>>     }
>>   };
>> }
>> };
>>
>> dbRequest.onerror = function(event) {
>> if (event.code === IDBDatabaseException.UNKNOWN_ERR)
>> throw new Error("Could not open the database");
>> }
>>
>> There are a few advantages. First of all all the operations can easily
>> be done in a single transaction, including populating the
>> objectStores. Second, no need for a separate catalog object. Third,
>> the index creator functions can live directly on the objectStore
>> interface. Fourth, there is less nesting of asynchronous callbacks.
>
> It seems you are saying that object stores, indices, and the data population can all work synchronously and that the returned objects are the results of the completion of the request. This contradicts your proposal where both createObjectStore and createIndex return IDBRequest objects. If the proposal is correct, then your example is incorrect. Can you clarify this so that all of us can assess the "simplicity" of the two proposals?

Please read the proposal in bug 10052. It makes createObjectStore and
createIndex synchronously return the created objects. IDBRequests are
not returned. However at no point is data population done
synchronously. Note that in my example above all add() calls
asynchronously adds data to the table. Not until all data has been
inserted will the transaction be committed.

The proposal in bug 10052 subsumes some of the aspects of the proposal
we originally made and which resulted in bug 9975 being filed. These
changes were made due to you correctly finding that our original
proposal did not allow atomic schema changes.

However everything else that we originally proposed still stands. But
for clarity I have now updated the google document to reflect the
changes proposed in bug 10052.

> On Jul 9, 2010, at 10:01 PM, Jonas Sicking wrote:
>
>>>>
>>>> There seems to be much agreement on the proposals in bug 9975 and
>>>> 10052. Most of the disagreements seems to be stemming from
>>>> misunderstandings how the proposal works, such as weather atomic
>>>> schema transactions are possible, and if the synchronous transaction
>>>> API requires synchronously grabbing locks.
>>>>
>>>> I propose that we go ahead and implement in the specification the
>>>> changes proposed in bug 9975 and 10052 and use that as basis for
>>>> further discussions. We've already found a few issues in those
>>>> proposals, such as the lack of support for dynamic transactions (if we
>>>> want to keep those), but I think it will be easier to discuss the
>>>> changes that are needed in terms of changes needed, rather than the
>>>> significantly different API that is in the draft now.
>>>>
>>>
>>> Agreed. I think your proposal in 9975 should be the basis for the v1
>>> of this spec.
>>
>> Just to be clear. I expect that we'll need to do significant changes
>> even on top of our proposal. I'm sure there are things that we have
>> missed, and that there are things that can be done better than the way
>> we've proposed. However it seems like it's closer to what we want than
>> what we currently have, and so we should use it as basis for further
>> discussions.
>
> I think the question of catalog object vs. versionchange event is far from settled. I don't see how making that change will be superior to me suggesting that we change the spec to the catalog proposal.

It means that applications can be sure that as long as they have a
IDBDatabase object open, no objectStores or indexes that they depend
on will be removed.
It allows for synchronous creation/removal of objectStores and indexes.
It has a cleaner syntax since you don't have to create the transaction
and the catalog separately. It also is cleaner since no separate
catalog object is needed.
It defaults to safe schema upgrades if no actions are taken on the
account of the user.

>>>> If this sounds ok to everyone, I can write these changes up monday.
>>>>
>>>
>>> Sounds ok, I will try to help you with this.
>>
>> Awesome!
>
> I hope you can answer my questions above before making these changes to the spec.

I'm fine with waiting a little bit, which is part of why I suggested
monday, but it has now been almost 2 months since we made our async
API proposal. We have received almost exclusively positive feedback
for it. We have also showed that it is implementable by shipping it in
Firefox 4 beta 1. The little negative feedback that we have received
seems based on misunderstandings about how the proposal works. The two
problems that have been pointed out, atomic schema changes and dynamic
transactions, have been immediately addressed. This is why I suggest
we put it into the specification with full details so that we can all
get on the same page on how it works and move on with revising it.

/ Jonas
Received on Friday, 9 July 2010 23:10:11 UTC