Re: [IndexedDB] transaction order

On Fri, Oct 14, 2011 at 1:51 PM, Israel Hilerio <israelh@microsoft.com> wrote:
> On Friday, October 07, 2011 4:35 PM, Israel Hilerio wrote:
>> On Friday, October 07, 2011 2:52 PM, Jonas Sicking wrote:
>> > Hi All,
>> >
>> > There is one edge case regarding transaction scheduling that we'd like
>> > to get clarified.
>> >
>> > As the spec is written, it's clear what the following code should do:
>> >
>> > trans1 = db.transaction(["foo"], IDBTransaction.READ_WRITE);
>> > trans1.objectStore("foo").put("value 1", "mykey");
>> > trans2 = db.transaction(["foo"], IDBTransaction.READ_WRITE);
>> > trans2.objectStore("foo").put("value 2", "mykey");
>> >
>> > In this example it's clear that the implementation should first run
>> > trans1 which will put the value "value 1" in object store "foo" at key
>> > "mykey". The implementation should then run trans2 which will
>> > overwrite the same value with "value 2". The end result is that "value
>> > 2" is the value that lives in the object store.
>> >
>> > Note that in this case it's not at all ambiguous which transaction runs first.
>> > Since the two transactions have overlapping scope, trans2 won't even
>> > start until trans1 is committed. Even if we made the code something like:
>> >
>> > trans1 = db.transaction(["foo"], IDBTransaction.READ_WRITE);
>> > trans1.objectStore("foo").put("value 1", "mykey");
>> > trans2 = db.transaction(["foo"], IDBTransaction.READ_WRITE);
>> > trans2.objectStore("foo").put("value 2", "mykey");
>> > trans1.objectStore("foo").put("value 3", "mykey");
>> >
>> > we'd get the same result. Both put requests placed against trans1 will
>> > run first while trans2 is waiting for trans1 to commit before it
>> > begins running since they have overlapping scopes.
>> >
>> > However, consider the following example:
>> > trans1 = db.transaction(["foo"], IDBTransaction.READ_WRITE);
>> > trans2 = db.transaction(["foo"], IDBTransaction.READ_WRITE);
>> > trans2.objectStore("foo").put("value 2", "mykey");
>> > trans1.objectStore("foo").put("value 1", "mykey");
>> >
>> > In this case, while trans1 is created first, no requests are placed
>> > against it, and so no database operations are started. The first
>> > database operation that is requested is one placed against trans2. In
>> > the Firefox implementation, this makes trans2 run before trans1. I.e.
>> > we schedule transactions when the first request is placed against
>> > them, and not when the IDBDatabase.transaction() function returns.
>> >
>> > The advantage of the Firefox approach is obvious in code like this:
>> >
>> > someElement.onclick = function() {
>> >   trans1 = db.transaction(["foo"], IDBTransaction.READ_WRITE);
>> >   ...
>> >   trans2 = db.transaction(["foo"], IDBTransaction.READ_WRITE);
>> >   trans2.objectStore("foo").put("some value", "mykey");
>> >   callExpensiveFunction();
>> > }
>> >
>> > In this example no requests are placed against trans1. However, if
>> > trans1 is supposed to run before trans2 does, we can't send off any
>> > work to the database at the time when the .put call happens since we
>> > don't yet know if there will be requests placed against trans1. Only
>> > once we return to the event loop at the end of the onclick handler will
>> > trans1 be "committed" and the requests in trans2 can be sent to the
>> > database.
>> >
>> > However, the downside of the Firefox approach is that it's harder for
>> > applications to control the order in which transactions run. Consider,
>> > for example, a program that is parsing a big hunk of binary data. Before
>> > parsing, the program starts two transactions, one READ_WRITE and one
>> > READ_ONLY. As the binary data is interpreted, the program issues write
>> > requests against the READ_WRITE transaction and read requests against
>> > the READ_ONLY transaction. The idea is that the read requests always
>> > run after the write requests, so that they read from the database only
>> > after all the parsed data has been written. In this setup the Firefox
>> > approach isn't as good, since it's less predictable which transaction
>> > will run first: it might depend on the binary data being parsed. Of
>> > course, you could force the writing transaction to run first by placing
>> > a request against it right after it has been created.
>> >
>> > I am however not able to think of any concrete examples of the above
>> > binary data structure that would require this setup.
>> >
>> > So the question is, which solution do you think we should go with? One
>> > thing to remember is that there is a very small difference between the
>> > two approaches here. It only makes a difference in one edge case: a
>> > transaction is created, but no requests are placed against it until
>> > another transaction, with overlapping scope, is created.
>> >
>> > The Firefox approach has strictly better performance in this edge case.
>> > However it could also have somewhat surprising results.
>> >
>> > I personally don't feel strongly either way. I also think it will
>> > rarely make a difference one way or the other, since it'll be rare for
>> > people to hit this edge case.
>> >
>> > But we should spell out clearly in the spec which approach is
>> > the conforming one.
>> >
>> > / Jonas
>> >
>>
>> In IE, the transaction that is first created locks the object stores associated
>> with it.
>> Therefore in the scenario outlined by Jonas:
>>
>> trans1 = db.transaction(["foo"], IDBTransaction.READ_WRITE);
>> trans2 = db.transaction(["foo"], IDBTransaction.READ_WRITE);
>> trans2.objectStore("foo").put("value 2", "mykey");
>> trans1.objectStore("foo").put("value 1", "mykey");
>>
>> The put on trans1 will be done before the put on trans2.  The reason is
>> that trans2 will not be able to grab a lock on object store "foo" until all
>> pending requests for trans1 are executed.
>>
>> This is the expectation our internal partners have been following.  In general,
>> we expect devs to use transactions as they create them.  We would like this to
>> be the spec'ed behavior.
>>
>> Israel
>
> If we agree on this, should we add the following text to section 3.1.7 to capture this restriction:
> "A transaction must not start until all other READ_WRITE transactions with overlapping scope have completed. When multiple transactions are eligible to be started, older transactions should be started first."

I checked in a fix which I believe takes care of this.
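
For reference, the rule in the proposed text can be modelled roughly as
below. This is only a sketch, not how any browser implements it: the
plain transaction objects and the canStart helper are hypothetical, and
I am assuming that "other" transactions means only those created
earlier, which is how I read "older transactions should be started
first".

```javascript
// Hypothetical model, not a real IndexedDB API. Each transaction is a
// plain object { mode, scope, finished }, kept in creation order.
const READ_ONLY = "readonly";
const READ_WRITE = "readwrite";

function scopesOverlap(a, b) {
  return a.scope.some((name) => b.scope.includes(name));
}

// True if `tx` may start: no older, unfinished READ_WRITE transaction
// has a scope that overlaps with the scope of `tx`.
// Assumption: only transactions created before `tx` are considered.
function canStart(tx, allInCreationOrder) {
  for (const older of allInCreationOrder) {
    if (older === tx) break; // everything after this point is newer
    if (older.finished) continue;
    if (older.mode === READ_WRITE && scopesOverlap(older, tx)) return false;
  }
  return true;
}

const trans1 = { mode: READ_WRITE, scope: ["foo"], finished: false };
const trans2 = { mode: READ_WRITE, scope: ["foo"], finished: false };
const all = [trans1, trans2];

const before = canStart(trans2, all); // trans1 is still running
trans1.finished = true;
const after = canStart(trans2, all); // trans1 has completed
```

Here `before` is false and `after` is true, i.e. trans2 waits for
trans1 no matter which of them had the first request placed against it.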

I did word the requirement somewhat differently though. The reason is
that I wanted to allow implementations to optimize somewhat more
aggressively than the above text permits. Consider for example two
READ_WRITE transactions. The first one created has a scope of just
objectStore A. The second one to be created has a scope of
objectStores A and B.

The first transaction will obviously start first. However, if the
second transaction first touches objectStore B to read and write some
data there, there really is no reason we couldn't let the
implementation start that transaction. If the second transaction then
touches objectStore A, we would have to hold all callbacks to that
transaction until the first transaction is finished.

But in this scenario the two transactions can actually run partially
in parallel.

You could even imagine that the second transaction never touches
objectStore A, even though it included it in its scope. In this case
the implementation could even let the second transaction run to
completion and commit before the first transaction finishes.
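
A rough sketch of that per-objectStore check, with made-up names
(canRunRequest and the plain transaction objects are hypothetical and
only for illustration, and both transactions are assumed to be
READ_WRITE as in the example above):

```javascript
// Hypothetical model, not a real IndexedDB API. Transactions are plain
// objects { scope, finished }, kept in creation order.

// True if a request placed by `tx` against `storeName` may run now:
// no older, unfinished transaction has that object store in its scope.
function canRunRequest(tx, storeName, allInCreationOrder) {
  if (!tx.scope.includes(storeName)) {
    throw new Error(storeName + " is not in the transaction's scope");
  }
  for (const older of allInCreationOrder) {
    if (older === tx) break; // only transactions created earlier matter
    if (!older.finished && older.scope.includes(storeName)) return false;
  }
  return true;
}

const first = { scope: ["A"], finished: false };
const second = { scope: ["A", "B"], finished: false };
const order = [first, second];

const touchB = canRunRequest(second, "B", order); // only `second` covers B
const touchA = canRunRequest(second, "A", order); // `first` still covers A
first.finished = true;
const touchALater = canRunRequest(second, "A", order); // `first` is done
```

Here `touchB` is true while `touchA` is false until the first
transaction finishes. If the second transaction never touches
objectStore A, nothing in this check ever holds it back.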

This isn't something that we've implemented in Firefox, and not
something that we foresee implementing anytime soon. But it might very
well be worth implementing in the future once we see more IndexedDB
usage in the wild.

Hope that makes sense.

/ Jonas

Received on Saturday, 15 October 2011 01:42:46 UTC