Re: [IndexedDB] transaction order from Jonas Sicking on 2011-11-18 (public-webapps@w3.org from October to December 2011)

From: Jonas Sicking <jonas@sicking.cc>
Date: Fri, 18 Nov 2011 13:37:46 -0800
To: Israel Hilerio <israelh@microsoft.com>
Cc: "public-webapps@w3.org" <public-webapps@w3.org>, Jim Wordelman <jaword@microsoft.com>, Adam Herchenroether <aherchen@microsoft.com>, Victor Ngo <vicngo@microsoft.com>
Message-ID: <CA+c2ei_BTbt1JCE=s979D4UpvODHT9GQvxx=Q5aguFv9oFu28A@mail.gmail.com>
On Mon, Oct 24, 2011 at 10:22 AM, Israel Hilerio <israelh@microsoft.com> wrote:
> On Friday, October 14, 2011 6:42 PM, Jonas Sicking wrote:
>> On Fri, Oct 14, 2011 at 1:51 PM, Israel Hilerio <israelh@microsoft.com>
>> wrote:
>> > On Friday, October 07, 2011 4:35 PM, Israel Hilerio wrote:
>> >> On Friday, October 07, 2011 2:52 PM, Jonas Sicking wrote:
>> >> > Hi All,
>> >> >
>> >> > There is one edge case regarding transaction scheduling that we'd
>> >> > like to get clarified.
>> >> >
>> >> > As the spec is written, it's clear what the following code should do:
>> >> >
>> >> > trans1 = db.transaction(["foo"], IDBTransaction.READ_WRITE);
>> >> > trans1.objectStore("foo").put("value 1", "mykey");
>> >> > trans2 = db.transaction(["foo"], IDBTransaction.READ_WRITE);
>> >> > trans2.objectStore("foo").put("value 2", "mykey");
>> >> >
>> >> > In this example it's clear that the implementation should first run
>> >> > trans1 which will put the value "value 1" in object store "foo" at
>> >> > key "mykey". The implementation should then run trans2 which will
>> >> > overwrite the same value with "value 2". The end result is
>> >> > that "value 2" is the value that lives in the object store.
>> >> >
>> >> > Note that in this case it's not at all ambiguous which transaction
>> >> > runs first. Since the two transactions have overlapping scope,
>> >> > trans2 won't even start until trans1 is committed. Even if we made
>> >> > the code something like:
>> >> >
>> >> > trans1 = db.transaction(["foo"], IDBTransaction.READ_WRITE);
>> >> > trans1.objectStore("foo").put("value 1", "mykey");
>> >> > trans2 = db.transaction(["foo"], IDBTransaction.READ_WRITE);
>> >> > trans2.objectStore("foo").put("value 2", "mykey");
>> >> > trans1.objectStore("foo").put("value 3", "mykey");
>> >> >
>> >> > we'd get the same result. Both put requests placed against trans1
>> >> > will run first while trans2 is waiting for trans1 to commit before
>> >> > it begins running since they have overlapping scopes.
>> >> >
>> >> > However, consider the following example:
>> >> > trans1 = db.transaction(["foo"], IDBTransaction.READ_WRITE);
>> >> > trans2 = db.transaction(["foo"], IDBTransaction.READ_WRITE);
>> >> > trans2.objectStore("foo").put("value 2", "mykey");
>> >> > trans1.objectStore("foo").put("value 1", "mykey");
>> >> >
>> >> > In this case, while trans1 is created first, no requests are placed
>> >> > against it, and so no database operations are started. The first
>> >> > database operation that is requested is one placed against trans2.
>> >> > In the Firefox implementation, this makes trans2 run before trans1,
>> >> > i.e. we schedule transactions when the first request is placed
>> >> > against them, and not when the IDBDatabase.transaction() function
>> >> > returns.
>> >> >
>> >> > The advantage of the Firefox approach is obvious in code like this:
>> >> >
>> >> > someElement.onclick = function() {
>> >> >   trans1 = db.transaction(["foo"], IDBTransaction.READ_WRITE);
>> >> >   ...
>> >> >   trans2 = db.transaction(["foo"], IDBTransaction.READ_WRITE);
>> >> >   trans2.objectStore("foo").put("some value", "mykey");
>> >> >   callExpensiveFunction();
>> >> > }
>> >> >
>> >> > In this example no requests are placed against trans1. However,
>> >> > since trans1 is supposed to run before trans2 does, we can't send
>> >> > off any work to the database at the time when the .put call happens
>> >> > since we don't yet know if there will be requests placed against
>> >> > trans1. Only once we return to the event loop at the end of the
>> >> > onclick handler will trans1 be "committed" and the requests in
>> >> > trans2 can be sent to the database.
>> >> >
>> >> > However, the downside of the Firefox approach is that it's harder
>> >> > for applications to control the order in which transactions run.
>> >> > Consider for example a program that is parsing a big hunk of binary
>> >> > data. Before parsing, the program starts two transactions, one
>> >> > READ_WRITE and one READ_ONLY. As the binary data is interpreted, the
>> >> > program issues write requests against the READ_WRITE transaction and
>> >> > read requests against the READ_ONLY transaction. The idea is that
>> >> > the read requests will always run after the write requests, so that
>> >> > they read from the database after all the parsed data has been
>> >> > written. In this setup the Firefox approach isn't as good since it's
>> >> > less predictable which transaction will run first, as it might
>> >> > depend on the binary data being parsed. Of course, you could force
>> >> > the writing transaction to run first by placing a request against
>> >> > it after it has been created.
>> >> >
>> >> > I am however not able to think of any concrete examples of the
>> >> > above binary data structure that would require this setup.
>> >> >
>> >> > So the question is, which solution do you think we should go with?
>> >> > One thing to remember is that there is a very small difference
>> >> > between the two approaches here. It only makes a difference in one
>> >> > edge case: a transaction is created, but no requests are placed
>> >> > against it until another transaction, with overlapping scope, is
>> >> > created.
>> >> >
>> >> > The Firefox approach has strictly better performance in this edge
>> >> > case. However, it could also have somewhat surprising results.
>> >> >
>> >> > I personally don't feel strongly either way. I also think it's rare
>> >> > to make a difference one way or another as it'll be rare for people
>> >> > to hit this edge case.
>> >> >
>> >> > But we should spell out clearly in the spec which approach is the
>> >> > conforming one.
>> >> >
>> >> > / Jonas
>> >> >
>> >>
>> >> In IE, the transaction that is first created locks the object stores
>> >> associated with it.
>> >> Therefore in the scenario outlined by Jonas:
>> >>
>> >> trans1 = db.transaction(["foo"], IDBTransaction.READ_WRITE);
>> >> trans2 = db.transaction(["foo"], IDBTransaction.READ_WRITE);
>> >> trans2.objectStore("foo").put("value 2", "mykey");
>> >> trans1.objectStore("foo").put("value 1", "mykey");
>> >>
>> >> The put on trans1 will be done before the put on trans2. The
>> >> reason is that trans2 will not be able to grab a lock on object store
>> >> "foo" until all pending requests for trans1 are executed.
>> >>
>> >> This is the expectation our internal partners have been following.
>> >> In general, we expect devs to use transactions as they create them.
>> >> We would like this to be the spec'ed behavior.
>> >>
>> >> Israel
>> >
>> > If we agree on this, should we add the following text to section 3.1.7
>> > to capture this restriction:
>> > "A transaction must not start until all other READ_WRITE transactions
>> > with overlapping scope have completed. When multiple transactions are
>> > eligible to be started, older transactions should be started first."
>>
>> I checked in a fix which I believe takes care of this.
>>
>> I did word the requirement somewhat differently though. The reason is that I
>> wanted to allow implementations to optimize a little more aggressively than
>> the above text allows. Consider for example two READ_WRITE transactions. The
>> first one created has a scope of just objectStore A. The second one to be
>> created has a scope of objectStores A and B.
>>
>> The first transaction will obviously start first. However, if the second
>> transaction first touches objectStore B to read and write some data there,
>> there really is no reason we couldn't let the implementation start that
>> transaction. If the second transaction then touches objectStore A, we would
>> have to hold all callbacks to that transaction until the first transaction is
>> finished.
>>
>> But in this scenario the two transactions can actually run partially in parallel.
>>
>> You could even imagine that the second transaction never touches objectStore
>> A, even though it included it in its scope. In this case the implementation
>> could even let the second transaction run to completion and commit before
>> the first transaction finishes.
>>
>> This isn't something that we've implemented in Firefox, and not something
>> that we foresee implementing anytime soon. But it might very well be worth
>> implementing in the future once we see more IndexedDB usage in the wild.
>>
>> Hope that makes sense.
>>
>> / Jonas
>
> Your point makes sense to us and we like the new wording.  Did you already update the spec?

Yup, this is already in the spec.
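
For anyone following along, here is a rough sketch of how the two scheduling
policies discussed in this thread differ on the edge case above. It is a toy
model only, not IndexedDB and not the spec text: simulate(), the event
objects, and the policy names "creation" and "firstRequest" are all made up
for illustration, and it assumes every transaction is READ_WRITE with
overlapping scope, so transactions run strictly one after another.

```javascript
// Toy model (hypothetical names throughout): replay a list of events and
// return the final store contents under a given scheduling policy.
//   "creation"     - transactions run in the order they were created
//   "firstRequest" - transactions run in the order of their first request
function simulate(events, policy) {
  const txns = new Map(); // txn name -> { created, firstRequest, puts }
  events.forEach((event, position) => {
    if (event.type === "create") {
      txns.set(event.txn, { created: position, firstRequest: Infinity, puts: [] });
    } else {
      const txn = txns.get(event.txn);
      txn.firstRequest = Math.min(txn.firstRequest, position);
      txn.puts.push(event);
    }
  });

  const rank = (txn) => (policy === "creation" ? txn.created : txn.firstRequest);
  // Ties (e.g. two transactions with no requests) fall back to creation order.
  const ordered = [...txns.values()].sort(
    (a, b) => rank(a) - rank(b) || a.created - b.created
  );

  // Overlapping READ_WRITE scopes: each transaction runs all of its
  // requests before the next transaction starts.
  const store = new Map();
  for (const txn of ordered) {
    for (const put of txn.puts) {
      store.set(put.key, put.value);
    }
  }
  return store;
}

// The edge case from this thread: trans1 is created first, but the first
// request is placed against trans2.
const events = [
  { type: "create", txn: "trans1" },
  { type: "create", txn: "trans2" },
  { type: "put", txn: "trans2", key: "mykey", value: "value 2" },
  { type: "put", txn: "trans1", key: "mykey", value: "value 1" },
];

const byCreation = simulate(events, "creation").get("mykey");
const byFirstRequest = simulate(events, "firstRequest").get("mykey");
console.log(byCreation, "/", byFirstRequest);
```

Under "creation" the value left in the store is the one written by trans2,
while under "firstRequest" it is the one written by trans1, which is exactly
the difference the thread is about.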

/ Jonas
Received on Friday, 18 November 2011 21:38:44 UTC