RE: [IndexedDB] transaction order from Israel Hilerio on 2011-10-24 (public-webapps@w3.org from October to December 2011)

From: Israel Hilerio <israelh@microsoft.com>
Date: Mon, 24 Oct 2011 17:22:25 +0000
To: Jonas Sicking <jonas@sicking.cc>
CC: "public-webapps@w3.org" <public-webapps@w3.org>, Jim Wordelman <jaword@microsoft.com>, Adam Herchenroether <aherchen@microsoft.com>, "Victor Ngo" <vicngo@microsoft.com>
Message-ID: <F695AF7AA77CC745A271AD0F61BBC61E3F515BD6@TK5EX14MBXC115.redmond.corp.microsoft.>
On Friday, October 14, 2011 6:42 PM, Jonas Sicking wrote:
> On Fri, Oct 14, 2011 at 1:51 PM, Israel Hilerio <israelh@microsoft.com>
> wrote:
> > On Friday, October 07, 2011 4:35 PM, Israel Hilerio wrote:
> >> On Friday, October 07, 2011 2:52 PM, Jonas Sicking wrote:
> >> > Hi All,
> >> >
> >> > There is one edge case regarding transaction scheduling that we'd
> >> > like to get clarified.
> >> >
> >> > As the spec is written, it's clear what the following code should do:
> >> >
> >> > trans1 = db.transaction(["foo"], IDBTransaction.READ_WRITE);
> >> > trans1.objectStore("foo").put("value 1", "mykey");
> >> > trans2 = db.transaction(["foo"], IDBTransaction.READ_WRITE);
> >> > trans2.objectStore("foo").put("value 2", "mykey");
> >> >
> >> > In this example it's clear that the implementation should first run
> >> > trans1 which will put the value "value 1" in object store "foo" at
> >> > key "mykey". The implementation should then run trans2 which will
> >> > overwrite the same value with "value 2". The end result is
> >> > that "value 2" is the value that lives in the object store.
> >> >
> >> > Note that in this case it's not at all ambiguous which transaction
> >> > runs first. Since the two transactions have overlapping scope, trans2
> >> > won't even start until trans1 is committed. Even if we made the code
> >> > something like:
> >> >
> >> > trans1 = db.transaction(["foo"], IDBTransaction.READ_WRITE);
> >> > trans1.objectStore("foo").put("value 1", "mykey");
> >> > trans2 = db.transaction(["foo"], IDBTransaction.READ_WRITE);
> >> > trans2.objectStore("foo").put("value 2", "mykey");
> >> > trans1.objectStore("foo").put("value 3", "mykey");
> >> >
> >> > we'd get the same result. Both put requests placed against trans1
> >> > will run first while trans2 is waiting for trans1 to commit before
> >> > it begins running since they have overlapping scopes.
> >> >
> >> > However, consider the following example:
> >> > trans1 = db.transaction(["foo"], IDBTransaction.READ_WRITE);
> >> > trans2 = db.transaction(["foo"], IDBTransaction.READ_WRITE);
> >> > trans2.objectStore("foo").put("value 2", "mykey");
> >> > trans1.objectStore("foo").put("value 1", "mykey");
> >> >
> >> > In this case, while trans1 is created first, no requests are placed
> >> > against it, and so no database operations are started. The first
> >> > database operation that is requested is one placed against trans2.
> >> > In the Firefox implementation, this makes trans2 run before trans1. I.e.
> >> > we schedule transactions when the first request is placed against
> >> > them, and not when the IDBDatabase.transaction() function returns.
> >> >
> >> > The advantage of the Firefox approach is obvious in code like this:
> >> >
> >> > someElement.onclick = function() {
> >> >   trans1 = db.transaction(["foo"], IDBTransaction.READ_WRITE);
> >> >   ...
> >> >   trans2 = db.transaction(["foo"], IDBTransaction.READ_WRITE);
> >> >   trans2.objectStore("foo").put("some value", "mykey");
> >> >   callExpensiveFunction();
> >> > }
> >> >
> >> > In this example no requests are placed against trans1. However,
> >> > since trans1 is supposed to run before trans2 does, we can't send off
> >> > any work to the database at the time when the .put call happens since
> >> > we don't yet know if there will be requests placed against trans1.
> >> > Only once we return to the event loop at the end of the onclick
> >> > handler will trans1 be "committed" and the requests in trans2 can be
> >> > sent to the database.
> >> >
> >> > However, the downside with the Firefox approach is that it's harder
> >> > for applications to control which order transactions are run in.
> >> > Consider, for example, a program that is parsing a big hunk of binary
> >> > data. Before parsing, the program starts two transactions, one
> >> > READ_WRITE and one READ_ONLY. As the binary data is interpreted, the
> >> > program issues write requests against the READ_WRITE transaction and
> >> > read requests against the READ_ONLY transaction. The idea being that
> >> > the read requests will always run after the write requests, to read
> >> > from the database after all the parsed data has been written. In this
> >> > setup the Firefox approach isn't as good since it's less predictable
> >> > which transaction will run first, as it might depend on the binary
> >> > data being parsed. Of course, you could force the writing transaction
> >> > to run first by placing a request against it after it has been
> >> > created.
> >> >
> >> > I am however not able to think of any concrete examples of the
> >> > above binary data structure that would require this setup.
> >> >
> >> > So the question is, which solution do you think we should go with?
> >> > One thing to remember is that there is a very small difference
> >> > between the two approaches here. It only makes a difference in edge
> >> > cases. The edge case being that a transaction is created, but no
> >> > requests are placed against it until another transaction, with
> >> > overlapping scope, is created.
> >> >
> >> > The Firefox approach has strictly better performance in this edge case.
> >> > However it could also have somewhat surprising results.
> >> >
> >> > I personally don't feel strongly either way. I also think it's rare
> >> > to make a difference one way or another as it'll be rare for people
> >> > to hit this edge case.
> >> >
> >> > But we should spell things out clearly in the spec which approach
> >> > is the conforming one.
> >> >
> >> > / Jonas
> >> >
> >>
> >> In IE, the transaction that is first created locks the object stores
> >> associated with it.
> >> Therefore in the scenario outlined by Jonas:
> >>
> >> trans1 = db.transaction(["foo"], IDBTransaction.READ_WRITE);
> >> trans2 = db.transaction(["foo"], IDBTransaction.READ_WRITE);
> >> trans2.objectStore("foo").put("value 2", "mykey");
> >> trans1.objectStore("foo").put("value 1", "mykey");
> >>
> >> The put on trans1 will be done first before the put on trans2.  The
> >> reason is that trans2 will not be able to grab a lock on object store
> >> "foo" until all pending requests for trans1 are executed.
> >>
> >> This is the expectation our internal partners have been following.
> >> In general, we expect devs to use transactions as they create them.
> >> We would like this to be the spec'ed behavior.
> >>
> >> Israel
> >
> > If we agree on this, should we add the following text to section 3.1.7
> > to capture this restriction:
> > "A transaction must not start until all other READ_WRITE transactions
> > with overlapping scope have completed. When multiple transactions are
> > eligible to be started, older transactions should be started first."
> 
> I checked in a fix which I believe takes care of this.
> 
> I did word the requirement somewhat differently though. The reason is that I
> wanted to allow implementations to optimize a little bit heavier than the
> above text. Consider for example two READ_WRITE transactions. The first one
> created has a scope of just objectStore A.
> The second one to be created has a scope of objectStores A and B.
> 
> The first transaction will obviously start first. However, if the second
> transaction first touches objectStore B to read and write some data there,
> there really is no reason we couldn't let the implementation start that
> transaction. If the second transaction then touches objectStore A, we would
> have to hold all callbacks to that transaction until the first transaction is
> finished.
> 
> But in this scenario the two transactions can actually run partially in parallel.
> 
> You could even imagine that the second transaction never touches objectStore
> A, even though it included it in its scope. In this case the implementation
> could even let the second transaction run to completion and commit before
> the first transaction finishes.
> 
> This isn't something that we've implemented in Firefox, and not something
> that we foresee implementing anytime soon. But it might very well be worth
> implementing in the future once we see more IndexedDB usage in the wild.
> 
> Hope that makes sense.
> 
> / Jonas

Your point makes sense to us and we like the new wording.  Did you already update the spec?
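
For anyone reading the archive later, here is a rough sketch of the edge case this thread is about. It is a toy model in plain JavaScript, not IndexedDB and not either browser's implementation: MiniDb, its methods, and the "creation" / "firstRequest" policy names are all made up for illustration. It only assumes what was described above, namely that overlapping READ_WRITE transactions run one at a time, and that the open question is whether they are ordered by creation or by first request.

```javascript
// Toy model only. MiniDb and the policy names are hypothetical, not a real API.
class MiniDb {
  constructor(policy) {
    this.policy = policy;    // "creation" or "firstRequest"
    this.store = new Map();  // stands in for object store "foo"
    this.transactions = [];  // kept in creation order
    this.clock = 0;          // counts requests as they are placed
  }

  // Create a read/write transaction whose scope is the single store.
  transaction() {
    const db = this;
    const tx = {
      created: db.transactions.length,
      firstRequest: Number.MAX_SAFE_INTEGER, // no request placed yet
      puts: [],
    };
    db.transactions.push(tx);
    return {
      put(value, key) {
        if (tx.puts.length === 0) tx.firstRequest = db.clock;
        db.clock += 1;
        tx.puts.push([key, value]);
      },
    };
  }

  // Run all transactions serially (their scopes overlap) in the order the
  // chosen policy dictates, and return the resulting store contents.
  run() {
    const rank = (tx) =>
      this.policy === "creation" ? tx.created : tx.firstRequest;
    const order = [...this.transactions].sort((a, b) => rank(a) - rank(b));
    for (const tx of order) {
      for (const [key, value] of tx.puts) this.store.set(key, value);
    }
    return this.store;
  }
}

// The edge case from the thread: trans1 is created first, but the first
// request is placed against trans2.
function edgeCase(policy) {
  const db = new MiniDb(policy);
  const trans1 = db.transaction();
  const trans2 = db.transaction();
  trans2.put("value 2", "mykey");
  trans1.put("value 1", "mykey");
  return db.run().get("mykey");
}

console.log(edgeCase("creation"));     // trans1 runs first, so trans2's put wins
console.log(edgeCase("firstRequest")); // trans2 runs first, so trans1's put wins
```

Under creation ordering the value left in the store is the one written by trans2, and under first-request ordering it is the one written by trans1, which is the observable difference between the two behaviors discussed above. The sketch ignores the partial-parallelism optimization Jonas describes for transactions with differing scopes.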

Israel
Received on Monday, 24 October 2011 17:22:56 UTC