Re: [IndexedDB] Transaction ordering for readonly transactions from Joshua Bell on 2014-03-10 (public-webapps@w3.org from January to March 2014)

From: Joshua Bell <jsbell@google.com>
Date: Mon, 10 Mar 2014 10:35:36 -0700
To: Jonas Sicking <jonas@sicking.cc>
Cc: Webapps WG <public-webapps@w3.org>
Message-ID: <CAD649j7hoPvDo1hP81iPWMCtJONB3P4axA_K8do__=ub_F_f3A@mail.gmail.com>
On Fri, Mar 7, 2014 at 5:24 PM, Jonas Sicking <jonas@sicking.cc> wrote:

> Hi all,
>
> Currently the IndexedDB spec has strict requirements around the
> ordering for readwrite transactions. The spec says:
>
> "If multiple "readwrite" transactions are attempting to access the
> same object store (i.e. if they have overlapping scope), the
> transaction that was created first MUST be the transaction which gets
> access to the object store first"
>
> However there is very little language about the order in which
> readonly transactions should run. Specifically, there is nothing that
> says that if a readonly transaction is created after a readwrite
> transaction, that the readonly transaction runs after the readwrite
> transaction. This is true even if the two transactions have
> overlapping scopes.
>
> Chrome apparently takes advantage of this and actually sometimes runs
> readonly transactions before a readwrite transaction, even if the
> readonly transaction was created after the readwrite transaction.
>

That is correct.


>
> This means that a readonly transaction that's started after a
> readwrite transaction may or may not see the data that was written by
> the readwrite transaction.
>
> This does seem like a nice optimization. Especially for
> implementations that use MVCC since it means that it can run the
> readonly and the readwrite transaction in parallel.
>

Another benefit is that a connection that's issuing a series of readonly
transactions won't suddenly pause just because a different connection in
another page is starting a readwrite transaction.

>
> However I think the result is a bit confusing. I'm not so much worried
> about people getting callbacks in a different order, even though in
> theory those callbacks could have side effects that will now happen in
> a different order. The more concerning thing is that the page will see
> different data in the database.
>
> One example of confusion is in this github thread:
>
> https://github.com/js-platform/filer/issues/128#issuecomment-36633317
>
> This is a library which implements a filesystem API on top of IDB. Due
> to this optimization, writing a file and then checking if it exists
> may or may not succeed depending on if the transactions got reordered
> or not.
>

And we (Chrome) have also had developer feedback that allowing readonly
transactions to "slip ahead" of busy/blocked readwrite transactions is
surprising.

That said, developers (1) have been quick to understand that implicit
transaction ordering should be made explicit by not creating a dependent
transaction until the previous one has actually completed (probably
fixing some application logic bugs at the same time), and (2) have taken
advantage of readonly transactions not blocking on readwrite transactions,
achieving much higher throughput without implementing their own data
caching layer.
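
A minimal TypeScript sketch of point (1), written against small stand-in
interfaces rather than the browser's IndexedDB typings so that it runs
outside a browser. `writeThenCheck` and the `*Like` interfaces are
hypothetical names; the calls they model (`transaction`, `objectStore`,
`put`, `get`, `oncomplete`, `onabort`, `onsuccess`) are the IndexedDB ones.
The readonly transaction is created only inside the readwrite transaction's
complete handler, so the check cannot be scheduled ahead of the write:

```typescript
// Stand-in shapes (assumptions) for the parts of IDBRequest, IDBObjectStore,
// IDBTransaction and IDBDatabase that this sketch touches.
interface RequestLike {
  onsuccess: ((ev: any) => any) | null;
  result: any;
}
interface StoreLike {
  put(value: any, key?: any): RequestLike;
  get(key: any): RequestLike;
}
interface TxLike {
  objectStore(name: string): StoreLike;
  oncomplete: ((ev: any) => any) | null;
  onabort: ((ev: any) => any) | null;
}
interface DbLike {
  transaction(scope: string | string[], mode?: "readonly" | "readwrite"): TxLike;
}

// Hypothetical helper: write a record, then check that it exists.
function writeThenCheck(
  db: DbLike,
  store: string,
  key: string,
  value: unknown,
  done: (exists: boolean) => void
): void {
  const write = db.transaction(store, "readwrite");
  write.objectStore(store).put(value, key);

  write.oncomplete = () => {
    // The write has finished, so a transaction created now cannot
    // slip ahead of it regardless of how the browser schedules.
    const read = db.transaction(store, "readonly");
    const req = read.objectStore(store).get(key);
    req.onsuccess = () => done(req.result !== undefined);
  };

  // For brevity an aborted write is reported as "not found".
  write.onabort = () => done(false);
}
```

Creating the readonly transaction immediately after `put`, without waiting
for `oncomplete`, is the pattern that may or may not see the written data
under the reordering described above.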

So... I'm definitely of two minds here. Removing this optimization would
help developers in simple cases, but would hinder larger-scale web apps.
Other opinions?


> I'd like to strengthen the default ordering requirements and say that
> two transactions must run in the order they were created if they have
> overlapping scopes and either of them is a readwrite transaction.
>
> But I'd be totally open to adding some syntax to opt in to more
> flexible transaction ordering. Possibly by introducing a new
> transaction type.
>

Making the complexity opt-in sounds like a reasonable compromise.
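
For concreteness, the two ordering rules under discussion can be written as
predicates over a pair of transactions. This is only a sketch in TypeScript:
`TxInfo`, `currentRuleOrders` and `proposedRuleOrders` are hypothetical
names, and the first predicate models only the spec sentence quoted at the
top of this thread, not every scheduling constraint in the spec.

```typescript
type Mode = "readonly" | "readwrite";

// Hypothetical descriptor: a transaction's mode and the object stores in its scope.
interface TxInfo {
  mode: Mode;
  scope: string[];
}

// True when the two transactions name at least one common object store.
function scopesOverlap(a: TxInfo, b: TxInfo): boolean {
  return a.scope.some((name) => b.scope.indexOf(name) !== -1);
}

// Quoted spec text: creation order binds only readwrite/readwrite pairs
// with overlapping scope.
function currentRuleOrders(a: TxInfo, b: TxInfo): boolean {
  return scopesOverlap(a, b) && a.mode === "readwrite" && b.mode === "readwrite";
}

// Proposal: creation order binds any overlapping pair in which
// at least one transaction is readwrite.
function proposedRuleOrders(a: TxInfo, b: TxInfo): boolean {
  return scopesOverlap(a, b) && (a.mode === "readwrite" || b.mode === "readwrite");
}
```

A readwrite transaction followed by a readonly one on the same store is
the case the two predicates disagree on; two readonly transactions stay
unordered under both.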


>
> Btw, when are we starting officially working on IDB v2? :)
>

ASAP! We've got some things implemented behind experimental flags in Chrome
(binary keys, continuing-on-primary-key, etc.) and want to push forward with
more details on events, storage types (persistent vs. temporary), etc.
Perhaps a topic for the F2F next month (offline or during the meeting?)
would be "current best practices for 'v2' specs"?


>
> / Jonas
>
>
Received on Monday, 10 March 2014 17:36:12 UTC