Re: [IndexedDB] Current editor's draft from Jeremy Orlow on 2010-07-23 (public-webapps@w3.org from July to September 2010)

From: Jeremy Orlow <jorlow@chromium.org>
Date: Thu, 22 Jul 2010 20:18:54 -0400
To: Pablo Castro <Pablo.Castro@microsoft.com>
Cc: Jonas Sicking <jonas@sicking.cc>, Nikunj Mehta <nikunj@o-micron.com>, Andrei Popescu <andreip@google.com>, public-webapps <public-webapps@w3.org>
Message-ID: <AANLkTimMrtaxDPx2qa-6-u8pUZyeDwPnznB85BsuQq__@mail.gmail.com>
On Thu, Jul 22, 2010 at 7:41 PM, Pablo Castro <Pablo.Castro@microsoft.com> wrote:

>
> From: Jonas Sicking [mailto:jonas@sicking.cc]
> Sent: Thursday, July 22, 2010 11:27 AM
>
> >> On Thu, Jul 22, 2010 at 3:43 AM, Nikunj Mehta <nikunj@o-micron.com>
> wrote:
> >> >
> >> > On Jul 16, 2010, at 5:41 AM, Pablo Castro wrote:
> >> >
> >> >>
> >> >> From: jorlow@google.com [mailto:jorlow@google.com] On Behalf Of
> Jeremy Orlow
> >> >> Sent: Thursday, July 15, 2010 8:41 AM
> >> >>
> >> >> On Thu, Jul 15, 2010 at 4:30 PM, Andrei Popescu <andreip@google.com>
> wrote:
> >> >> On Thu, Jul 15, 2010 at 3:24 PM, Jeremy Orlow <jorlow@chromium.org>
> wrote:
> >> >>> On Thu, Jul 15, 2010 at 3:09 PM, Andrei Popescu <andreip@google.com>
> wrote:
> >> >>>>
> >> >>>> On Thu, Jul 15, 2010 at 9:50 AM, Jeremy Orlow <jorlow@chromium.org>
> wrote:
> >> >>>>>>>> Nikunj, could you clarify how locking works for the dynamic
> >> >>>>>>>> transactions proposal that is in the spec draft right now?
> >> >>>>>>>
> >> >>>>>>> I'd definitely like to hear what Nikunj originally intended
> here.
> >> >>>>>>>>
> >> >>>>>>
> >> >>>>>> Hmm, after re-reading the current spec, my understanding is that:
> >> >>>>>>
> >> >>>>>> - Scope consists in a set of object stores that the transaction
> operates
> >> >>>>>> on.
> >> >>>>>> - A connection may have zero or one active transactions.
> >> >>>>>> - There may not be any overlap among the scopes of all active
> >> >>>>>> transactions (static or dynamic) in a given database. So you
> cannot
> >> >>>>>> have two READ_ONLY static transactions operating simultaneously
> over
> >> >>>>>> the same object store.
> >> >>>>>> - The granularity of locking for dynamic transactions is not
> specified
> >> >>>>>> (all the spec says about this is "do not acquire locks on any
> database
> >> >>>>>> objects now. Locks are obtained as the application attempts to
> access
> >> >>>>>> those objects").
> >> >>>>>> - Using dynamic transactions can lead to deadlocks.
> >> >>>>>>
> >> >>>>>> Given the changes in 9975, here's what I think the spec should
> say for
> >> >>>>>> now:
> >> >>>>>>
> >> >>>>>> - There can be multiple active static transactions, as long as
> their
> >> >>>>>> scopes do not overlap, or the overlapping objects are locked in
> modes
> >> >>>>>> that are not mutually exclusive.
> >> >>>>>> - [If we decide to keep dynamic transactions] There can be
> multiple
> >> >>>>>> active dynamic transactions. TODO: Decide what to do if they
> start
> >> >>>>>> overlapping:
> >> >>>>>>   -- proceed anyway and then fail at commit time in case of
> >> >>>>>> conflicts. However, I think this would require implementing MVCC,
> so
> >> >>>>>> implementations that use SQLite would be in trouble?
> >> >>>>>
> >> >>>>> Such implementations could just lock more conservatively (i.e. not
> allow
> >> >>>>> other transactions during a dynamic transaction).
> >> >>>>>
> >> >>>> Umm, I am not sure how useful dynamic transactions would be in that
> >> >>>> case...Ben Turner made the same comment earlier in the thread and I
> >> >>>> agree with him.
> >> >>>>
> >> >>>> Yes, dynamic transactions would not be useful on those
> implementations, but the point is that you could still implement the spec
> without an MVCC backend--though it would limit the concurrency that's
> possible.  Thus "implementations that use SQLite would" NOT necessarily "be
> in trouble".
> >> >>
> >> >> Interesting, I'm glad this conversation came up so we can sync up on
> assumptions...mine were:
> >> >> - There can be multiple transactions of any kind active against a
> given database session (see note below)
> >> >> - Multiple static transactions may overlap as long as they have
> compatible modes, which in practice means they are all READ_ONLY
> >> >> - Dynamic transactions have arbitrary granularity for scope
> (implementation specific, down to row-level locking/scope)
> >> >
> >> > Dynamic transactions should be able to lock as little as necessary and
> as late as required.
> >>
> >> So dynamic transactions, as defined in your proposal, didn't lock on a
> >> whole-objectStore level? If so, how does the author specify which rows
> >> are locked? And why then is openObjectStore an asynchronous operation
> >> that could possibly fail, since at the time when openObjectStore is
> >> called, the implementation doesn't know which rows are going to be
> >> accessed and so can't determine if a deadlock is occurring? And is it
> >> only possible to lock existing rows, or can you prevent new records
> >> from being created? And is it possible to only use read-locking for
> >> some rows, but write-locking for others, in the same objectStore?
>
> That's my interpretation, dynamic transactions don't lock whole object
> stores. To me dynamic transactions are the same as what typical SQL
> databases do today.
>
> The author doesn't explicitly specify which rows to lock. All rows that you
> "see" become locked (e.g. through get(), put(), scanning with a cursor,
> etc.). If you start the transaction as read-only then they'll all have
> shared locks. If you start the transaction as read-write then we can choose
> whether the implementation should always attempt to take exclusive locks or
> if it should take shared locks on read, and attempt to upgrade to an
> exclusive lock on first write (this affects failure modes a bit).
>
> Regarding deadlocks, that's right, the implementation cannot determine if a
> deadlock will occur ahead of time. Sophisticated implementations could track
> locks/owners and do deadlock detection, although a simple timeout-based
> mechanism is probably enough for IndexedDB.
>

Simple implementations will not deadlock because they're only doing object
store level locking in a constant locking order.  Sophisticated
implementations will be doing key level (IndexedDB's analog to row level)
locking with deadlock detection or using methods to completely avoid it.
I'm not sure I'm comfortable with having one or two in-between
implementations relying on timeouts to resolve deadlocks.
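
To illustrate the constant-locking-order point, here is a minimal sketch in
Python. It is not taken from the spec or from any implementation: the
simulate() helper, the transaction names, and the object store names are all
hypothetical, and every lock is treated as exclusive. It steps two
transactions through their lock requests round-robin and reports whether
they all finish or get stuck waiting on each other.

```python
def simulate(requests):
    """Round-robin simulation of exclusive object-store locking.

    requests maps a (hypothetical) transaction name to the list of object
    stores it locks, in the order it asks for them. Returns True if every
    transaction completes, False if none of them can make progress.
    """
    owner = {}                                # store name -> holding transaction
    position = {txn: 0 for txn in requests}   # index of the next lock to take
    done = {txn for txn, stores in requests.items() if not stores}

    while len(done) < len(requests):
        progressed = False
        for txn, stores in requests.items():
            if txn in done:
                continue
            store = stores[position[txn]]
            # The lock is granted if the store is free or already ours.
            if owner.get(store, txn) == txn:
                owner[store] = txn
                position[txn] += 1
                progressed = True
                if position[txn] == len(stores):
                    # "Commit": release everything this transaction holds.
                    for held in stores:
                        owner.pop(held, None)
                    done.add(txn)
        if not progressed:
            return False                      # everyone is blocked: deadlock
    return True


# Each transaction takes its locks in whatever order it happens to ask.
unordered = {"t1": ["books", "authors"], "t2": ["authors", "books"]}

# The same scopes, but locks are always taken in sorted (constant) order.
ordered = {txn: sorted(stores) for txn, stores in unordered.items()}

print(simulate(unordered))  # False
print(simulate(ordered))    # True
```

In the unordered case each transaction holds one store and waits for the
other; with a single global order the second transaction simply waits its
turn and no cycle can form.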

Of course, if we're breaking deadlocks that means that web developers need
to handle this error case on every async request they make.  As such, I'd
rather that we require implementations to make deadlocks impossible.  This
means that they either need to be conservative about locking or to do MVCC
(or something similar) so that transactions can continue on even beyond the
point where we know they can't be serialized.  This would be consistent with
our usual policy of trying to put as much of the burden as is practical on
the browser developers rather than web developers.
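
As a rough sketch of the "conservative about locking" option (assumptions:
the ConservativeScheduler class, its method names, and the store names are
invented for illustration; only the READ_ONLY / read-write distinction comes
from the discussion), a transaction could be admitted only when its whole
scope is compatible with every active transaction. Because a transaction
never holds some locks while waiting for others, no deadlock can arise and
there is no deadlock error for the page to handle.

```python
READ_ONLY = "READ_ONLY"
READ_WRITE = "READ_WRITE"


class ConservativeScheduler:
    """Admits a transaction only if its entire scope is available up front."""

    def __init__(self):
        self.active = {}  # transaction id -> (frozenset of stores, mode)

    def try_start(self, txn, scope, mode):
        """Return True and register txn if it can run now, else False."""
        scope = frozenset(scope)
        for other_scope, other_mode in self.active.values():
            overlaps = bool(scope & other_scope)
            # Overlap is only tolerated when both sides are read-only.
            if overlaps and READ_WRITE in (mode, other_mode):
                return False
        self.active[txn] = (scope, mode)
        return True

    def finish(self, txn):
        """Release the whole scope at once on commit or abort."""
        self.active.pop(txn, None)


scheduler = ConservativeScheduler()
print(scheduler.try_start("a", {"books"}, READ_ONLY))             # True
print(scheduler.try_start("b", {"books", "authors"}, READ_ONLY))  # True
print(scheduler.try_start("c", {"authors"}, READ_WRITE))          # False
scheduler.finish("b")
print(scheduler.try_start("c", {"authors"}, READ_WRITE))          # True
```

A real implementation would queue the rejected transaction and start it once
the conflicting ones finish, rather than returning False to the caller.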


> As for locking only existing rows, that depends on how much isolation we
> want to provide. If we want "serializable", then we'd have to put in things
> such as range locks and locks on non-existing keys so reads are consistent
> w.r.t. newly created rows.
>

For the record, I am completely against anything other than "serializable"
being the default.  Everything a web developer deals with follows
run-to-completion semantics.  If you want to have optional modes that relax
serializability, maybe we should start a new thread?


> I'm not sure why openObjectStore would need to be asynchronous in this
> context. In the past this was the case because metadata wasn't locked by the
> fact that you had an open database object, so openObjectStore involved I/O
> and possibly contention against schema modification operations. Now that
> openObjectStore doesn't have to deal with contention (and implementations
> will probably cache the database catalog) there is no reason to make it
> async that I can think of.
>

Agreed.

J
Received on Friday, 23 July 2010 00:19:47 UTC