Re: [IndexedDB] Current editor's draft

On Fri, Jul 23, 2010 at 4:21 PM, Jonas Sicking <jonas@sicking.cc> wrote:

> On Fri, Jul 23, 2010 at 8:09 AM, Nikunj Mehta <nikunj@o-micron.com> wrote:
> >
> > On Jul 22, 2010, at 11:27 AM, Jonas Sicking wrote:
> >
> >> On Thu, Jul 22, 2010 at 3:43 AM, Nikunj Mehta <nikunj@o-micron.com>
> wrote:
> >>>
> >>> On Jul 16, 2010, at 5:41 AM, Pablo Castro wrote:
> >>>
> >>>>
> >>>> From: jorlow@google.com [mailto:jorlow@google.com] On Behalf Of
> Jeremy Orlow
> >>>> Sent: Thursday, July 15, 2010 8:41 AM
> >>>>
> >>>> On Thu, Jul 15, 2010 at 4:30 PM, Andrei Popescu <andreip@google.com>
> wrote:
> >>>> On Thu, Jul 15, 2010 at 3:24 PM, Jeremy Orlow <jorlow@chromium.org>
> wrote:
> >>>>> On Thu, Jul 15, 2010 at 3:09 PM, Andrei Popescu <andreip@google.com>
> wrote:
> >>>>>>
> >>>>>> On Thu, Jul 15, 2010 at 9:50 AM, Jeremy Orlow <jorlow@chromium.org>
> wrote:
> >>>>>>>>>> Nikunj, could you clarify how locking works for the dynamic
> >>>>>>>>>> transactions proposal that is in the spec draft right now?
> >>>>>>>>>
> >>>>>>>>> I'd definitely like to hear what Nikunj originally intended here.
> >>>>>>>>>>
> >>>>>>>>
> >>>>>>>> Hmm, after re-reading the current spec, my understanding is that:
> >>>>>>>>
> >>>>>>>> - Scope consists of a set of object stores that the transaction
> >>>>>>>> operates on.
> >>>>>>>> - A connection may have zero or one active transaction.
> >>>>>>>> - There may not be any overlap among the scopes of all active
> >>>>>>>> transactions (static or dynamic) in a given database. So you
> cannot
> >>>>>>>> have two READ_ONLY static transactions operating simultaneously
> over
> >>>>>>>> the same object store.
> >>>>>>>> - The granularity of locking for dynamic transactions is not
> specified
> >>>>>>>> (all the spec says about this is "do not acquire locks on any
> database
> >>>>>>>> objects now. Locks are obtained as the application attempts to
> access
> >>>>>>>> those objects").
> >>>>>>>> - Using dynamic transactions can lead to deadlocks.
> >>>>>>>>
> >>>>>>>> Given the changes in 9975, here's what I think the spec should say
> for
> >>>>>>>> now:
> >>>>>>>>
> >>>>>>>> - There can be multiple active static transactions, as long as
> their
> >>>>>>>> scopes do not overlap, or the overlapping objects are locked in
> modes
> >>>>>>>> that are not mutually exclusive.
> >>>>>>>> - [If we decide to keep dynamic transactions] There can be
> multiple
> >>>>>>>> active dynamic transactions. TODO: Decide what to do if they start
> >>>>>>>> overlapping:
> >>>>>>>>   -- proceed anyway and then fail at commit time in case of
> >>>>>>>> conflicts. However, I think this would require implementing MVCC,
> so
> >>>>>>>> implementations that use SQLite would be in trouble?
> >>>>>>>
> >>>>>>> Such implementations could just lock more conservatively (i.e. not
> allow
> >>>>>>> other transactions during a dynamic transaction).
> >>>>>>>
> >>>>>> Umm, I am not sure how useful dynamic transactions would be in that
> >>>>>> case...Ben Turner made the same comment earlier in the thread and I
> >>>>>> agree with him.
> >>>>>>
> >>>>>> Yes, dynamic transactions would not be useful on those
> implementations, but the point is that you could still implement the spec
> without an MVCC backend--though it would limit the concurrency that's
> possible.  Thus "implementations that use SQLite would" NOT necessarily "be
> in trouble".
> >>>>
> >>>> Interesting, I'm glad this conversation came up so we can sync up on
> >>>> assumptions... mine were:
> >>>> - There can be multiple transactions of any kind active against a
> given database session (see note below)
> >>>> - Multiple static transactions may overlap as long as they have
> compatible modes, which in practice means they are all READ_ONLY
> >>>> - Dynamic transactions have arbitrary granularity for scope
> (implementation specific, down to row-level locking/scope)
> >>>
> >>> Dynamic transactions should be able to lock as little as necessary and
> as late as required.
> >>
> >> So dynamic transactions, as defined in your proposal, didn't lock on a
> >> whole-objectStore level?
> >
> > That is not correct. I said that the original intention was to make
> dynamic transactions lock as little and as late as possible. However, the
> current spec does not explicitly state that dynamic transactions must not
> lock the entire objectStore, so an implementation could.
> >
> >> If so, how does the author specify which rows
> >> are locked?
> >
> > Again, the intention is to derive this directly from the actions performed
> by the application and the keys they affect.
>
> The two statements above confuse me.
>
> The important question is: Pablo is clearly suggesting that dynamic
> transactions should not use whole-objectStore locks, but rather
> row-level locks, or possibly range locks. Is this what you are
> suggesting too?
>

I'm not sure why this matters at all.  The point is that we need to use some
locking scheme that ensures serializability and ideally it'd be done in a
way that allows a good deal of concurrency.  I don't think the spec should
force any particular locking scheme (or pessimistic transactions in
general).
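
To make the concurrency point concrete, here's a rough sketch of the
user-visible behavior, written against the IndexedDB call shapes browsers
eventually shipped (db.transaction(scope, mode) with string modes) rather
than the current draft; the store name and helpers are made up. Two
read-only transactions over the same store can be active at once because
their modes are compatible, and how the engine makes that safe is its own
business:

  function countItems(db: IDBDatabase): Promise<number> {
    return new Promise((resolve, reject) => {
      // A read-only scope over the "items" store; another readonly
      // transaction over the same store can run at the same time.
      const tx = db.transaction("items", "readonly");
      const request = tx.objectStore("items").count();
      request.onsuccess = () => resolve(request.result);
      request.onerror = () => reject(request.error);
    });
  }

  async function concurrentReads(db: IDBDatabase): Promise<number> {
    // Both transactions may be active simultaneously; neither blocks the
    // other, whatever locking (or no locking) the engine uses internally.
    const [a, b] = await Promise.all([countItems(db), countItems(db)]);
    return a + b;
  }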


> >> And why then is openObjectStore an asynchronous operation
> >> that could possibly fail, since at the time when openObjectStore is
> >> called, the implementation doesn't know which rows are going to be
> >> accessed and so can't determine if a deadlock is occurring?
> >
> > The open call is used to check if some static transaction has the entire
> store locked for READ_WRITE. If so, the open call will block.
>

Nikunj, as has been explained several times, even if the call were synchronous
and the entire object store were locked, there's no reason you'd need to block
here.  You can simply defer waiting for the lock until the first call that
actually touches data (all of which are async).  Therefore there's no reason
for openObjectStore to be async.
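
Here's a minimal sketch of what I mean, purely on the implementation side;
every name in it (StoreLock, LazyObjectStore, and so on) is made up for
illustration and none of this is spec text. The handle comes back
synchronously, and the wait for the store's lock only happens inside the
first async data call:

  class StoreLock {
    private tail: Promise<void> = Promise.resolve();

    // Resolves once every earlier holder has released; the caller holds
    // the lock until it invokes the returned release() function.
    acquire(): Promise<() => void> {
      let release!: () => void;
      const held = new Promise<void>((resolve) => (release = resolve));
      const acquired = this.tail.then(() => release);
      this.tail = this.tail.then(() => held);
      return acquired;
    }
  }

  class LazyObjectStore {
    private lockPromise: Promise<() => void> | null = null;

    // The "openObjectStore" step: purely synchronous, never waits for locks.
    constructor(private lock: StoreLock, private rows: Map<string, unknown>) {}

    // The first data-touching call is where the lock is actually awaited.
    async get(key: string): Promise<unknown> {
      await this.ensureLock();
      return this.rows.get(key);
    }

    async put(key: string, value: unknown): Promise<void> {
      await this.ensureLock();
      this.rows.set(key, value);
    }

    // Releasing at commit hands the lock to the next waiting transaction.
    async commit(): Promise<void> {
      if (this.lockPromise) (await this.lockPromise)();
    }

    private async ensureLock(): Promise<void> {
      if (!this.lockPromise) this.lockPromise = this.lock.acquire();
      await this.lockPromise;
    }
  }

With something like that, a dynamic transaction that conflicts with a static
READ_WRITE transaction just sees its first get()/put() callback delayed until
that transaction finishes; nothing ever needs to block at open time.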


> >> And is it
> >> only possible to lock existing rows, or can you prevent new records
> >> from being created?
> >
> > There's no way to lock yet-to-be-created rows since, until a transaction
> ends, its effects cannot be made visible to other transactions.
>
> So if you have an objectStore with auto-incrementing indexes, there is
> the possibility that two dynamic transactions both can add a row to
> said objectStore at the same time. Both transactions would then add a
> row with the same autogenerated id (one higher than the highest id in
> the table). Upon commit, how is this conflict resolved?
>
> What if the objectStore didn't use auto-incrementing indexes, but you
> still had two separate dynamic transactions that both insert a row
> with the same key? How is the conflict resolved?
>

I believe a common trick to reconcile this is stipulating that if you add
1000 "rows", the IDs are not necessarily 1000 sequential numbers.  This
allows a transaction to increment the counter and leave it incremented even
if the transaction fails, which also means that other transactions can be
grabbing IDs of their own at the same time.  And if a transaction fails,
well, we've wasted one possible ID.
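
Rough sketch of that allocator trick (all names made up, and I'm assuming the
backend can expose a counter that sits outside transactional rollback): the
counter only ever moves forward, so two concurrent dynamic transactions can
never be handed the same key, and an aborted transaction just leaves a gap in
the sequence.

  class IdAllocator {
    private next = 1;

    // Atomically hands out the next ID; never rolled back.
    allocate(): number {
      return this.next++;
    }
  }

  class Store {
    readonly rows = new Map<number, unknown>();
    readonly ids = new IdAllocator();
  }

  class DynamicTransaction {
    private pendingWrites = new Map<number, unknown>();

    constructor(private store: Store) {}

    add(value: unknown): number {
      // The ID is consumed immediately and is not given back on abort.
      const id = this.store.ids.allocate();
      this.pendingWrites.set(id, value);
      return id;
    }

    commit(): void {
      for (const [id, value] of this.pendingWrites) {
        this.store.rows.set(id, value);
      }
      this.pendingWrites.clear();
    }

    abort(): void {
      // The buffered rows are discarded, but the allocated IDs stay
      // consumed, leaving gaps in the key sequence.
      this.pendingWrites.clear();
    }
  }

  // Two overlapping transactions never collide on a generated key:
  const store = new Store();
  const t1 = new DynamicTransaction(store);
  const t2 = new DynamicTransaction(store);
  t1.add({ name: "first" });   // gets ID 1
  t2.add({ name: "second" });  // gets ID 2, never the same as t1's
  t1.abort();                  // ID 1 is simply wasted
  t2.commit();                 // the store now holds only ID 2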


> >> And is it possible to only use read-locking for
> >> some rows, but write-locking for others, in the same objectStore?
> >
> > An implementation could use shared locks for read operations even though
> the object store might have been opened in READ_WRITE mode, and later
> upgrade the locks if the read data is being modified. However, I am not keen
> to push for this as a specced behavior.
>
> What do you mean by "an implementation could"? Is this left
> intentionally undefined and left up to the implementation? Doesn't
> that mean that there is significant risk that code could work very
> well in a conservative implementation, but often cause race conditions
> in an implementation that uses narrower locks? Wouldn't this result in
> a "race to the bottom" where implementations are forced to eventually
> use very wide locks in order to work well in websites?
>
> In general, there are a lot of details that are unclear in the dynamic
> transactions proposals. I'm also not sure if these things are unclear
> to me because they are intentionally left undefined, or if you guys
> just haven't had time yet to define the details.
>
> As the spec stands now, as an implementor I'd have no idea how to
> implement dynamic transactions. And as a user I'd have no idea what
> level of protection to expect from implementations, nor what
> strategies to use to avoid bugs.
>
> In all the development I've done, deadlocks and race conditions are
> generally unacceptable, and strategies are developed to avoid them,
> such as always grabbing locks in the same order and always grabbing
> locks when using shared data. I currently have no idea what
> strategy to recommend in IndexedDB documentation to developers to
> allow them to avoid race conditions and deadlocks.
>
> To get clarity on these questions, I'd *really* *really* like to see a
> more detailed proposal.
>

I think a detailed proposal would be a good thing (maybe from Pablo or
Nikunj, since they're the ones really pushing dynamic transactions at this
point), but at the same time, I think you're getting really bogged down in
the details, Jonas.

What we should be concerned about and speccing is the behavior the user
sees.  For example, can any operation on data fail due to transient issues
(like deadlocks or serialization failures), or will the implementation shield
web developers from this?  And will we guarantee 100% serializable semantics?
(I strongly believe we should shield developers and guarantee serializability.)
How things are implemented, the granularity of locks, or even whether an
implementation uses locks at all for dynamic transactions should be explicitly
out of scope for any spec.  After all, the behavior is all that users care
about.
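
To illustrate the kind of guarantee I mean, here's a sketch of two
overlapping read-modify-write transactions, written against the IndexedDB
call shapes browsers eventually shipped (db.transaction(scope, mode), a
made-up "counters" store using out-of-line keys) rather than the current
draft.  With 100% serializable semantics and developers shielded from
transient failures, neither increment can be lost and neither call ever
surfaces a deadlock or serialization error, however the engine achieves
that internally:

  function increment(db: IDBDatabase, key: string): Promise<void> {
    return new Promise((resolve, reject) => {
      const tx = db.transaction("counters", "readwrite");
      const store = tx.objectStore("counters");
      const read = store.get(key);
      read.onsuccess = () => {
        // Read-modify-write inside one transaction.
        const current = (read.result as number | undefined) ?? 0;
        store.put(current + 1, key);
      };
      tx.oncomplete = () => resolve();
      tx.onerror = () => reject(tx.error);
    });
  }

  async function demo(db: IDBDatabase): Promise<void> {
    // These may overlap in time; the observable result must be equivalent
    // to running them one after the other (final value 2), whether the
    // engine uses wide locks, row locks, or MVCC to get there.
    await Promise.all([increment(db, "visits"), increment(db, "visits")]);
  }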

J

Received on Saturday, 24 July 2010 15:30:16 UTC