Re: [IndexedDB] Current editor's draft from Jeremy Orlow on 2010-07-14 (public-webapps@w3.org from July to September 2010)

From: Jeremy Orlow <jorlow@chromium.org>
Date: Wed, 14 Jul 2010 13:25:36 +0100
To: Andrei Popescu <andreip@google.com>
Cc: Pablo Castro <Pablo.Castro@microsoft.com>, Nikunj Mehta <nikunj@o-micron.com>, Jonas Sicking <jonas@sicking.cc>, public-webapps <public-webapps@w3.org>
Message-ID: <AANLkTin2dm72_UVYKeOEkfpoAjGwq6EVEglQz71R6LVr@mail.gmail.com>
On Wed, Jul 14, 2010 at 1:20 PM, Andrei Popescu <andreip@google.com> wrote:

> Hi,
>
> I would like to propose that we update the current spec to reflect all
> the changes we have agreement on. We can then iteratively review and
> make edits as soon as the remaining issues are solved.  Concretely, I
> would like to check in a fix for
>
> http://www.w3.org/Bugs/Public/show_bug.cgi?id=9975
>
> with the following three exceptions which, based on the feedback in this
> thread, require more discussion:
>
> - leave in support for dynamic transactions but add a separate API for
> it, as suggested by Jonas earlier in this thread.
> - leave in the explicit transaction commit
> - leave in nested transactions
>
> The changes in 9975 have been debated for more than two months now, so
> I feel it's about time to update the specification so that it's in
> line with what we're actually discussing.
>

Agreed.  In the future I think we should never let things stay this out of
sync for this long, but I understand how this was a bit of a special case
because of the scope of the changes.  But yeah, let's make these changes and
then iterate.  And hopefully we can resolve the dynamic transaction,
explicit commit, and nested transaction issues in the near future.


> Thanks,
> Andrei
>
> On Wed, Jul 14, 2010 at 8:10 AM, Jeremy Orlow <jorlow@chromium.org> wrote:
> > On Wed, Jul 14, 2010 at 3:52 AM, Pablo Castro <Pablo.Castro@microsoft.com>
> > wrote:
> >>
> >> From: public-webapps-request@w3.org [mailto:public-webapps-request@w3.org]
> >> On Behalf Of Andrei Popescu
> >> Sent: Monday, July 12, 2010 5:23 AM
> >>
> >> Sorry I disappeared for a while. Catching up with this discussion was an
> >> interesting exercise...
> >
> > Yes, indeed.  :-)
> >
> >>
> >> there is no particular message in this thread I can respond to, so I
> >> thought I'd just reply to the last one.
> >
> > Probably a good idea.  I was trying to respond hixie style--which is harder
> > than it looks on stuff like this.
> >
> >>
> >> Overall I think the new proposal is shaping up well and is being effective
> >> in simplifying scenarios. I do have a few suggestions and questions for
> >> things I'm not sure I see all the way.
> >>
> >> READ_ONLY vs READ_WRITE as defaults for transactions:
> >> To be perfectly honest, I think this discussion went really deep over an
> >> issue that won't be a huge deal for most people. My perspective, trying to
> >> avoid performance or usage frequency speculation, is around what's easier
> >> to detect. Concurrency issues are hard to see. On the other hand, whenever
> >> we can throw an exception and give explicit guidance, that unblocks people
> >> right away. For this case I suspect it's best to default to READ_ONLY,
> >> because if someone doesn't read or think about it and just uses the stuff
> >> and tries to change something they'll get a clear error message saying "if
> >> you want to change stuff, use READ_WRITE please". The error is not data- or
> >> context-dependent, so it'll fail on first try at most once per developer,
> >> and once they fix it they'll know for all future cases.
> >
> > Couldn't have said it better myself.
> >
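A minimal sketch of the READ_ONLY-by-default argument above, in plain
JavaScript. It uses a hypothetical in-memory model (SketchStore and openStore
are made-up names for illustration, not the IndexedDB API) only to show why
the failure is easy to detect: the first write attempt fails with an explicit
message, regardless of the data involved.

```javascript
// Hypothetical in-memory model; this is NOT the IndexedDB API.
const READ_ONLY = "READ_ONLY";
const READ_WRITE = "READ_WRITE";

class SketchStore {
  constructor(records, mode) {
    this.records = records; // a Map standing in for an object store
    this.mode = mode;
  }

  get(key) {
    return this.records.get(key);
  }

  put(key, value) {
    // The check depends only on the mode, never on the data being written.
    if (this.mode !== READ_WRITE) {
      throw new Error("Transaction is READ_ONLY; use READ_WRITE to change data.");
    }
    this.records.set(key, value);
  }
}

// The mode defaults to READ_ONLY when the caller does not ask for one.
function openStore(records, mode = READ_ONLY) {
  return new SketchStore(records, mode);
}

const records = new Map([[123, "old"]]);

let errorMessage = null;
try {
  openStore(records).put(123, "new"); // forgot to ask for READ_WRITE
} catch (e) {
  errorMessage = e.message;
}

openStore(records, READ_WRITE).put(123, "new"); // the fix, once learned
```

The opposite default would fail silently: an unintended READ_WRITE scope only
shows up as reduced concurrency, which is much harder to notice.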
> >>
> >> Dynamic transactions:
> >> I see that most folks would like to see these going away. While I like the
> >> predictability and simplifications that we're able to make by using static
> >> scopes for transactions, I worry that we'll close the door for two
> >> scenarios: background tasks and query processors. Background tasks such as
> >> synchronization and post-processing of content would seem to be almost
> >> impossible with the static scope approach, mostly due to the granularity of
> >> the scope specification (whole stores). Are we okay with saying that you
> >> can't for example sync something in the background (e.g. in a worker) while
> >> your app is still working? Am I missing something that would enable this
> >> class of scenarios? Query processors are also tricky because you usually
> >> take the query specification in some form after the transaction started
> >> (especially if you want to execute multiple queries with later queries
> >> depending on the outcome of the previous ones). The background tasks issue
> >> in particular looks pretty painful to me if we don't have a way to achieve
> >> it without freezing the application while it happens.
> >
> > Well, the application should never freeze in terms of the UI locking up, but
> > in what you described I could see it taking a while for data to show up on
> > the screen.  This is something that can be fixed by doing smaller updates on
> > the background thread, sending a message to the background thread that it
> > should abort for now, doing all database access on the background thread,
> > etc.
> > One point that I never saw made in the thread that I think is really
> > important is that dynamic transactions can make concurrency worse in some
> > cases.  For example, with dynamic transactions you can get into live-lock
> > situations.  Also, using Pablo's example, you could easily get into a
> > situation where the long running transaction on the worker keeps hitting
> > serialization issues and thus it's never able to make progress.
> > I do see that there are use cases where having dynamic transactions would be
> > much nicer, but the amount of non-determinism they add (including to
> > performance) has me pretty worried.  I pretty firmly believe we should look
> > into adding them in v2 and remove them for now.  If we do leave them in, it
> > should definitely be in its own method to make it quite clear that the
> > semantics are more complex.
> >
> >>
> >> Implicit commit:
> >> Does this really work? I need to play with sample app code more, it may
> >> just be that I'm old-fashioned. For example, if I'm downloading a bunch of
> >> data from somewhere and pushing rows into the store within a transaction,
> >> wouldn't it be reasonable to do the whole thing in a transaction? In that
> >> case I'm likely to have to unwind while I wait for the next callback from
> >> XMLHttpRequest with the next chunk of data. I understand that avoiding it
> >> results in nicer patterns (e.g. db.objectStores("foo").get(123).onsuccess =
> >> ...), but in practice I'm not sure if that will hold given that you still
> >> need error callbacks and such.
> >
> > I believe your example of doing XHRs in the middle of a transaction is
> > something we were explicitly trying to avoid making possible.  In this case,
> > you should do all of your XHRs first and then do your transaction.  If you
> > need to read from the ObjectStore, do an XHR, and then write to the
> > ObjectStore, you can implement it with 2 transactions and have the second
> > one verify the data has not changed before doing the actual work.
> > Allowing things like XHRs in the middle of an operation will encourage
> > really long running transactions that will be really bad for concurrency and
> > make the transaction system much less elegant than it currently is.
> >
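The two-transaction pattern described above might look roughly like the
following sketch. It assumes a hypothetical in-memory store (a Map) in which
each record carries a version counter; readSnapshot and writeIfUnchanged are
illustrative names, not part of any proposed API. Each function stands in for
one short transaction, and the XHR would run between them, outside any
transaction.

```javascript
// Hypothetical in-memory store: key -> { value, version }.
// Each function below stands in for one short transaction.

// Transaction 1: read the record and remember which version was seen.
function readSnapshot(recordStore, key) {
  const record = recordStore.get(key);
  return record ? { value: record.value, version: record.version } : null;
}

// Transaction 2: verify nothing changed since the snapshot, then write.
// Returns false (and writes nothing) if the record changed in between.
function writeIfUnchanged(recordStore, key, snapshot, newValue) {
  const current = recordStore.get(key);
  const currentVersion = current ? current.version : 0;
  const seenVersion = snapshot ? snapshot.version : 0;
  if (currentVersion !== seenVersion) {
    return false; // caller should re-read and retry
  }
  recordStore.set(key, { value: newValue, version: currentVersion + 1 });
  return true;
}

const recordStore = new Map([["doc", { value: "v1", version: 1 }]]);

const snapshot = readSnapshot(recordStore, "doc");
// ... the XHR would happen here, with no transaction held open ...
const firstWrite = writeIfUnchanged(recordStore, "doc", snapshot, "v2");

// A second writer still holding the old snapshot is rejected.
const staleWrite = writeIfUnchanged(recordStore, "doc", snapshot, "v3");
```

The cost of this design is a retry path when the check fails; the benefit is
that no transaction stays open for the duration of a network round trip.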
> >
> >>
> >> Nested transactions:
> >> Not sure why we're considering this an advanced scenario. To be clear
> >> about what the feature means to me: make it legal to start a transaction
> >> when one is already in progress, and the nested one is effectively a no-op,
> >> just refcounts the transaction, so you need equal amounts of commit()'s,
> >> implicit or explicit, and an abort() cancels all nested transactions. The
> >> purpose of this is to allow composition, where a piece of code that needs a
> >> transaction can start one locally, independently of whether the caller
> >> already had one going.
> >
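The refcounting semantics described above could be sketched as follows. This
is only an illustration under stated assumptions: RefCountedTransaction and
saveItem are hypothetical names, and the sketch ignores scopes and locks
entirely, which is exactly where the open questions in this thread lie.

```javascript
// Hypothetical sketch of "nested transaction as a refcount".
class RefCountedTransaction {
  constructor() {
    this.depth = 0;
    this.state = "idle"; // "idle" | "active" | "committed" | "aborted"
  }

  // Starting a transaction while one is active only bumps the count.
  begin() {
    if (this.state === "aborted") {
      throw new Error("transaction was aborted");
    }
    this.depth += 1;
    this.state = "active";
  }

  // Only the commit that balances the outermost begin() really commits.
  commit() {
    if (this.state !== "active") {
      throw new Error("no active transaction");
    }
    this.depth -= 1;
    if (this.depth === 0) {
      this.state = "committed";
    }
  }

  // An abort at any depth cancels every nesting level at once.
  abort() {
    this.depth = 0;
    this.state = "aborted";
  }
}

// A helper that needs a transaction and starts one locally, whether or
// not its caller already has one going.
function saveItem(tx) {
  tx.begin();
  // ... reads and writes would go here ...
  tx.commit();
}

const tx = new RefCountedTransaction();
tx.begin();
saveItem(tx);
const stateAfterHelper = tx.state; // helper's commit did not end the outer one
tx.commit();
const finalState = tx.state;
```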
> > I believe it's actually a bit more tricky than what you said.  For example,
> > if we only support static transactions, will we require that any nested
> > transaction only request a subset of the locks the outer one took?  What if
> > we try to start a dynamic transaction inside of a static one?  Etc.  But I
> > agree it's not _that_ tricky and I'm also not convinced it's an "advanced"
> > feature.
> > I'd suggest we take it out for now and look at re-adding it when the basics
> > of the async API are more solidified.  I hope we can get it into v1, but we
> > have too much in the air right now as is.
> >
> >> Schema versioning:
> >> It's unfortunate that we need to have explicit elements in the page for
> >> the versioning protocol to work, but the fact that we can have a reliable
> >> mechanism for pages to coordinate a version bump is really nice. For folks
> >> that don't know about this the first time they build it, an explicit error
> >> message on the schema change timeout can explain where to start. I do think
> >> that there may be a need for non-breaking changes to the schema to happen
> >> without a "version dance". For example, query processors regularly create
> >> temporary tables during sorts and such. Those shouldn't require any
> >> coordination (maybe we allow non-versioned additions, or we just introduce
> >> temporary, unnamed tables that evaporate on commit() or database
> >> close()...).
> >
> > I agree we should have a way to do non-breaking changes to the schema at some
> > point, but I believe it can wait till v2 at this point.  Temporary
> > objectStores seem to be the leading reason why people want this now, so
> > maybe we should consider adding them to the spec now.  That said, I'm still
> > not convinced that there are many use cases where one needs them.
> > Everything you can do with a temporary objectStore you should be able to do
> > in memory as well.  And thus the only reason to add them is if we're handling
> > enough data that some will spill to disk.  And I'm not convinced this will
> > be a very mainstream scenario.  Especially since one should be able to do
> > merge joins in many cases.
> > I feel strongly that what Jonas has proposed is what we should do for v1.  I
> > think he's explained the reasoning behind the API pretty well in the thread.
> >
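On the merge-join point above, a minimal sketch of why no temporary store is
needed when both inputs are already in key order. The two arrays stand in for
two cursors iterated in key order, and the sketch assumes unique keys on each
side; mergeJoin and the sample data are hypothetical, made up for illustration.

```javascript
// Joins two arrays of { key, value } records. Both inputs must already be
// sorted by key and (in this sketch) have unique keys. Only the current
// position in each input is held, so nothing needs to spill to disk.
function mergeJoin(left, right) {
  const joined = [];
  let i = 0;
  let j = 0;
  while (i < left.length && j < right.length) {
    if (left[i].key < right[j].key) {
      i += 1; // left is behind, advance it
    } else if (left[i].key > right[j].key) {
      j += 1; // right is behind, advance it
    } else {
      joined.push({
        key: left[i].key,
        left: left[i].value,
        right: right[j].value,
      });
      i += 1;
      j += 1;
    }
  }
  return joined;
}

const authors = [
  { key: 1, value: "Ann" },
  { key: 2, value: "Bob" },
  { key: 4, value: "Cy" },
];
const books = [
  { key: 2, value: "B-book" },
  { key: 3, value: "X-book" },
  { key: 4, value: "C-book" },
];
const joinedRows = mergeJoin(authors, books);
```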
> > Other points:
> > *_NO_DUPLICATES:
> > I'm still not convinced we need this in v1.  It will help performance in
> > some cases, but it adds more API surface area than immediately meets the
> > eye.  If we do decide to have it in v1, we need to resolve the issues
> Jonas
> > brought up.  Ideally we would do this on the thread Jonas started
> > ("[IndexedDB] .value of no-duplicate cursors").
> > Pre-loaded cursors + getAll:
> > I'm glad we've decided to take these out for the time being.
> > J
>
Received on Wednesday, 14 July 2010 12:26:26 UTC