- From: Jeremy Orlow <jorlow@chromium.org>
- Date: Wed, 14 Jul 2010 17:20:59 +0100
- To: Jonas Sicking <jonas@sicking.cc>
- Cc: Webapps WG <public-webapps@w3.org>
- Message-ID: <AANLkTiksbTcUAgkdaR0i4xOMI37IjBP947kjoK3Fh4cS@mail.gmail.com>
On Wed, Jul 14, 2010 at 5:15 PM, Jonas Sicking <jonas@sicking.cc> wrote:
> On Wed, Jul 14, 2010 at 4:16 AM, Jeremy Orlow <jorlow@chromium.org> wrote:
> > On Wed, Jul 7, 2010 at 11:54 PM, Jonas Sicking <jonas@sicking.cc> wrote:
> >>
> >> On Thu, Jun 24, 2010 at 4:40 AM, Jeremy Orlow <jorlow@chromium.org> wrote:
> >> > On Sat, Jun 19, 2010 at 9:12 AM, Jonas Sicking <jonas@sicking.cc> wrote:
> >> >>
> >> >> On Fri, Jun 18, 2010 at 7:46 PM, Jeremy Orlow <jorlow@chromium.org> wrote:
> >> >> > On Fri, Jun 18, 2010 at 7:24 PM, Jonas Sicking <jonas@sicking.cc> wrote:
> >> >> >>
> >> >> >> On Fri, Jun 18, 2010 at 7:01 PM, Jeremy Orlow <jorlow@chromium.org> wrote:
> >> >> >> > I think determinism is most important for the reasons you
> >> >> >> > cited. I think advanced, performance concerned apps could
> >> >> >> > deal with either semantics you mentioned, so the key would
> >> >> >> > be to pick whatever is best for the normal case.
> >> >> >> > I'm leaning towards thinking firing in order is the best way
> >> >> >> > to go because it's the most intuitive/easiest to understand,
> >> >> >> > but I don't feel strongly about anything other than being
> >> >> >> > deterministic.
> >> >> >>
> >> >> >> I definitely agree that firing in request order is the
> >> >> >> simplest, both from an implementation and usage point of view.
> >> >> >> However my concern is that we'd lose most of the performance
> >> >> >> benefits that cursors provide if we use that solution.
> >> >> >>
> >> >> >> What do you mean with "apps could deal with either semantics"?
> >> >> >> You mean that they could deal with the cursor case by simply
> >> >> >> being slower, or do you mean that they could work around the
> >> >> >> performance hit somehow?
> >> >> >
> >> >> > Hm. I was thinking they could save the value, call continue,
> >> >> > then do work on it, but that'd of course only defer the slowdown
> >> >> > for one iteration. So I guess they'd have to store up a bunch of
> >> >> > data and then make calls on it.
> >> >>
> >> >> Indeed which could be bad for memory footprint.
> >> >>
> >> >> > Of course, they'll run into all of these same issues with the
> >> >> > sync API since things are of course done in order. So maybe
> >> >> > trying to optimize this specific case for just the async API is
> >> >> > silly?
> >> >>
> >> >> I honestly haven't looked at the sync API. But yes, I assume that
> >> >> it will in general have to serialize all calls into the database
> >> >> and thus generally not be as performant. I don't think that is a
> >> >> good reason to make the async API slower too though.
> >> >>
> >> >> But it's entirely possible that I'm overly concerned about cursor
> >> >> performance in general though. I won't argue too strongly that we
> >> >> need to prioritize cursor callback events until I've seen some
> >> >> numbers. If we want to simply define that callbacks fire in
> >> >> request order for now then that is fine with me.
> >> >
> >> > Yeah, I think we should get some hard numbers and think carefully
> >> > about this before we make things even more complicated/nuanced.
> >>
> >> I ran some tests. Note that the test implementation is an
> >> approximation. It's both somewhat optimistic in that it doesn't make
> >> the extra effort to ensure that cursor callbacks always run before
> >> other callbacks. But it's also somewhat pessimistic in that it always
> >> returns to the main event loop, even though that is often not needed.
> >> My guess is that in the end it's a pretty close approximation
> >> performance wise.
> >>
> >> I've attached the testcase I used in case anyone want to play around
> >> with it. It contains a fair amount of mozilla specific features
> >> (generators are awesome for asynchronous callbacks) as well as is
> >> written to the IndexedDB API that we currently have implemented, but
> >> it should be portable to other browsers.
> >>
> >> For the currently proposed solution, of always running requests in the
> >> order they are made, including requests coming from cursor.continue(),
> >> gives the following results:
> >>
> >> Plain iteration over 10000 entries using cursor: 2400ms
> >> Iteration over 10000 entries using cursor, performing a join by for
> >> each iteration call getAll on an index: 5400ms
> >>
> >> For the proposed solution of prioritizing cursor.continue() callbacks
> >> over other callbacks:
> >>
> >> Plain iteration over 10000 entries using cursor: 1050ms
> >> Iteration over 10000 entries using cursor, performing a join by for
> >> each iteration call getAll on an index: 1280ms
> >>
> >> The reason that just plain iteration got faster is that we implemented
> >> the strict ordering by sending all requests to the thread the database
> >> runs on, and then having the database thread process all requests in
> >> order and send them back to the requesting thread. So for plain
> >> iteration it basically just means a roundtrip to the indexedDB thread
> >> and back.
> >>
> >> Based on these numbers, I think we should prioritize
> >> IDBCursor.continue() callbacks as for join example this results in a
> >> over 4x speedup.
> >
> > I would like to note that this speedup is on one particular
> > implementation which isn't particularly optimized. Nevertheless, that
> > is a pretty substantial difference in run times. But yet it just pains
> > me to think of special casing the order of execution for just cursors.
> > Especially when we're still trying to nail down the very basics of the
> > async API.
> > I would prefer to open a bug and leave this on the backburner for a
> > while (like other features like nested transactions). When we do look
> > at this, we may want to consider making it an option to run in this
> > mode rather than being the default. Is that OK with you? If so, we can
> > open a bug to track this but mention in the bug that we're going to
> > hold off for a bit.
> > My biggest take away from all of this is that generators seem cool. :-)
>
> If you're concerned that this speedup only applies to one particular
> implementation, I'd encourage you to get numbers from other
> implementations ;-)
>
> There is reason to believe that the speedup could be even bigger in a
> multi-process implementation such as the one I imagine that chrome
> requires, since you're serializing cross-process calls rather than the
> cross-thread calls that Firefox is using.
>
> I'd rather not leave this indefinitely open on the backburner as it's
> something that we need to decide one way or another. But if you need
> time to research performance effects then that is of course ok.
>

My entire concern at this point is too much up in the air at once in the
spec and creating complex behaviors that aren't intuitive to developers. I
hope the former will go away in the next couple of weeks (depends on how
fast we can come to decisions and implement them in the spec). The latter I
don't have a good answer for. Time to research perf effects is not one of
my concerns. (I think you're right that we'll see even more of a chance due
to multi-process latency.)

J
Received on Wednesday, 14 July 2010 16:21:50 UTC
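
Illustrative note (not part of the original message): the two callback orderings debated in this thread can be compared with a toy model. The sketch below is an assumption-laden simulation in Python, not IndexedDB code and not any browser's implementation. It assumes a single pending-request queue, and the names `fire_order`, `cursor_steps`, and `other_requests` are hypothetical. It only shows which callbacks fire in which order under each policy; it says nothing about timing.

```python
from collections import deque


def fire_order(prioritize_continue, cursor_steps, other_requests):
    """Return the order in which callbacks fire in a toy single-queue model.

    prioritize_continue=False: every request, including the one issued by
    cursor.continue(), goes to the back of the queue (strict request order).
    prioritize_continue=True: the request issued by cursor.continue() jumps
    ahead of the other pending requests.
    """
    queue = deque()
    queue.append(("cursor", 0))  # the cursor is opened first
    for name in other_requests:  # then unrelated requests are made
        queue.append(("other", name))

    fired = []
    while queue:
        kind, payload = queue.popleft()
        if kind == "other":
            fired.append(payload)
            continue
        fired.append("cursor[%d]" % payload)
        next_step = payload + 1
        if next_step < cursor_steps:
            # The cursor callback calls continue(), creating a new request.
            if prioritize_continue:
                queue.appendleft(("cursor", next_step))
            else:
                queue.append(("cursor", next_step))
    return fired


if __name__ == "__main__":
    pending = ["get(a)", "get(b)"]
    print(fire_order(False, 3, pending))
    print(fire_order(True, 3, pending))
```

Both policies are deterministic in this model. The difference is that under strict request order each cursor step waits behind whatever else is already pending, while under prioritization the cursor runs to completion before the other callbacks fire.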