Re: [IndexedDB] Callback order from Jonas Sicking on 2010-07-07 (public-webapps@w3.org from July to September 2010)

From: Jonas Sicking <jonas@sicking.cc>
Date: Wed, 7 Jul 2010 15:54:03 -0700
To: Jeremy Orlow <jorlow@chromium.org>
Cc: Webapps WG <public-webapps@w3.org>
Message-ID: <AANLkTimVcN0-W_D6CRfxOFRVQY7uN5c_z6w2HO9xh9UJ@mail.gmail.com>
On Thu, Jun 24, 2010 at 4:40 AM, Jeremy Orlow <jorlow@chromium.org> wrote:
> On Sat, Jun 19, 2010 at 9:12 AM, Jonas Sicking <jonas@sicking.cc> wrote:
>>
>> On Fri, Jun 18, 2010 at 7:46 PM, Jeremy Orlow <jorlow@chromium.org> wrote:
>> > On Fri, Jun 18, 2010 at 7:24 PM, Jonas Sicking <jonas@sicking.cc> wrote:
>> >>
>> >> On Fri, Jun 18, 2010 at 7:01 PM, Jeremy Orlow <jorlow@chromium.org>
>> >> wrote:
>> >> > I think determinism is most important for the reasons you cited.  I
>> >> > think
>> >> > advanced, performance concerned apps could deal with either semantics
>> >> > you
>> >> > mentioned, so the key would be to pick whatever is best for the
>> >> > normal
>> >> > case.
>> >> >  I'm leaning towards thinking firing in order is the best way to go
>> >> > because
>> >> > it's the most intuitive/easiest to understand, but I don't feel
>> >> > strongly
>> >> > about anything other than being deterministic.
>> >>
>> >> I definitely agree that firing in request order is the simplest, both
>> >> from an implementation and usage point of view. However my concern is
>> >> that we'd lose most of the performance benefits that cursors provide
>> >> if we use that solution.
>> >>
>> >> What do you mean with "apps could deal with either semantics"? You
>> >> mean that they could deal with the cursor case by simply being slower,
>> >> or do you mean that they could work around the performance hit
>> >> somehow?
>> >
>> > Hm.  I was thinking they could save the value, call continue, then do
>> > work
>> > on it, but that'd of course only defer the slowdown for one iteration.
>> >  So I
>> > guess they'd have to store up a bunch of data and then make calls on it.
>>
>> Indeed which could be bad for memory footprint.
>>
>> > Of course, they'll run into all of these same issues with the sync API
>> > since
>> > things are of course done in order.  So maybe trying to optimize this
>> > specific case for just the async API is silly?
>>
>> I honestly haven't looked at the sync API. But yes, I assume that it
>> will in general have to serialize all calls into the database and thus
>> generally not be as performant. I don't think that is a good reason to
>> make the async API slower too though.
>>
>> But it's entirely possible that I'm overly concerned about cursor
>> performance in general though. I won't argue too strongly that we need
>> to prioritize cursor callback events until I've seen some numbers. If
>> we want to simply define that callbacks fire in request order for now
>> then that is fine with me.
>
> Yeah, I think we should get some hard numbers and think carefully about this
> before we make things even more complicated/nuanced.

I ran some tests. Note that the test implementation is an
approximation. It's somewhat optimistic in that it doesn't make
the extra effort to ensure that cursor callbacks always run before
other callbacks, but it's also somewhat pessimistic in that it always
returns to the main event loop, even though that is often not needed.
My guess is that in the end it's a pretty close approximation
performance-wise.

I've attached the testcase I used in case anyone wants to play around
with it. It uses a fair number of Mozilla-specific features
(generators are awesome for asynchronous callbacks) and is written
to the IndexedDB API that we currently have implemented, but it
should be portable to other browsers.

The currently proposed solution, of always running requests in the
order they are made, including requests coming from cursor.continue(),
gives the following results:

Plain iteration over 10000 entries using cursor: 2400ms
Iteration over 10000 entries using cursor, performing a join by
calling getAll on an index for each iteration: 5400ms

For the proposed solution of prioritizing cursor.continue() callbacks
over other callbacks:

Plain iteration over 10000 entries using cursor: 1050ms
Iteration over 10000 entries using cursor, performing a join by
calling getAll on an index for each iteration: 1280ms

The reason that just plain iteration got faster is that we implemented
the strict ordering by sending all requests to the thread the database
runs on, and then having the database thread process all requests in
order and send them back to the requesting thread. So for plain
iteration, strict ordering basically just means a roundtrip to the
IndexedDB thread and back.
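
To make the two orderings concrete, here is a minimal TypeScript
sketch of how pending callbacks might be ordered under each policy.
It is an illustrative model only, not the test implementation
described above: the PendingRequest type and the callbackOrder
function are hypothetical names, and treating prioritization as a
stable partition of the pending queue is an assumption.

```typescript
// Hypothetical model of a pending request; not part of the IndexedDB API.
type PendingRequest = { id: number; kind: "continue" | "other" };

// Returns the request ids in the order their callbacks would fire.
// Assumption: prioritizing means cursor.continue() callbacks move ahead of
// all other pending callbacks, keeping relative order within each group.
function callbackOrder(pending: PendingRequest[], prioritizeCursor: boolean): number[] {
  if (!prioritizeCursor) {
    // Strict request order: fire callbacks exactly as the requests were made.
    return pending.map((r) => r.id);
  }
  const cursorSteps = pending.filter((r) => r.kind === "continue");
  const others = pending.filter((r) => r.kind === "other");
  return cursorSteps.concat(others).map((r) => r.id);
}

// One join iteration: a getAll lookup is issued, then the cursor is advanced.
const joinStep: PendingRequest[] = [
  { id: 1, kind: "other" },    // getAll on the index
  { id: 2, kind: "continue" }, // cursor.continue()
];

const strictOrder = callbackOrder(joinStep, false);  // [1, 2]
const priorityOrder = callbackOrder(joinStep, true); // [2, 1]
```

Under this model, the cursor step in the join case no longer waits
behind the getAll callback issued in the same iteration.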

Based on these numbers, I think we should prioritize
IDBCursor.continue() callbacks, as for the join example this results
in an over 4x speedup.
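
As a quick check of the arithmetic behind that figure, the speedups
can be computed directly from the timings reported above. This is a
small sketch; the names timings and speedup are illustrative and the
numbers are copied from the measurements in this message.

```typescript
// Timings in milliseconds, copied from the measurements reported above.
const timings = {
  plain: { requestOrder: 2400, cursorPriority: 1050 },
  join: { requestOrder: 5400, cursorPriority: 1280 },
};

// Speedup = time under strict request order / time with cursor callbacks prioritized.
function speedup(t: { requestOrder: number; cursorPriority: number }): number {
  return t.requestOrder / t.cursorPriority;
}

const plainSpeedup = speedup(timings.plain); // about 2.29
const joinSpeedup = speedup(timings.join);   // about 4.22
```

So plain iteration is roughly 2.3x faster and the join case roughly
4.2x faster when cursor callbacks are prioritized.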

/ Jonas
Attachments

text/html attachment: test.html
Received on Wednesday, 7 July 2010 22:54:55 UTC