Re: [IndexedDB] IDBCursor.update for cursors returned from IDBIndex.openCursor from Jonas Sicking on 2010-09-17 (public-webapps@w3.org from July to September 2010)

From: Jonas Sicking <jonas@sicking.cc>
Date: Thu, 16 Sep 2010 17:06:58 -0700
To: Jeremy Orlow <jorlow@chromium.org>
Cc: public-webapps WG <public-webapps@w3.org>
Message-ID: <AANLkTinL8WdCHwAM1YzAaUNVPOzjneFJ-Zhy3mg6-9NG@mail.gmail.com>
On Thu, Sep 16, 2010 at 2:23 PM, Jeremy Orlow <jorlow@chromium.org> wrote:
> On Thu, Sep 16, 2010 at 8:53 PM, Jonas Sicking <jonas@sicking.cc> wrote:
>>
>> On Thu, Sep 16, 2010 at 2:15 AM, Jeremy Orlow <jorlow@chromium.org> wrote:
>> > Wait a sec.  What are the use cases for non-object cursors anyway?  They
>> > made perfect sense back when we allowed explicit index management, but now
>> > they kind of seem like a premature optimization or possibly even dead
>> > weight.  Maybe we should just remove them altogether?
>>
>> They are still useful for joins. Consider an objectStore "employees":
>>
>> { id: 1, name: "Sven", employed: "1-1-2010" }
>> { id: 2, name: "Bert", employed: "5-1-2009" }
>> { id: 3, name: "Adam", employed: "6-6-2008" }
>>
>> And an objectStore "sales":
>>
>> { seller: 1, candyName: "lollipop", quantity: 5, date: "9-15-2010" }
>> { seller: 1, candyName: "swedish fish", quantity: 12, date: "9-15-2010" }
>> { seller: 2, candyName: "jelly belly", quantity: 3, date: "9-14-2010" }
>> { seller: 3, candyName: "heath bar", quantity: 3, date: "9-13-2010" }
>>
>> If you want to display the amount of sales per person, sorted by the
>> salesperson's name, you could do this by first creating an index for
>> "employees" with keyPath "name". You'd then use IDBIndex.openCursor to
>> iterate that index, and for each entry find all entries in the "sales"
>> objectStore where "seller" matches the cursor's .value.
>>
>> So in this case you don't actually need any data from the "employees"
>> objectStore, all the data is available in the index. Thus it is
>> sufficient, and faster, to use openCursor than openObjectCursor.
>>
>> In general, it's a common optimization to stick enough data in an
>> index that you don't have to actually look up in the objectStore
>> itself. This is possible somewhat less often for now, since our indexes
>> are relatively simple so far. But it is still doable, as the example
>> above shows.
>> Once we add support for arrays as keys this will be much more common
>> as you can then stick arbitrary data into the index by simply adding
>> additional entries to all key arrays. And even more so once we
>> (probably in a future version) add support for computed indexes.
>
>
> On Thu, Sep 16, 2010 at 8:57 PM, Jonas Sicking <jonas@sicking.cc> wrote:
>>
>> On Thu, Sep 16, 2010 at 4:08 AM, Jeremy Orlow <jorlow@chromium.org> wrote:
>> > Actually, for that matter, are remove and update needed at all?  I think
>> > they may just be more cruft left over from the explicit index days.  As far
>> > as I can tell, any .delete or .remove should be doable via an objectCursor +
>> > .puts/.removes on the objectStore.
>>
>> They are not strictly needed, but they are a decent convenience
>> feature, and with a proper implementation they can even be a
>> performance optimization. With a cursor iterating a b-tree you can let
>> the cursor keep a pointer to the b-tree entry. That way .delete and
>> .update don't have to do a b-tree lookup at all.
>>
>> We're currently not able to do this since our backend (sqlite) doesn't
>> have good enough cursor support, but I suspect that this will change
>> at some point in the future. In the meantime it seems like a good
>> thing to allow people to use an API that will be faster in the future.
>
> All your arguments revolve around what the spec and implementations might do
> in the future.

I disagree. The IDBIndex.openCursor example I included uses only
existing API, and is a performance improvement in at least our current
implementation. Would be interested to hear if it's not a performance
improvement in others.
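
To make the join from the quoted example concrete, here is a minimal sketch in plain JavaScript. It does not call the real IndexedDB API: the two object stores are ordinary arrays, and `buildIndex` and `salesPerPerson` are hypothetical helper names used only for illustration. The one assumption carried over from the discussion is that each index entry exposes the index key as `.key` and the primary key of the referenced record as `.value`, so the "employees" records themselves are never read.

```javascript
// In-memory stand-ins for the two object stores from the example.
const employees = [
  { id: 1, name: "Sven", employed: "1-1-2010" },
  { id: 2, name: "Bert", employed: "5-1-2009" },
  { id: 3, name: "Adam", employed: "6-6-2008" },
];
const sales = [
  { seller: 1, candyName: "lollipop", quantity: 5, date: "9-15-2010" },
  { seller: 1, candyName: "swedish fish", quantity: 12, date: "9-15-2010" },
  { seller: 2, candyName: "jelly belly", quantity: 3, date: "9-14-2010" },
  { seller: 3, candyName: "heath bar", quantity: 3, date: "9-13-2010" },
];

// Hypothetical model of an index: entries sorted by index key, each holding
// only the index key and the primary key of the record it points to.
function buildIndex(records, keyPath, primaryKeyPath) {
  return records
    .map((r) => ({ key: r[keyPath], value: r[primaryKeyPath] }))
    .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

// Walk the index in key order (what a plain index cursor would do) and, for
// each entry, total the sales whose "seller" equals the entry's .value.
function salesPerPerson(nameIndex, saleRecords) {
  const report = [];
  for (const entry of nameIndex) {
    let total = 0;
    for (const sale of saleRecords) {
      if (sale.seller === entry.value) total += sale.quantity;
    }
    report.push({ name: entry.key, quantity: total });
  }
  return report;
}

const nameIndex = buildIndex(employees, "name", "id");
const report = salesPerPerson(nameIndex, sales);
console.log(report);
```

The inner loop scans every sale for simplicity; a real implementation would more likely look sales up through an index on "seller".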

> Typically we add API surface area only for use cases that
> are currently impossible to satisfy or proven performance bottlenecks. I
> agree that it's likely implementations will want to do optimizations like
> this in the future, but until they do, it'll be hard to really understand
> the implications and complications that might arise.

That's not entirely true. All the databases I have worked with have
had significant performance degradations when having to look up the
main table contents rather than simply looking at the contents in the
index. I doubt that we'll be able to create a backend where that is
not true. So I think we should assume that object cursors are slower
than plain cursors.
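
The extra work an object cursor does can be sketched with a small in-memory model. This is an illustration under stated assumptions, not a description of any real backend: `CountingStore`, `indexOn`, `openCursor` and `openObjectCursor` are hypothetical stand-ins written for this example, and the model only counts main-store lookups rather than measuring time.

```javascript
// Hypothetical main table: a Map from primary key to record that counts
// how many times a record is fetched from it.
class CountingStore {
  constructor(records, primaryKeyPath) {
    this.rows = new Map(records.map((r) => [r[primaryKeyPath], r]));
    this.lookups = 0;
  }
  get(primaryKey) {
    this.lookups += 1;
    return this.rows.get(primaryKey);
  }
}

// Hypothetical index: sorted { key, value } entries, value = primary key.
function indexOn(store, keyPath) {
  const entries = [];
  for (const [primaryKey, record] of store.rows) {
    entries.push({ key: record[keyPath], value: primaryKey });
  }
  return entries.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

// Plain cursor: yields what is already in the index, no store access.
function* openCursor(index) {
  for (const entry of index) yield { key: entry.key, value: entry.value };
}

// Object cursor: same iteration, plus one main-store lookup per entry.
function* openObjectCursor(index, store) {
  for (const entry of index) yield { key: entry.key, value: store.get(entry.value) };
}

const staff = new CountingStore(
  [
    { id: 1, name: "Sven" },
    { id: 2, name: "Bert" },
    { id: 3, name: "Adam" },
  ],
  "id"
);
const byName = indexOn(staff, "name");

const ids = [...openCursor(byName)].map((c) => c.value);
const plainLookups = staff.lookups;

const names = [...openObjectCursor(byName, staff)].map((c) => c.value.name);
const objectLookups = staff.lookups - plainLookups;

console.log({ ids, plainLookups, names, objectLookups });
```

In this model the plain cursor never touches the store, while the object cursor performs one lookup per index entry; how expensive each lookup is depends entirely on the backend.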

Further, I think we should get users on APIs that we are likely to
implement with higher performance. For example, I think sqlite
doesn't support having multiple write transactions to the same
database, even if those are to different tables. Thus the whole API of
specifying which objectStores you want to include in a transaction is
purely for future optimizations, at least in implementations backed by
sqlite.

I especially think these APIs are worth it given that they are low cost
to implement and add convenience value for users even if implementations
aren't faster yet.

/ Jonas
Received on Friday, 17 September 2010 00:07:53 UTC