Re: [IndexedDB] IDBCursor.update for cursors returned from IDBIndex.openCursor from Jeremy Orlow on 2010-09-17 (public-webapps@w3.org from July to September 2010)

From: Jeremy Orlow <jorlow@chromium.org>
Date: Fri, 17 Sep 2010 10:46:26 +0100
To: Jonas Sicking <jonas@sicking.cc>
Cc: public-webapps WG <public-webapps@w3.org>
Message-ID: <AANLkTi=3dFRtFjFmgWwOkNxPKXN_TQ8=kxJ2Bjqb-UFN@mail.gmail.com>
On Fri, Sep 17, 2010 at 1:06 AM, Jonas Sicking <jonas@sicking.cc> wrote:

> On Thu, Sep 16, 2010 at 2:23 PM, Jeremy Orlow <jorlow@chromium.org> wrote:
> > On Thu, Sep 16, 2010 at 8:53 PM, Jonas Sicking <jonas@sicking.cc> wrote:
> >>
> >> On Thu, Sep 16, 2010 at 2:15 AM, Jeremy Orlow <jorlow@chromium.org> wrote:
> >> > Wait a sec.  What are the use cases for non-object cursors anyway?  They
> >> > made perfect sense back when we allowed explicit index management, but now
> >> > they kind of seem like a premature optimization or possibly even dead
> >> > weight.  Maybe we should just remove them altogether?
> >>
> >> They are still useful for joins. Consider an objectStore "employees":
> >>
> >> { id: 1, name: "Sven", employed: "1-1-2010" }
> >> { id: 2, name: "Bert", employed: "5-1-2009" }
> >> { id: 3, name: "Adam", employed: "6-6-2008" }
> >> And objectStore "sales"
> >>
> >> { seller: 1, candyName: "lollipop", quantity: 5, date: "9-15-2010" }
> >> { seller: 1, candyName: "swedish fish", quantity: 12, date: "9-15-2010" }
> >> { seller: 2, candyName: "jelly belly", quantity: 3, date: "9-14-2010" }
> >> { seller: 3, candyName: "heath bar", quantity: 3, date: "9-13-2010" }
> >> If you want to display the amount of sales per person, sorted by the
> >> salesperson's name, you could do this by first creating an index for
> >> "employees" with keyPath "name". You'd then use IDBIndex.openCursor to
> >> iterate that index, and for each entry find all entries in the "sales"
> >> objectStore where "seller" matches the cursor's .value.
> >>
> >> So in this case you don't actually need any data from the "employees"
> >> objectStore, all the data is available in the index. Thus it is
> >> sufficient, and faster, to use openCursor than openObjectCursor.
> >>
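For illustration, the join described above can be sketched with plain in-memory data. This is only a rough model, not the IndexedDB API: `buildIndex` and `salesPerPerson` are hypothetical helpers, and the index is assumed to hold nothing but the index key (the name) and the primary key (the id), which is what a non-object cursor would expose as .key and .value.

```javascript
// In-memory stand-ins for the two object stores from the example.
const employees = [
  { id: 1, name: "Sven", employed: "1-1-2010" },
  { id: 2, name: "Bert", employed: "5-1-2009" },
  { id: 3, name: "Adam", employed: "6-6-2008" },
];
const sales = [
  { seller: 1, candyName: "lollipop", quantity: 5, date: "9-15-2010" },
  { seller: 1, candyName: "swedish fish", quantity: 12, date: "9-15-2010" },
  { seller: 2, candyName: "jelly belly", quantity: 3, date: "9-14-2010" },
  { seller: 3, candyName: "heath bar", quantity: 3, date: "9-13-2010" },
];

// Hypothetical model of an index: each entry holds only the index key and
// the primary key of the record it points at, sorted by index key.
function buildIndex(records, keyPath, primaryKeyPath) {
  return records
    .map((r) => ({ key: r[keyPath], value: r[primaryKeyPath] }))
    .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

// Walk the index in name order and total the matching sales.
// The "employees" records themselves are never read.
function salesPerPerson(nameIndex, salesRecords) {
  const report = [];
  for (const entry of nameIndex) {
    let total = 0;
    for (const sale of salesRecords) {
      if (sale.seller === entry.value) total += sale.quantity;
    }
    report.push({ name: entry.key, quantity: total });
  }
  return report;
}

const nameIndex = buildIndex(employees, "name", "id");
const report = salesPerPerson(nameIndex, sales);
console.log(report);
```

Everything the report needs (the name to sort by and the id to match against "seller") comes from the index entries, which is the point of the plain cursor in this example.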
> >> In general, it's a common optimization to stick enough data in an
> >> index that you don't have to actually look up in the objectStore
> >> itself. This is slightly less commonly doable since we have relatively
> >> simple indexes so far. But still doable as the example above shows.
> >> Once we add support for arrays as keys this will be much more common
> >> as you can then stick arbitrary data into the index by simply adding
> >> additional entries to all key arrays. And even more so once we
> >> (probably in a future version) add support for computed indexes.
> >
> >
> > On Thu, Sep 16, 2010 at 8:57 PM, Jonas Sicking <jonas@sicking.cc> wrote:
> >>
> >> On Thu, Sep 16, 2010 at 4:08 AM, Jeremy Orlow <jorlow@chromium.org> wrote:
> >> > Actually, for that matter, are remove and update needed at all?  I think
> >> > they may just be more cruft left over from the explicit index days.  As
> >> > far as I can tell, any .delete or .remove should be doable via an
> >> > objectCursor + .puts/.removes on the objectStore.
> >>
> >> They are not strictly needed, but they are a decent convenience
> >> feature, and with a proper implementation they can even be a
> >> performance optimization. With a cursor iterating a b-tree you can let
> >> the cursor keep a pointer to the b-tree entry. That way .delete and
> >> .update don't have to do a b-tree lookup at all.
> >>
> >> We're currently not able to do this since our backend (sqlite) doesn't
> >> have good enough cursor support, but I suspect that this will change
> >> at some point in the future. In the meantime it seems like a good
> >> thing to allow people to use an API that will be faster in the future.
> >
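A minimal sketch of the lookup-saving argument quoted above, under loud assumptions: `SortedStore` is a hypothetical stand-in that uses a sorted array with binary search in place of a real b-tree, and it is not how any actual backend works. It only shows why a cursor that keeps a reference to its current entry can update in place, while a store-level put has to search for the key first.

```javascript
// Hypothetical stand-in for a b-tree backed object store: a sorted array
// searched with binary search. `lookups` counts how many searches were done.
class SortedStore {
  constructor(records) {
    this.entries = records
      .map((r) => ({ key: r.key, value: r.value }))
      .sort((a, b) => a.key - b.key);
    this.lookups = 0;
  }

  // Binary search for `key`; returns the entry's position or -1.
  find(key) {
    this.lookups += 1;
    let lo = 0;
    let hi = this.entries.length - 1;
    while (lo <= hi) {
      const mid = (lo + hi) >> 1;
      const k = this.entries[mid].key;
      if (k === key) return mid;
      if (k < key) lo = mid + 1;
      else hi = mid - 1;
    }
    return -1;
  }

  // Store-level update of an existing record: always pays for a search.
  put(key, value) {
    const i = this.find(key);
    if (i < 0) throw new Error("sketch only handles existing keys");
    this.entries[i].value = value;
  }

  // Cursor that keeps a direct reference to the current entry, so
  // update() writes in place without searching.
  *openCursor() {
    for (const entry of this.entries) {
      yield {
        key: entry.key,
        value: entry.value,
        update: (value) => {
          entry.value = value;
        },
      };
    }
  }
}

const seed = [
  { key: 1, value: 10 },
  { key: 2, value: 20 },
  { key: 3, value: 30 },
];

// Variant 1: update through the cursor.
const viaCursor = new SortedStore(seed);
for (const cursor of viaCursor.openCursor()) {
  cursor.update(cursor.value + 1);
}

// Variant 2: update through the store while iterating.
const viaPut = new SortedStore(seed);
for (const cursor of viaPut.openCursor()) {
  viaPut.put(cursor.key, cursor.value + 1);
}

console.log(viaCursor.lookups, viaPut.lookups);
```

Both variants end with the same stored values; the only difference in this model is the number of searches, one per record for the put variant and none for the cursor variant.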
> > All your arguments revolve around what the spec and implementations might do
> > in the future.
>
> I disagree. The IDBIndex.openCursor example I included uses only
> existing API, and is a performance improvement in at least our current
> implementation. Would be interested to hear if it's not a performance
> improvement in others.
>

It's not in ours because we join to the ObjectStore's data table either way.
 But that's not at all why I'm bringing this up.


> > Typically we add API surface area only for use cases that
> > are currently impossible to satisfy or proven performance bottlenecks. I
> > agree that it's likely implementations will want to do optimizations like
> > this in the future, but until they do, it'll be hard to really understand
> > the implications and complications that might arise.
>
> That's not entirely true. All the databases I have worked with have
> had significant performance degradations when having to look up the
> main table contents rather than simply looking at the contents in the
> index. I doubt that we'll be able to create a backend where that is
> not true. So I think we should assume that object cursors are slower
> than plain cursors.
>

I agree this is true.

> Further, I think we should get users on APIs that we are likely to
> implement with a higher performance. For example, I think sqlite
> doesn't support having multiple write transactions to the same
> database, even if those are to different tables.


FWIW: The work around for this is putting each object store in its own
database.


> Thus the whole API of
> specifying which objectStores you want to include in a transaction is
> purely for future optimizations in at least implementations backed by
> sqlite.
>

It's funny you mention this because this level of transactions is after
several iterations of simplifying the design for the exact reasons I'm
arguing we should simplify the design here.

> I especially think these APIs are worth it given that they're low cost to
> implement, and add convenience value to users even if implementations
> aren't faster yet.
>

I really don't see much added convenience.  Doing |myCursor.value.id| is
really not that much harder than |myCursor.value|.  And low cost
of implementation is a bad reason to add API surface area.


Given that the key-returning versions of these functions are just
optimizations, at the very least, we should change the names though:

get->getKey (or maybe getPrimaryKey?)
openCursor->openKeyCursor (or maybe openPrimaryKeyCursor?)
getObject->get
openObjectCursor->openCursor
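
As a rough sketch of what the renaming would mean for callers, here is a hypothetical adapter that exposes the proposed names on top of the current ones. `withProposedNames` and `toyIndex` are made up for illustration, the toy index is synchronous (the real IDBIndex methods are asynchronous), and the sketch arbitrarily picks getKey and openKeyCursor over the getPrimaryKey and openPrimaryKeyCursor alternatives.

```javascript
// Hypothetical adapter: maps the proposed names onto an object that
// implements the current names. Illustration only.
function withProposedNames(index) {
  return {
    getKey: (key) => index.get(key), // get -> getKey
    openKeyCursor: (range) => index.openCursor(range), // openCursor -> openKeyCursor
    get: (key) => index.getObject(key), // getObject -> get
    openCursor: (range) => index.openObjectCursor(range), // openObjectCursor -> openCursor
  };
}

// Toy synchronous index using the current names, so the mapping can be run.
const primaryKeys = { Sven: 1 };
const objects = { Sven: { id: 1, name: "Sven" } };
const toyIndex = {
  get: (key) => primaryKeys[key],
  getObject: (key) => objects[key],
  openCursor: () => "key cursor",
  openObjectCursor: () => "object cursor",
};

const renamed = withProposedNames(toyIndex);
console.log(renamed.getKey("Sven"), renamed.get("Sven"));
```

With the new names, the shorter get and openCursor return whole objects, and the key-only variants carry the longer, more explicit names.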

J
Received on Friday, 17 September 2010 09:47:16 UTC