Re: [IndexedDB] Two Real World Use-Cases from Keean Schupke on 2011-03-02 (public-webapps@w3.org from January to March 2011)

From: Keean Schupke <keean@fry-it.com>
Date: Wed, 2 Mar 2011 11:53:37 +0000
To: Jonas Sicking <jonas@sicking.cc>
Cc: Joran Greef <joran@ronomon.com>, Jeremy Orlow <jorlow@chromium.org>, public-webapps@w3.org
Message-ID: <AANLkTikZwbnU4QYezTcE=gtvvrJ=ouRLXW5e5THobc9p@mail.gmail.com>
On 2 March 2011 11:31, Jonas Sicking <jonas@sicking.cc> wrote:

> On Tue, Mar 1, 2011 at 10:35 PM, Joran Greef <joran@ronomon.com> wrote:
> > On 01 Mar 2011, at 7:27 PM, Jeremy Orlow wrote:
> >
> >> 1. Be able to put an object and pass an array of index names which must
> >> reference the object. This may remove the need for a complicated indexing
> >> spec (perhaps the reason why this issue has been pushed into the future)
> >> and give developers all the flexibility they need.
> >>
> >> You're talking about having multiple entries in a single index that
> >> point towards the same primary key?  If so, then I strongly agree, and I
> >> think others agree as well.  It's mostly a question of syntax.  A while
> >> ago we brainstormed a couple possibilities.  I'll try to send out a
> >> proposal this week.  I think this + compound keys should probably be our
> >> last v1 features though.  (Though they almost certainly won't make Chrome
> >> 11 or Firefox 4, unfortunately, hopefully they'll be done in the next
> >> version of each, and hopefully that release will be fairly soon after for
> >> both.)
> >
> > Yes, for example this user object { name: "Joran Greef", emails:
> > ["joran@ronomon.com", "jorangreef@gmail.com"] } with indexes on the
> > "emails" property, would be found in the "joran@ronomon.com" index as well
> > as in the "jorangreef@gmail.com" index.
> >
> > What I've been thinking though is that the problem even with formally
> > specifying indexes in advance of object put calls, is that this pushes too
> > much application model logic into the database layer, making the database
> > enforce a schema (at least in terms of indexes). Of course IDB facilitates
> > migrations in the form of setVersion, but most schema migrations are also
> > coupled with changes to the data itself, and this would still have to be
> > done by the application in any event. So at the moment IDB takes too much
> > responsibility on behalf of the application (computing indexes,
> > pre-defined indexes, pseudo migrations) and not enough responsibility for
> > pure database operations (index intersections and index unions).
> >
> > I would argue that things like migrations and schemas are best handled by
> > the application, even if this is more work for the application, as most
> > people will write wrappers for IDB in any event and IDB is supposed to be
> > a core-level API. The acid-test must be that the database is oblivious to
> > schemas or anything pre-defined or application-specific (i.e. stateless).
> > Otherwise IDB risks being a database for newbies who wouldn't use it, and
> > a database that others would treat as a KV anyway (see MySQL at
> > FriendFeed).
> >
> > A suggested interface then for putting or deleting objects, would be:
> > objectStore.put(object, ["indexname1", "indexname2", "indexname3"]) and
> > then IDB would need to ensure that the object would be referenced by the
> > given index names. When removing the object, the application would need to
> > provide the indexes again (or IDB could keep track of the indexes
> > associated with an object).
> >
> > Using a function to compute indexes would not work as this would entrap
> > application-specific schema knowledge within the function (which would
> > need to be persisted) and these may subsequently change in the
> > application, which would then need a way to modify the function again. The
> > key is that these things must be stateless.
> >
> > The objects must be opaque to IDB (no need for
> > serialization/deserialization overhead at the DB layer). Things like
> > key-paths etc. could be removed and the object id just passed in to put or
> > delete calls.
>
> I agree that we are currently enforcing a bit of schema due to the way
> indexes work. However I think it's a good approach for an initial
> version of this API as it covers the most simple use cases. Note that
> the more complex use cases are still very possible by simply using a
> separate objectStore as an index and manually add/remove things there.
>
> I still believe that using a function, which is persisted in the
> database, is very doable. And yes, the function needs to be stateless
> and it needs to be possible to change the set of functions which
> manage the set of indexes associated with a given objectStore
> (probably by simply allowing indexes to be created and removed, which
> is already the case).
>
> / Jonas
>
>
I would recommend against storing functions in the database. I am not saying
it should not be possible, but stored procedures obscure functionality and
cause surprises, both of which are bad things IMHO. For this kind of thing I
would create a master index from object-id to object, and then create
multiple secondary indexes from property to object-id. Removing an object is
then simply removing it from the master index. You would avoid the slow scan
of the secondary indexes (slow because you have to visit each object to
delete by value) by simply leaving the entries there. They would be filtered
out of any results because the object-id is no longer in the master index (a
fast lookup). You would then occasionally do a scan of the secondary indexes
to remove several dead references in one go (maybe when short on space).
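
As a rough illustration of the scheme above, here is a minimal in-memory
sketch in TypeScript. It is not IndexedDB code: plain Maps stand in for object
stores, and the names LazyIndexedStore, add, remove, find and compact are all
hypothetical. It also assumes that object-ids are allocated by the store and
never reused, so a stale secondary entry can never point at a different live
object.

```typescript
// Hypothetical in-memory model: plain Maps stand in for IndexedDB object stores.
type ObjectId = number;
type Stored = Record<string, unknown>;

class LazyIndexedStore {
  private nextId: ObjectId = 1; // ids are never reused (assumption of this sketch)
  // Master index: object-id -> object.
  private master = new Map<ObjectId, Stored>();
  // Secondary indexes: property name -> (value -> set of object-ids).
  private secondary = new Map<string, Map<string, Set<ObjectId>>>();

  // Return the id set for one (property, value) pair, creating it if needed.
  private idsFor(prop: string, key: string): Set<ObjectId> {
    let index = this.secondary.get(prop);
    if (!index) {
      index = new Map<string, Set<ObjectId>>();
      this.secondary.set(prop, index);
    }
    let ids = index.get(key);
    if (!ids) {
      ids = new Set<ObjectId>();
      index.set(key, ids);
    }
    return ids;
  }

  // Store an object and reference it from the named secondary indexes.
  // An array-valued property yields one index entry per element.
  add(obj: Stored, indexedProps: string[]): ObjectId {
    const id = this.nextId++;
    this.master.set(id, obj);
    indexedProps.forEach((prop) => {
      const raw = obj[prop];
      if (raw === undefined || raw === null) return; // nothing to index
      const values: unknown[] = Array.isArray(raw) ? raw : [raw];
      values.forEach((v) => this.idsFor(prop, String(v)).add(id));
    });
    return id;
  }

  // Removal touches only the master index; secondary entries are left behind.
  remove(id: ObjectId): void {
    this.master.delete(id);
  }

  // Look up by property value, filtering out ids missing from the master index.
  find(prop: string, value: string): Stored[] {
    const results: Stored[] = [];
    const index = this.secondary.get(prop);
    const ids = index ? index.get(value) : undefined;
    if (!ids) return results;
    ids.forEach((id) => {
      const obj = this.master.get(id);
      if (obj !== undefined) results.push(obj);
    });
    return results;
  }

  // Occasional scan: drop dead references and return how many were removed.
  compact(): number {
    let removed = 0;
    this.secondary.forEach((index) => {
      index.forEach((ids, key) => {
        ids.forEach((id) => {
          if (!this.master.has(id)) {
            ids.delete(id);
            removed++;
          }
        });
        if (ids.size === 0) index.delete(key);
      });
    });
    return removed;
  }
}
```

In this sketch remove() costs a single master-index delete, find() pays one
extra master lookup per candidate id, and compact() is the only operation that
walks the secondary indexes.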

To do better than this, we would need proper multiple index support (where a
single object is indexed on multiple properties), where each index is a
'column' and you get 'rows' of properties which belong to the same object. At
which point you are well on the way to re-inventing the relational database.
(Note: there is no requirement that a relational database has a schema;
that's just what SQL does. Objects that do not have an indexed property would
have 'null's in that column.)
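
To make the 'columns' and 'rows' picture concrete, a small TypeScript sketch
follows. The names Obj, Row and toRows are hypothetical, and the list of
indexed properties is supplied by the caller rather than by any stored schema.

```typescript
// Hypothetical sketch: each indexed property is a 'column', each object a 'row'.
type Obj = Record<string, unknown>;
type Row = Record<string, unknown>; // null marks a property the object lacks

function toRows(objects: Obj[], columns: string[]): Row[] {
  return objects.map((obj) => {
    const row: Row = {};
    columns.forEach((col) => {
      // Objects without the indexed property get null in that column.
      row[col] = Object.prototype.hasOwnProperty.call(obj, col) ? obj[col] : null;
    });
    return row;
  });
}
```

Nothing here fixes the set of columns in advance; a different column list can
be passed on every call, which is the sense in which no schema is required.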

At this point it would pay to look at the extensive formalisations around
relational databases and relational algebra to make sure you don't re-invent
the wheel, and that any API developed is minimal, flexible, and familiar.


Cheers,
Keean.
Received on Wednesday, 2 March 2011 11:54:10 UTC