Re: [IndexedDB] Two Real World Use-Cases from Joran Greef on 2011-03-02 (public-webapps@w3.org from January to March 2011)

From: Joran Greef <joran@ronomon.com>
Date: Wed, 2 Mar 2011 08:35:52 +0200
To: Jeremy Orlow <jorlow@chromium.org>
Cc: public-webapps@w3.org
Message-Id: <7B4F539F-14B6-40E2-A4B0-92ECC5D1CDF4@ronomon.com>
On 01 Mar 2011, at 7:27 PM, Jeremy Orlow wrote:

> 1. Be able to put an object and pass an array of index names which must reference the object. This may remove the need for a complicated indexing spec (perhaps the reason why this issue has been pushed into the future) and give developers all the flexibility they need.
> 
> You're talking about having multiple entries in a single index that point towards the same primary key?  If so, then I strongly agree, and I think others agree as well.  It's mostly a question of syntax.  A while ago we brainstormed a couple possibilities.  I'll try to send out a proposal this week.  I think this + compound keys should probably be our last v1 features though.  (Though they almost certainly won't make Chrome 11 or Firefox 4, unfortunately. Hopefully they'll be done in the next version of each, and hopefully that release will be fairly soon after for both.)

Yes, for example this user object { name: "Joran Greef", emails: ["joran@ronomon.com", "jorangreef@gmail.com"] } with indexes on the "emails" property, would be found in the "joran@ronomon.com" index as well as in the "jorangreef@gmail.com" index.
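
To make the example concrete, here is a minimal in-memory sketch of that behaviour, not IDB itself. The names emailIndex and indexUser, and the numeric primary key 1, are assumptions made up for illustration. Each element of the "emails" array becomes its own index entry pointing back at the same primary key:

```typescript
// Hypothetical in-memory sketch, not the IndexedDB API.
type User = { name: string; emails: string[] };

// Maps each index key (an email address) to the primary keys that carry it.
const emailIndex = new Map<string, Set<number>>();

// Adds one index entry per element of user.emails, all pointing at primaryKey.
function indexUser(primaryKey: number, user: User): void {
  for (const email of user.emails) {
    let keys = emailIndex.get(email);
    if (keys === undefined) {
      keys = new Set<number>();
      emailIndex.set(email, keys);
    }
    keys.add(primaryKey);
  }
}

indexUser(1, {
  name: "Joran Greef",
  emails: ["joran@ronomon.com", "jorangreef@gmail.com"],
});
```

Looking up either address in emailIndex now leads to primary key 1.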

What I've been thinking though is that formally specifying indexes in advance of object put calls pushes too much application model logic into the database layer, making the database enforce a schema (at least in terms of indexes). Of course IDB facilitates migrations in the form of setVersion, but most schema migrations are also coupled with changes to the data itself, and this would still have to be done by the application in any event. So at the moment IDB takes too much responsibility on behalf of the application (computing indexes, pre-defined indexes, pseudo migrations) and not enough responsibility for pure database operations (index intersections and index unions).

I would argue that things like migrations and schemas are best handled by the application, even if this is more work for the application, as most people will write wrappers for IDB in any event and IDB is supposed to be a core-level API. The acid test must be that the database is oblivious to schemas or anything pre-defined or application-specific (i.e. stateless). Otherwise IDB risks being a database for newbies who wouldn't use it, and a database that others would treat as a KV anyway (see MySQL at FriendFeed).

A suggested interface for putting or deleting objects would then be: objectStore.put(object, ["indexname1", "indexname2", "indexname3"]), and IDB would need to ensure that the object is referenced by the given index names. When removing the object, the application would need to provide the indexes again (or IDB could keep track of the indexes associated with an object).
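
A rough sketch of how such a store might behave, kept in memory for illustration only. The class name SketchStore, the idsIn helper, and the choice to pass the object id as a separate first argument are all assumptions, not part of any IDB draft. This version takes the second option mentioned above and has the store remember which indexes reference each object, so that delete does not need the index names again:

```typescript
// Hypothetical in-memory sketch of put(object, indexNames); not the IndexedDB API.
class SketchStore<T> {
  private objects = new Map<string, T>();
  // index name -> ids of the objects referenced by that index
  private indexes = new Map<string, Set<string>>();
  // object id -> index names supplied at put time (assumed bookkeeping)
  private memberships = new Map<string, string[]>();

  put(id: string, object: T, indexNames: string[]): void {
    this.delete(id); // drop references left over from an earlier put
    this.objects.set(id, object);
    this.memberships.set(id, indexNames.slice());
    for (const name of indexNames) {
      let ids = this.indexes.get(name);
      if (ids === undefined) {
        ids = new Set<string>();
        this.indexes.set(name, ids);
      }
      ids.add(id);
    }
  }

  delete(id: string): void {
    const names = this.memberships.get(id) || [];
    for (const name of names) {
      const ids = this.indexes.get(name);
      if (ids !== undefined) {
        ids.delete(id);
        if (ids.size === 0) this.indexes.delete(name);
      }
    }
    this.memberships.delete(id);
    this.objects.delete(id);
  }

  // Returns the ids referenced by an index, sorted for stable output.
  idsIn(indexName: string): string[] {
    const ids = this.indexes.get(indexName);
    return ids === undefined ? [] : Array.from(ids).sort();
  }
}

const store = new SketchStore<{ name: string }>();
store.put("user:1", { name: "Joran Greef" }, ["joran@ronomon.com", "jorangreef@gmail.com"]);
```

The store never inspects the object; the application alone decides which indexes reference it.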

Using a function to compute indexes would not work as this would entrap application-specific schema knowledge within the function (which would need to be persisted) and these may subsequently change in the application, which would then need a way to modify the function again. The key is that these things must be stateless.

The objects must be opaque to IDB (no need for serialization/deserialization overhead at the DB layer). Things like key-paths etc. could be removed and the object id just passed in to put or delete calls.

> 2. Be able to intersect and union indexes. This covers a tremendous amount of ground in terms of authorization and filtering.
> 
> Our plan was to punt some sort of join language to v2.  Could you give a more concrete proposal for what we'd add?  It'd make it easier to see if it's something realistic for v1 or not.

If you can perform intersect or union operations (and combinations of these) on indexes (which are essentially sets or sorted sets), then this would be the join language. It has the benefit that the interface would then be described in terms of operations on data structures (set operations on sets) rather than a custom language which would take longer to spec out.

I've written databases over append-only files, S3, WebSQL and even LocalStorage (!) and from what I've found with my own applications, you could handle everything from multi-tenant authorization to adequate filtering with the following operations:

1. intersect([ index1, index2 ])
2. union([ index1, index2 ])
3. intersect([ union([ index1, index2 ]), index3, index4, index5, index6, index7 ])
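
As an illustration of the three operations above, here is a minimal sketch that treats each index as a plain set of object ids. The index names (admins, editors, tenantA, active) and the ids are invented for the example, and a real implementation over sorted sets would more likely merge cursors than build sets in memory:

```typescript
// Hypothetical sketch: each index is modelled as a set of object ids.
type IndexSet = Set<string>;

// Ids present in at least one of the given indexes.
function union(sets: IndexSet[]): IndexSet {
  const result = new Set<string>();
  for (const set of sets) {
    for (const id of set) result.add(id);
  }
  return result;
}

// Ids present in every one of the given indexes.
// Assumption: intersecting an empty list yields an empty set.
function intersect(sets: IndexSet[]): IndexSet {
  const result = new Set<string>();
  if (sets.length === 0) return result;
  // Scan the smallest set so the fewest membership checks are needed.
  const sorted = sets.slice().sort((a, b) => a.size - b.size);
  for (const id of sorted[0]) {
    if (sorted.every((s) => s.has(id))) result.add(id);
  }
  return result;
}

const admins: IndexSet = new Set(["u1", "u2"]);
const editors: IndexSet = new Set(["u3"]);
const tenantA: IndexSet = new Set(["u2", "u3", "u4"]);
const active: IndexSet = new Set(["u3", "u4"]);

// Operation 3: (admins OR editors) AND tenantA AND active.
const allowed = intersect([union([admins, editors]), tenantA, active]);
```

Here the union acts as a role filter and the intersection scopes the result to one tenant, which is the multi-tenant authorization case mentioned above.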

Hopefully, a join language described in terms of pure set operations would be much simpler to implement and easier to use and reason with.

In fact I think if IDB offered only a single object store and the indexing system described above, it would be completely perfect. That's all that's needed. No need for a V2. Just a focus on high performance thereafter.
Received on Wednesday, 2 March 2011 06:36:30 UTC