[IndexedDB] Compound and multiple keys

Hi IndexedDB fans (yay!!),

Problem description:

One of the current shortcomings of IndexedDB is that it doesn't
support compound indexes. I.e. indexing on more than one value. For
example it's impossible to index on, and therefor efficiently search
for, firstname and lastname in an objectStore which stores people. Or
index on to-address and date sent in an objectStore holding emails.

The way this is traditionally done is that multiple values are used as
key for each individual entry in an index or objectStore. For example
the CREATE INDEX statement in SQL can list multiple columns, and
CREATE TABLE statment can list several columns as PRIMARY KEY.

There have been a couple of suggestions how to do this in IndexedDB

Option A)
When specifying a key path in createObjectStore and createIndex, allow
an array of key-paths to be specified. Such as

store = db.createObjectStore("mystore", ["firstName", "lastName"]);
store.add({firstName: "Benny", lastName: "Zysk", age: 28});
store.add({firstName: "Benny", lastName: "Andersson", age: 63});
store.add({firstName: "Charlie", lastName: "Brown", age: 8});

The records are stored in the following order
"Benny", "Andersson"
"Benny", "Zysk"
"Charlie", "Brown"

Similarly, createIndex accepts the same syntax:
store.createIndex("myindex", ["lastName", "age"]);

Option B)
Allowing arrays as an additional data type for keys.
store = db.createObjectStore("mystore", "fullName");
store.add({fullName: ["Benny", "Zysk"], age: 28});
store.add({fullName: ["Benny", "Andersson"], age: 63});
store.add({fullName: ["Charlie", "Brown"], age: 8});

Also allows out-of-line keys using:
store = db.createObjectStore("mystore");
store.add({age: 28}, ["Benny", "Zysk"]);
store.add({age: 63}, ["Benny", "Andersson"]);
store.add({age: 8}, ["Charlie", "Brown"]);

(the sort order here is the same as in option A).

Similarly, if an index pointed used a keyPath which points to an
array, this would create an entry in the index which used a compound
key consisting of the values in the array.

There are of course advantages and disadvantages with both options.

Option A advantages:
* Ensures that at objectStore/index creation time the number of keys
are known. This allows the implementation to create and optimize the
index using this information. This is especially useful in situations
when the indexedDB implementation is backed by a SQL database which
uses columns as a way to represent multiple keys.
* Easy to use when key values appear as separate properties on the
stored object.
* Obvious how to sort entries.

Option A disadvantages:
* Doesn't allow compound out-of-line keys.
* Requires multiple properties to be added to stored objects if the
components of the key isn't available there (for example if it's
out-of-line or stored in an array).

Option B advantages:
* Allows compound out-of-line keys.
* Easy to use when the key values are handled as an array by other
code. Both when using in-line and out-of-line keys.
* Maximum flexibility since you can combine single-value keys and
compound keys in one objectStore, as well as arrays of different
length (we couldn't come up with use cases for this though).

Option B disadvantages:
* Requires defining sorting between single values and arrays, as well
as between arrays of different length.
* Requires a single property to be added to stored objects if the key
isn't available there (for example if it's stored as separate
properties).

There is of course a third alternative: Do both Option A and Option B.
This brings most of the advantages of both options, but also many of
the disadvantages of both. It also adds a lot of API surface which
could conflict with future features, so it's something I'd really like
to avoid.


Questions:

The main question we had if there is a use case for having different
number of compound-key-values for the entries in a index or
objectStore? I.e. is there a case when you, in one objectStore, want
to have one record with a compound key consisting of 2 different
values, and another record consisting of 3, 4 or 5?

In all the cases where I've used compound keys, each key-part has been
vital. For example a table storing sale totals by quarter, country and
price class. In this case it's obviously always going to be 3 parts to
the compound key. Does anyone have counter examples?

Similarly, are there use cases which require compound keys that
doesn't have a hard limit on the number of values? I.e. where you
could find out more and more detail about an item and describe that by
adding additional values to the key.

Another question is if there are databases out there which allow using
arrays as keys, similar to option B above. It seems particularly
likely to find "NoSQL" databases that uses this. None of the SQL
databases we looked at allowed keying off of arrays, which isn't
terribly surprising since SQL databases tend to create compound keys
using separate columns, rather than multiple values in a single
column.


Suggested solutions:

I'm currently leaning towards option A above. However I'd love to get
input from people with more database experience than me (especially
since mine is very SQL based), before that I don't have strong
opinions either way.

/ Jonas

Received on Thursday, 2 December 2010 00:58:09 UTC