Re: [IndexedDB] Compound and multiple keys from Jeremy Orlow on 2011-01-20 (public-webapps@w3.org from January to March 2011)

From: Jeremy Orlow <jorlow@chromium.org>
Date: Thu, 20 Jan 2011 10:33:35 +0000
To: Keean Schupke <keean@fry-it.com>
Cc: Jonas Sicking <jonas@sicking.cc>, Webapps WG <public-webapps@w3.org>
Message-ID: <AANLkTinweMQFhXzwjOzwNBi3HfgYskgKqqgvsG+jMfKY@mail.gmail.com>
Ok.  So what's the resolution?  Let's bug it!

On Fri, Dec 10, 2010 at 12:34 PM, Jeremy Orlow <jorlow@chromium.org> wrote:

> Any other thoughts on this issue?
>
>
> On Thu, Dec 2, 2010 at 7:19 AM, Keean Schupke <keean@fry-it.com> wrote:
>
>> I think I prefer A. Declaring the keys in advance is stating to sound a
>> little like a schema, and when you go down that route you end up at SQL
>> schemas (which is a good thing in my opinion). I understand however that
>> some people are not so comfortable with the idea of a schema, and these
>> people seem to be the kind of people that like IndexedDB. So, although I
>> prefer A for me, I would have to say B for IndexedDB.
>>
>> So in conclusion: I think "B" is the better choice for IndexedDB, as it is
>> more consistent with the design of IDB.
>>
>> As for the cons of "B", sorting an array is just like sorting a string,
>> and it already supports string types.
>>
>> Surely there is also option "C":
>>
>> store.add({firstName: "Benny", lastName: "Zysk", age: 28}, ["firstName",
>> "lastName"]);
>> store.add({firstName: "Benny", lastName: "Andersson", age:
>> 63}, ["firstName", "lastName"]);
>>
>> Like "A", but listing the properties to include in the composite index
>> with each add, therefore avoiding the "schema"...
>>
>>
>> As for layering the Relational API over the top, It doesn't make any
>> difference, but I would prefer whichever has the best performance.
>>
>>
>> Cheers,
>> Keean.
>>
>>
>> On 2 December 2010 00:57, Jonas Sicking <jonas@sicking.cc> wrote:
>>
>>> Hi IndexedDB fans (yay!!),
>>>
>>> Problem description:
>>>
>>> One of the current shortcomings of IndexedDB is that it doesn't
>>> support compound indexes. I.e. indexing on more than one value. For
>>> example it's impossible to index on, and therefor efficiently search
>>> for, firstname and lastname in an objectStore which stores people. Or
>>> index on to-address and date sent in an objectStore holding emails.
>>>
>>> The way this is traditionally done is that multiple values are used as
>>> key for each individual entry in an index or objectStore. For example
>>> the CREATE INDEX statement in SQL can list multiple columns, and
>>> CREATE TABLE statment can list several columns as PRIMARY KEY.
>>>
>>> There have been a couple of suggestions how to do this in IndexedDB
>>>
>>> Option A)
>>> When specifying a key path in createObjectStore and createIndex, allow
>>> an array of key-paths to be specified. Such as
>>>
>>> store = db.createObjectStore("mystore", ["firstName", "lastName"]);
>>> store.add({firstName: "Benny", lastName: "Zysk", age: 28});
>>> store.add({firstName: "Benny", lastName: "Andersson", age: 63});
>>> store.add({firstName: "Charlie", lastName: "Brown", age: 8});
>>>
>>> The records are stored in the following order
>>> "Benny", "Andersson"
>>> "Benny", "Zysk"
>>> "Charlie", "Brown"
>>>
>>> Similarly, createIndex accepts the same syntax:
>>> store.createIndex("myindex", ["lastName", "age"]);
>>>
>>> Option B)
>>> Allowing arrays as an additional data type for keys.
>>> store = db.createObjectStore("mystore", "fullName");
>>> store.add({fullName: ["Benny", "Zysk"], age: 28});
>>> store.add({fullName: ["Benny", "Andersson"], age: 63});
>>> store.add({fullName: ["Charlie", "Brown"], age: 8});
>>>
>>> Also allows out-of-line keys using:
>>> store = db.createObjectStore("mystore");
>>> store.add({age: 28}, ["Benny", "Zysk"]);
>>> store.add({age: 63}, ["Benny", "Andersson"]);
>>> store.add({age: 8}, ["Charlie", "Brown"]);
>>>
>>> (the sort order here is the same as in option A).
>>>
>>> Similarly, if an index pointed used a keyPath which points to an
>>> array, this would create an entry in the index which used a compound
>>> key consisting of the values in the array.
>>>
>>> There are of course advantages and disadvantages with both options.
>>>
>>> Option A advantages:
>>> * Ensures that at objectStore/index creation time the number of keys
>>> are known. This allows the implementation to create and optimize the
>>> index using this information. This is especially useful in situations
>>> when the indexedDB implementation is backed by a SQL database which
>>> uses columns as a way to represent multiple keys.
>>> * Easy to use when key values appear as separate properties on the
>>> stored object.
>>> * Obvious how to sort entries.
>>>
>>> Option A disadvantages:
>>> * Doesn't allow compound out-of-line keys.
>>> * Requires multiple properties to be added to stored objects if the
>>> components of the key isn't available there (for example if it's
>>> out-of-line or stored in an array).
>>>
>>> Option B advantages:
>>> * Allows compound out-of-line keys.
>>> * Easy to use when the key values are handled as an array by other
>>> code. Both when using in-line and out-of-line keys.
>>> * Maximum flexibility since you can combine single-value keys and
>>> compound keys in one objectStore, as well as arrays of different
>>> length (we couldn't come up with use cases for this though).
>>>
>>> Option B disadvantages:
>>> * Requires defining sorting between single values and arrays, as well
>>> as between arrays of different length.
>>> * Requires a single property to be added to stored objects if the key
>>> isn't available there (for example if it's stored as separate
>>> properties).
>>>
>>> There is of course a third alternative: Do both Option A and Option B.
>>> This brings most of the advantages of both options, but also many of
>>> the disadvantages of both. It also adds a lot of API surface which
>>> could conflict with future features, so it's something I'd really like
>>> to avoid.
>>>
>>>
>>> Questions:
>>>
>>> The main question we had if there is a use case for having different
>>> number of compound-key-values for the entries in a index or
>>> objectStore? I.e. is there a case when you, in one objectStore, want
>>> to have one record with a compound key consisting of 2 different
>>> values, and another record consisting of 3, 4 or 5?
>>>
>>> In all the cases where I've used compound keys, each key-part has been
>>> vital. For example a table storing sale totals by quarter, country and
>>> price class. In this case it's obviously always going to be 3 parts to
>>> the compound key. Does anyone have counter examples?
>>>
>>> Similarly, are there use cases which require compound keys that
>>> doesn't have a hard limit on the number of values? I.e. where you
>>> could find out more and more detail about an item and describe that by
>>> adding additional values to the key.
>>>
>>> Another question is if there are databases out there which allow using
>>> arrays as keys, similar to option B above. It seems particularly
>>> likely to find "NoSQL" databases that uses this. None of the SQL
>>> databases we looked at allowed keying off of arrays, which isn't
>>> terribly surprising since SQL databases tend to create compound keys
>>> using separate columns, rather than multiple values in a single
>>> column.
>>>
>>>
>>> Suggested solutions:
>>>
>>> I'm currently leaning towards option A above. However I'd love to get
>>> input from people with more database experience than me (especially
>>> since mine is very SQL based), before that I don't have strong
>>> opinions either way.
>>>
>>> / Jonas
>>>
>>>
>>
>
Received on Thursday, 20 January 2011 10:52:56 UTC