- From: Jeremy Orlow <jorlow@chromium.org>
- Date: Fri, 18 Mar 2011 10:48:30 -0700
- To: Keean Schupke <keean@fry-it.com>
- Cc: bugzilla@jessica.w3.org, public-webapps@w3.org
- Message-ID: <AANLkTi=ojmWQDZgqhbjT1p5AF+4aGWQC7w7M-F6LumJW@mail.gmail.com>
On Fri, Mar 18, 2011 at 1:45 AM, Keean Schupke <keean@fry-it.com> wrote:

> I like BDB's solution. You have one primary key you cannot mess with (say, an integer for fast comparisons), and you can then add any number of secondary indexes. With a secondary index there is a callback to generate a binary blob that is used for indexing. The callback has access to all the fields of the object plus any info in the closure and can use that to generate the index data any way it likes.

We discussed this a while ago. IIRC, we decided to look at something like it for v2. It sounds like a good, general way to solve the problem though. And given the other discussion in this thread, it sounds like maybe this isn't a super important use case to fix in the meantime.

J

> This has the advantage of supporting any indexing schemes the user may wish to implement (by writing a custom callback), whilst allowing a few common options to be provided for the user (say a hash of all fields, or a field name, international char set, and direction captured in a closure). The user gets the power, the core implementation is simple, and common cases can be implemented in an easy-to-use way.
>
> var lex_order = function(field, charset, direction) {return function(object) {/* map indexed 'field' to blob in required order */ return key;};};
>
> Then create a new index:
>
> object_store.validate_index(1, lex_order('name', 'us', 'ascending')).on_done(function(status) {/* status ok or error */})
>
> validate_index checks if the requested secondary index (1) exists. If it does not, it creates the index and calls the done callback (with a status code indicating successful creation); if it does exist and passes some validation checks, it also calls the done callback (with a status code indicating successful validation). If anything goes wrong with either the creation or validation of the secondary index, it would call the done callback with an error status code.
>
>
> Cheers,
> Keean.
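For concreteness, a minimal sketch of the callback-based secondary index described above, assuming the proposal's hypothetical validate_index/on_done API (neither exists in the IndexedDB drafts) and an object_store obtained elsewhere; the byte-wise encoding inside lex_order is just one possible way a callback could map a field to an ordered binary blob:

    // lex_order returns the indexing callback: it maps a stored object to a
    // binary blob whose byte-wise order matches the requested field order.
    var lex_order = function(field, charset, direction) {
      return function(object) {
        var value = String(object[field]);        // charset handling elided
        var key = new Uint8Array(value.length);
        for (var i = 0; i < value.length; i++) {
          var code = value.charCodeAt(i) & 0xff;
          // Invert the bytes for descending order so that a plain byte-wise
          // comparison still sorts entries the way the caller asked for.
          key[i] = (direction === 'descending') ? 0xff - code : code;
        }
        return key;
      };
    };

    // Hypothetical call from the proposal: create or validate secondary index 1.
    object_store.validate_index(1, lex_order('name', 'us', 'ascending'))
      .on_done(function(status) {
        if (status !== 'ok') { /* creation or validation failed */ }
      });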
> On 18 March 2011 02:03, Jeremy Orlow <jorlow@chromium.org> wrote:
>
>> Here's one ugliness with A: There's no way to specify ascending or descending for the individual components of the key. So there's no way for me to open a cursor that looks at one field ascending and the other field descending. In addition, I can't think of any easy/good ways to hack around this.
>>
>> Any thoughts on how we could address this use case?
>>
>> J
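A short sketch of that limitation, using the Option A index syntax from the bug quoted below; the store and index names are illustrative, and "prev" stands in for whatever the draft calls its reverse cursor direction. The point is that a single direction applies to the compound key as a whole:

    // Compound index from Option A below: sorts by lastName, then by age.
    store.createIndex("by_name_age", ["lastName", "age"]);
    var index = store.index("by_name_age");

    // One cursor direction applies to the whole compound key:
    var asc  = index.openCursor();             // ["Andersson", 63], ["Brown", 8], ["Zysk", 28]
    var desc = index.openCursor(null, "prev"); // ["Zysk", 28], ["Brown", 8], ["Andersson", 63]

    // "lastName ascending with age descending within each lastName" cannot be
    // expressed: reversing the direction flips the ordering of every component.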
>> On Wed, Mar 16, 2011 at 4:50 PM, <bugzilla@jessica.w3.org> wrote:
>>
>>> http://www.w3.org/Bugs/Public/show_bug.cgi?id=12321
>>>
>>> Summary: Add compound keys to IndexedDB
>>> Product: WebAppsWG
>>> Version: unspecified
>>> Platform: PC
>>> OS/Version: All
>>> Status: NEW
>>> Severity: normal
>>> Priority: P2
>>> Component: Indexed Database API
>>> AssignedTo: dave.null@w3.org
>>> ReportedBy: jorlow@chromium.org
>>> QAContact: member-webapi-cvs@w3.org
>>> CC: mike@w3.org, public-webapps@w3.org
>>>
>>> From the thread "[IndexedDB] Compound and multiple keys" by Jonas Sicking, we're going to go with both options A and B.
>>>
>>> =========================
>>>
>>> Hi IndexedDB fans (yay!!),
>>>
>>> Problem description:
>>>
>>> One of the current shortcomings of IndexedDB is that it doesn't support compound indexes, i.e. indexing on more than one value. For example, it's impossible to index on, and therefore efficiently search for, firstname and lastname in an objectStore which stores people, or to index on to-address and date sent in an objectStore holding emails.
>>>
>>> The way this is traditionally done is that multiple values are used as the key for each individual entry in an index or objectStore. For example, the CREATE INDEX statement in SQL can list multiple columns, and the CREATE TABLE statement can list several columns as PRIMARY KEY.
>>>
>>> There have been a couple of suggestions for how to do this in IndexedDB.
>>>
>>> Option A)
>>> When specifying a key path in createObjectStore and createIndex, allow an array of key paths to be specified, such as:
>>>
>>> store = db.createObjectStore("mystore", ["firstName", "lastName"]);
>>> store.add({firstName: "Benny", lastName: "Zysk", age: 28});
>>> store.add({firstName: "Benny", lastName: "Andersson", age: 63});
>>> store.add({firstName: "Charlie", lastName: "Brown", age: 8});
>>>
>>> The records are stored in the following order:
>>> "Benny", "Andersson"
>>> "Benny", "Zysk"
>>> "Charlie", "Brown"
>>>
>>> Similarly, createIndex accepts the same syntax:
>>> store.createIndex("myindex", ["lastName", "age"]);
>>>
>>> Option B)
>>> Allowing arrays as an additional data type for keys.
>>>
>>> store = db.createObjectStore("mystore", "fullName");
>>> store.add({fullName: ["Benny", "Zysk"], age: 28});
>>> store.add({fullName: ["Benny", "Andersson"], age: 63});
>>> store.add({fullName: ["Charlie", "Brown"], age: 8});
>>>
>>> This also allows out-of-line keys, using:
>>>
>>> store = db.createObjectStore("mystore");
>>> store.add({age: 28}, ["Benny", "Zysk"]);
>>> store.add({age: 63}, ["Benny", "Andersson"]);
>>> store.add({age: 8}, ["Charlie", "Brown"]);
>>>
>>> (The sort order here is the same as in option A.)
>>>
>>> Similarly, if an index used a keyPath which points to an array, this would create an entry in the index which used a compound key consisting of the values in the array.
>>>
>>> There are of course advantages and disadvantages with both options.
>>>
>>> Option A advantages:
>>> * Ensures that at objectStore/index creation time the number of keys is known. This allows the implementation to create and optimize the index using this information. This is especially useful in situations when the indexedDB implementation is backed by a SQL database which uses columns as a way to represent multiple keys.
>>> * Easy to use when key values appear as separate properties on the stored object.
>>> * Obvious how to sort entries.
>>>
>>> Option A disadvantages:
>>> * Doesn't allow compound out-of-line keys.
>>> * Requires multiple properties to be added to stored objects if the components of the key aren't available there (for example if the key is out-of-line or stored in an array).
>>>
>>> Option B advantages:
>>> * Allows compound out-of-line keys.
>>> * Easy to use when the key values are handled as an array by other code, both when using in-line and out-of-line keys.
>>> * Maximum flexibility, since you can combine single-value keys and compound keys in one objectStore, as well as arrays of different lengths (we couldn't come up with use cases for this though).
>>>
>>> Option B disadvantages:
>>> * Requires defining sorting between single values and arrays, as well as between arrays of different lengths.
>>> * Requires a single property to be added to stored objects if the key isn't available there (for example if it's stored as separate properties).
>>>
>>> There is of course a third alternative: do both Option A and Option B. This brings most of the advantages of both options, but also many of the disadvantages of both. It also adds a lot of API surface which could conflict with future features, so it's something I'd really like to avoid.
>>>
>>> Questions:
>>>
>>> The main question we had is whether there is a use case for having a different number of compound-key values for the entries in an index or objectStore. I.e. is there a case when you, in one objectStore, want to have one record with a compound key consisting of 2 different values, and another record with a key consisting of 3, 4 or 5?
>>>
>>> In all the cases where I've used compound keys, each key part has been vital. For example, a table storing sale totals by quarter, country and price class. In this case there are obviously always going to be 3 parts to the compound key. Does anyone have counter examples?
>>>
>>> Similarly, are there use cases which require compound keys that don't have a hard limit on the number of values? I.e. where you could find out more and more detail about an item and describe that by adding additional values to the key.
>>>
>>> Another question is whether there are databases out there which allow using arrays as keys, similar to option B above. It seems particularly likely to find "NoSQL" databases that use this. None of the SQL databases we looked at allowed keying off of arrays, which isn't terribly surprising since SQL databases tend to create compound keys using separate columns, rather than multiple values in a single column.
>>>
>>> Suggested solutions:
>>>
>>> I'm currently leaning towards option A above. However, I'd love to get input from people with more database experience than me (especially since mine is very SQL based); until then I don't have strong opinions either way.
>>>
>>> --
>>> Configure bugmail: http://www.w3.org/Bugs/Public/userprefs.cgi?tab=email
>>> ------- You are receiving this mail because: -------
>>> You are on the CC list for the bug.
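Since the resolution noted in the bug above is to go with both options, a brief sketch of how the two proposals combine, following the syntax from the examples quoted above (db is assumed to come from an open database as in those examples; store names, record values and the address are illustrative). Option A supplies a compound key path, while Option B lets an array value act directly as a key, for example as an out-of-line key:

    // Option A: compound key path on the object store.
    var people = db.createObjectStore("people", ["firstName", "lastName"]);
    people.add({firstName: "Benny", lastName: "Zysk", age: 28});
    people.add({firstName: "Benny", lastName: "Andersson", age: 63});

    // Option B: an array used directly as an (out-of-line) key, here the
    // to-address and date sent from the problem description.
    var emails = db.createObjectStore("emails");
    emails.add({subject: "Re: compound keys"}, ["benny@example.com", new Date(2011, 2, 18)]);

    // Both stores sort their keys component by component, so the people
    // records come back as ["Benny", "Andersson"] then ["Benny", "Zysk"].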
Received on Friday, 18 March 2011 17:49:20 UTC