Re: [IndexedDB] Two Real World Use-Cases

On Thu, Mar 3, 2011 at 4:15 AM, Joran Greef <joran@ronomon.com> wrote:

> Hi Jonas
>
> I have been trying out your suggestion of using a separate object store to
> do manual indexing (and so support compound indexes or index object
> properties with arrays as values).
>
> There are some problems with this approach:
>
> 1. It's far too slow. To put an object and insert 50 index records (typical
> when updating an inverted index) this way takes 100ms using IDB versus 10ms
> using WebSQL (with a separate indexes table and compound primary key on
> index name and object key). For instance, my application has a real
> requirement to replicate 4,000,000 emails between client and server and I
> would not be prepared to accept latencies of 100ms to store each object.
> That's more than the network latency.
>


This doesn't seem right. Assuming your WebSQL implementation had all the
same indexes, isn't it doing pretty much the same things as using separate
objectStores in IDB? Why would it be an order of magnitude slower? I'm sure
whatever implementation you're using hasn't seen much optimization but you
seem to be implying there's something more fundamental? The only thing I can
think of to blame would be the fat in the objectStore interface -- like, for
instance, the index building facilities. It seems to me your proposed
solution is to add yet more fat to the interface (more complex indexing),
but wouldn't it be just as suitable to instead strip down objectStores to
their bare essentials to make them more suitable to act as indexes? Then the
indexing functionality and all the hard decisions could be punted to
libraries where they'd be free to innovate.

This would simplify the spec greatly by lopping things like keyranges and
schema enforcement out completely, not to mention the entire index cursor
API that is similar to but slightly different than objectStore cursors. And
yes, in combination with some fundamental relational operations (as you
already proposed), and some stats, you could easily implement a relational
db. And I don't see anything wrong with that. You could implement all kinds
of databases -- and isn't that the point of this API? Is there any real
limitation that I'm missing to making objectStores suitable as indexes --
performance or otherwise? Namespace collisions come to mind (because of all the
new objectStores that would have to be managed) but conventions seem to have
served the world of sql just fine for this.
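
As a rough sketch of what this might look like from a library's side, assuming
one plain objectStore per index, named by convention. The `__idx__` separator
and the `indexStoreName` and `putWithIndexRecords` helpers are made up for
illustration. The sketch also assumes the stores already exist with out-of-line
keys, that the implementation accepts array keys, and that the transaction mode
is spelled "readwrite" (current drafts may use a constant such as
IDBTransaction.READ_WRITE instead):

```javascript
// Hypothetical convention: one plain objectStore per index, named after
// the source store, e.g. "emails__idx__words".
function indexStoreName(storeName, indexName) {
  return storeName + "__idx__" + indexName;
}

// Put an object and its index records in a single read/write transaction.
// `db` is an open IDBDatabase; `indexes` maps index name -> array of index keys.
// Assumes every store involved was created beforehand with out-of-line keys.
function putWithIndexRecords(db, storeName, key, value, indexes) {
  var names = [storeName];
  Object.keys(indexes).forEach(function (indexName) {
    names.push(indexStoreName(storeName, indexName));
  });
  var tx = db.transaction(names, "readwrite");
  tx.objectStore(storeName).put(value, key);
  Object.keys(indexes).forEach(function (indexName) {
    var store = tx.objectStore(indexStoreName(storeName, indexName));
    indexes[indexName].forEach(function (indexKey) {
      // Keyed by [indexKey, key] so one index key can point at many source
      // objects; the stored value is just the pointer back to the source.
      store.put(key, [indexKey, key]);
    });
  });
  return tx;
}
```

Something like putWithIndexRecords(db, "emails", 7, email, { words: ["hi",
"there"] }) would then touch two stores in one transaction, which is roughly
what a separate indexes table does in WebSQL.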


> 2. It's a waste of space.
>
> Using a separate object store to do manual indexing may work in theory but
> it does not work in practice. I do not think it can even be remotely
> suggested as a panacea, however temporary it may be.
>

Why? You wouldn't necessarily have to store the whole object in each index,
just the index key, a value and some pointer to the original source object.
Something to resolve this pointer to the source would need to be spec'd (a
la couchdb's include_docs), but that's simple. Even better, say it were
possible to define a link relation on an object store that can resolve to
its source object -- you could define a source link relation and the
property to use -- and this would have the added bonus of being more broadly
applicable than just linking an index record to its source instance.
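
To make the shape of such an index record concrete, here is a small in-memory
sketch, with a plain Map standing in for the source objectStore. The names
`sourceKey`, `makeIndexRecord` and `resolve` are made up for illustration and
are not anything in the spec:

```javascript
// An index record holds the index key, an optional small value, and a pointer
// (the source object's primary key), never the whole object.
function makeIndexRecord(indexKey, sourceKey, value) {
  return { key: indexKey, value: value, sourceKey: sourceKey };
}

// Resolve index records to their source objects, roughly in the spirit of
// couchdb's include_docs. `sourceStore` is a Map of primary key -> object.
function resolve(records, sourceStore) {
  return records.map(function (record) {
    return {
      key: record.key,
      value: record.value,
      doc: sourceStore.get(record.sourceKey)
    };
  });
}

var emails = new Map([[1, { id: 1, subject: "IndexedDB use-cases" }]]);
var records = [makeIndexRecord("indexeddb", 1, null)];
var rows = resolve(records, emails);
```

The space cost per index entry is then the index key plus the pointer, not a
copy of the object.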


>
> We can fix all of this right now very simply:
>
> 1. Enable objectStore.put and objectStore.delete to accept a setIndexes
> option and an unsetIndexes option. The value passed for either option would
> be an array (string list) of index references.
>

This would only work for indexes over arrays of strings, right? Things can get
much more complicated than that, and when they do you'd have to use an
objectStore to do your indexing anyway, right?
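
For example, a compound index over a scalar property and an array-valued
property needs one index key per combination. A rough sketch of the key
derivation a library might do (`compoundKeys` is a hypothetical helper, and
representing a compound key as an array is an assumption):

```javascript
// Derive compound index keys for `obj` over the given property names.
// Array-valued properties contribute one key per element (cartesian product);
// objects missing one of the properties produce no keys.
function compoundKeys(obj, props) {
  var keys = [[]];
  props.forEach(function (prop) {
    var raw = obj[prop];
    var values = raw === undefined ? [] : (Array.isArray(raw) ? raw : [raw]);
    var next = [];
    keys.forEach(function (prefix) {
      values.forEach(function (v) {
        next.push(prefix.concat([v]));
      });
    });
    keys = next;
  });
  return keys;
}

var email = { folder: "inbox", labels: ["w3c", "idb"] };
var keys = compoundKeys(email, ["folder", "labels"]);
// keys is [["inbox", "w3c"], ["inbox", "idb"]]
```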


>
> 2. The object would first be removed as a member from any indexes
> referenced by the unsetIndexes option. Any referenced indexes which would be
> empty thereafter would be removed.
>
> 3. The object would then be added as a member to any indexes referenced by
> the setIndexes option. Any referenced indexes which do not yet exist would
> be created.
>
> This would provide the much-needed indexing capabilities presently lacking
> in IDB without sacrificing performance.
>

Why would this be more performant, even in theory, than using raw objectStores?


>
> It would also enable developers to use IDB statefully (MySQL-like
> pre-defined schemas with the DB taking on the complexities of schema
> migration and data migration) or statelessly (See Berkeley DB with the
> application responsible for the complexities of data maintenance) rather
> than enforcing an assumption at such an early stage.
>

I don't necessarily understand the stateful vs. stateless distinction here.
I don't see how your proposed solution removes the requirement for IDB to
enforce constraints when certain indexes are present. Developers would
already be able to use IDB statefully (with predefined schemas) -- they'd
just use a library that has a schema mechanism. I doubt such a library for
IDB already exists, but it'd be quite easy to port perstore, for instance,
which is derived from the IDB API and already has this functionality using
json-schema. There will no doubt be many ORM-like libraries that will pop up
as soon as IDB starts to stabilize (or as soon as it gets a node.js
implementation).
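
To give a feel for how thin such a layer could be, here is a minimal sketch of
a schema check wrapped around a store's put. The schema format (property name
mapped to an expected typeof string) and the `validate` and `withSchema` names
are invented for illustration; this is not perstore's API and not json-schema:

```javascript
// Hypothetical schema format: property name -> expected typeof string.
// Returns the names of the properties that do not match.
function validate(schema, obj) {
  return Object.keys(schema).filter(function (prop) {
    return typeof obj[prop] !== schema[prop];
  });
}

// Wrap anything with a put(value, key) method (e.g. an IDBObjectStore) so that
// objects violating the schema are rejected before they reach the store.
function withSchema(store, schema) {
  return {
    put: function (value, key) {
      var bad = validate(schema, value);
      if (bad.length > 0) {
        throw new TypeError("schema violation on: " + bad.join(", "));
      }
      return store.put(value, key);
    }
  };
}
```

The store underneath stays schema-free; whether a given application is
"stateful" in this sense becomes the library's decision.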

ISTM giving library authors the freedom and flexibility to control their own
indexes would be a huge win. They already have much of what they need for
this (though there are still a few gaps) but complicating the indexing
without actually solving the problems would only serve to hamper users. If
it's easy to implement, great, but I'm still left wondering why maintaining
your own indexes is so slow -- this seems like *the* use case for IDB to
really nail.

Received on Tuesday, 8 March 2011 05:24:24 UTC