Re: [IndexedDB] Two Real World Use-Cases from Joran Greef on 2011-03-08 (public-webapps@w3.org from January to March 2011)

From: Joran Greef <joran@ronomon.com>
Date: Tue, 8 Mar 2011 08:33:11 +0200
To: Dean Landolt <dean@deanlandolt.com>
Cc: public-webapps@w3.org
Message-Id: <127FAA93-324A-4DAC-8457-E2B4E960FBAF@ronomon.com>
On 08 Mar 2011, at 7:23 AM, Dean Landolt wrote:

> This doesn't seem right. Assuming your WebSQL implementation had all the same indexes isn't it doing pretty much the same things as using separate objectStores in IDB? Why would it be an order of magnitude slower? I'm sure whatever implementation you're using hasn't seen much optimization but you seem to be implying there's something more fundamental? The only thing I can think of to blame would be the fat in the objectStore interface -- like, for instance, the index building facilities. It seems to me your proposed solution is to add yet more fat to the interface (more complex indexing), but wouldn't it be just as suitable to instead strip down objectStores to their bare essentials to make them more suitable to act as indexes? Then the indexing functionality and all the hard decisions could be punted to libraries where they'd be free to innovate.

Exactly. It's not what one would expect, and it is an indication of the poor state of the IDB implementation (which is essentially a wrapper around SQLite anyway).

If someone is advising that object stores be used to handle indexes then may I be the first to raise a red flag and say that IDB is failing us (and it would have been better for the spec team to provide a locking mechanism for LocalStorage so it could be used in that way). The whole point of IDB as far as I can see is to provide transactional indexed access to a key value store.

> Why? You wouldn't necessarily have to store the whole object in each index, just the index key, a value and some pointer to the original source object. Something to resolve this pointer to the source would need to be spec'd (a la couchdb's include_docs), but that's simple. Even better, say it were possible to define a link relation on an object store that can resolve to its source object -- you could define a source link relation and the property to use -- and this would have the added bonus of being more broadly applicable than just linking an index record to its source instance.

Think of the object creation and JSON serialization/deserialization overhead of putting to 50 indexes, and you already have more than enough waste right there.

>> We can fix all of this right now very simply:
>> 
>> 1. Enable objectStore.put and objectStore.delete to accept a setIndexes option and an unsetIndexes option. The value passed for either option would be an array (string list) of index references.
> 
> This would only work for indexes that are arrays of strings, right? Things can get much more complicated than that, and when they do you'd have to use an objectStore to do your indexing anyway, right?

No, it would work for pretty much anything. The application would be free to determine the indexes, and also to convert query parameters into indexes when querying. It's essentially "computed indexes" without the hassles of IDB trying to do it (there was an interesting thread last year on the challenges of storing an index-computing function in IDB).
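To make the idea concrete, here is a minimal in-memory sketch of what application-computed indexes could look like. It is not the IndexedDB API: SketchStore, PutOptions, getByIndex, contactIndexes and tagQuery are hypothetical names invented for illustration, and only the setIndexes/unsetIndexes option names come from the proposal quoted above.

```typescript
// Hypothetical in-memory model of the proposed options; NOT the real IndexedDB API.
type IndexRef = string;

interface PutOptions {
  setIndexes?: IndexRef[];   // index references to add the record to
  unsetIndexes?: IndexRef[]; // index references to remove the record from
}

class SketchStore<T> {
  private records = new Map<string, T>();
  private indexes = new Map<IndexRef, Set<string>>();

  put(key: string, value: T, options: PutOptions = {}): void {
    this.records.set(key, value);
    (options.unsetIndexes || []).forEach((ref) => {
      const keys = this.indexes.get(ref);
      if (keys) {
        keys.delete(key);
        if (keys.size === 0) this.indexes.delete(ref);
      }
    });
    (options.setIndexes || []).forEach((ref) => {
      let keys = this.indexes.get(ref);
      if (!keys) {
        keys = new Set<string>();
        this.indexes.set(ref, keys);
      }
      keys.add(key);
    });
  }

  getByIndex(ref: IndexRef): T[] {
    const out: T[] = [];
    const keys = this.indexes.get(ref);
    if (keys) {
      keys.forEach((k) => {
        const v = this.records.get(k);
        if (v !== undefined) out.push(v);
      });
    }
    return out;
  }
}

interface Contact { name: string; email: string; tags: string[]; }

// The application, not the store, decides which index references a record gets.
function contactIndexes(c: Contact): IndexRef[] {
  return [`email:${c.email.toLowerCase()}`].concat(
    c.tags.map((t) => `tag:${t.toLowerCase()}`)
  );
}

// The same convention converts a query parameter into an index reference.
function tagQuery(tag: string): IndexRef {
  return `tag:${tag.toLowerCase()}`;
}

const store = new SketchStore<Contact>();
const ada: Contact = { name: "Ada", email: "Ada@example.com", tags: ["IDB", "storage"] };
store.put("c1", ada, { setIndexes: contactIndexes(ada) });
const hits = store.getByIndex(tagQuery("idb"));
```

The store never needs to know how a reference was derived, so normalisation (lower-casing here) and any more elaborate keys stay in application code.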

> Why is it more theoretically performant than using objectStores in the raw?

It's a more direct interface. Think about it for a second. Using objectStores in the raw introduces O(n) overhead in the number of indexes, with multiple function calls per object, to give just one reason. If IDB can receive a list of indexes to add and remove an object to and from, then it can also do things like perform a set difference first to save unnecessary IO. I have written a database or two with this technique and it's certainly faster.
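The set-difference step can be sketched as follows. This is only an illustration under assumed names (diffIndexes and the sample index references are hypothetical), showing how an update would touch just the index entries that actually changed.

```typescript
// Hypothetical helper: work out the minimal index changes for an update.
function diffIndexes(
  previous: string[],
  next: string[]
): { toSet: string[]; toUnset: string[] } {
  const prev = new Set<string>(previous);
  const nxt = new Set<string>(next);
  return {
    // References that are new in `next` (duplicates dropped).
    toSet: next.filter((ref, i) => !prev.has(ref) && next.indexOf(ref) === i),
    // References that were in `previous` but are gone from `next`.
    toUnset: previous.filter((ref, i) => !nxt.has(ref) && previous.indexOf(ref) === i),
  };
}

const before = ["tag:draft", "tag:idb", "owner:dean"];
const after = ["tag:idb", "owner:dean", "tag:published"];
const delta = diffIndexes(before, after);
// delta.toSet is ["tag:published"] and delta.toUnset is ["tag:draft"];
// the two unchanged references cost no index IO at all.
```

With one store per index maintained by hand, the application would have to do this bookkeeping itself and still issue a separate call per changed index.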

> I don't necessarily understand the stateful vs. stateless distinction here. I don't see how your proposed solution removes the requirement for IDB to enforce constraints when certain indexes are present. Developers would already be able to use IDB statefully (with predefined schemas) -- they'd just use a library that has a schema mechanism. I doubt such a library for IDB already exists, but it'd be quite easy to port perstore, for instance, which is derived from the IDB API and already has this functionality using json-schema. There will no doubt be many ORM-like libraries that will pop up as soon as IDB starts to stabilize (or as soon as it gets a node.js implementation).

The trouble is you always think a database would "be quite easy" until you actually try to do it yourself. At first when I dug into IDB I didn't think there would be any problems that could not be handled in some way. I have actually switched back to WebSQL now and will encourage my users to use Safari or Chrome as long as these browsers support WebSQL (and I hope Chrome will at least finish up by adding a quota interface for WebSQL). IDB right now is like a completely neutered slower SQLite without any of the benefits to be expected of a transactional indexed KV store. It's really sad.

For examples of stateless databases see the interfaces for Redis (the best example, and a perfect target for IDB), Berkeley, Tokyo. For a stateful database see MySQL (and read this by Bret Taylor on the subject http://bret.appspot.com/entry/how-friendfeed-uses-mysql). I can understand how IDB just inherited this idea of pre-defined indexes from SQL. But I think it's an assumption that must be challenged given the complexity it involves and the greater power, flexibility, and simplicity to be had from a stateless database.

> ISTM giving library authors the freedom and flexibility to control their own indexes would be a huge win. They already have much of what they need for this (though there are still a few gaps) but complicating the indexing without actually solving the problems would only serve to hamper users. If it's easy to implement, great, but I'm still left wondering why maintaining your own indexes is so slow -- this seems like the use case for IDB to really nail.

I think we both want the same thing. Making IDB stateless is the best step towards providing something flexible that library authors can work on top of. But this does not appear to be the current goal of IDB, which wants to try and tackle things like application state, computing indexes, migrations, the whole shebang (all of which seems to be becoming more and more the jurisdiction of the application), instead of directly addressing the original goal of providing a transactional indexed key value store. IDB is about as high-level as any low-level API could be right now.
Received on Tuesday, 8 March 2011 06:33:51 UTC