- From: Dean Landolt <dean@deanlandolt.com>
- Date: Tue, 8 Mar 2011 00:23:51 -0500
- To: Joran Greef <joran@ronomon.com>
- Cc: public-webapps@w3.org
- Message-ID: <AANLkTi=CUt9C8WGwKVD8hL1MrYfn1uUVqE-MT4QC+VC9@mail.gmail.com>
On Thu, Mar 3, 2011 at 4:15 AM, Joran Greef <joran@ronomon.com> wrote:

> Hi Jonas
>
> I have been trying out your suggestion of using a separate object store to
> do manual indexing (and so support compound indexes or index object
> properties with arrays as values).
>
> There are some problems with this approach:
>
> 1. It's far too slow. To put an object and insert 50 index records (typical
> when updating an inverted index) this way takes 100ms using IDB versus 10ms
> using WebSQL (with a separate indexes table and compound primary key on
> index name and object key). For instance, my application has a real
> requirement to replicate 4,000,000 emails between client and server and I
> would not be prepared to accept latencies of 100ms to store each object.
> That's more than the network latency.

This doesn't seem right. Assuming your WebSQL implementation had all the same indexes isn't it doing pretty much the same things as using separate objectStores in IDB? Why would it be an order of magnitude slower? I'm sure whatever implementation you're using hasn't seen much optimization but you seem to be implying there's something more fundamental?

The only thing I can think of to blame would be the fat in the objectStore interface -- like, for instance, the index building facilities. It seems to me your proposed solution is to add yet more fat to the interface (more complex indexing), but wouldn't it be just as suitable to instead strip down objectStores to their bare essentials to make them more suitable to act as indexes? Then the indexing functionality and all the hard decisions could be punted to libraries where they'd be free to innovate. This would simplify the spec greatly by lopping things like keyranges and schema enforcement out completely, not to mention the entire index cursor API that is similar to but slightly different than objectStore cursors.
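To make "objectStores acting as indexes" concrete, here is a minimal sketch of manual indexing over bare key/value stores. It assumes an in-memory stand-in rather than real IDB: `BareStore`, `putEmail`, `findByTag` and the key separator are all hypothetical names for illustration, not part of any spec. It also ignores stale index records left behind when an object's tags change.

```typescript
// Hypothetical in-memory stand-in for a stripped-down objectStore.
// Nothing here is IndexedDB API; it only models put/get/prefix-scan.
class BareStore<V> {
  private entries = new Map<string, V>();

  put(key: string, value: V): void {
    this.entries.set(key, value);
  }

  get(key: string): V | undefined {
    return this.entries.get(key);
  }

  // Return the values whose key starts with `prefix`, in key order.
  scanPrefix(prefix: string): V[] {
    return Array.from(this.entries.keys())
      .filter((k) => k.startsWith(prefix))
      .sort()
      .map((k) => this.entries.get(k) as V);
  }
}

interface Email {
  id: string;
  subject: string;
  tags: string[];
}

// Separator between the indexed value and the object key in a compound index key.
const SEP = "\u0000";

const emails = new BareStore<Email>();
// The index store holds only the source object's key, not a copy of the object.
const tagIndex = new BareStore<string>();

// Store the object, then write one index record per tag.
function putEmail(email: Email): void {
  emails.put(email.id, email);
  for (const tag of email.tags) {
    tagIndex.put(tag + SEP + email.id, email.id);
  }
}

// Look up index records for an exact tag and resolve each to its source object.
function findByTag(tag: string): Email[] {
  return tagIndex
    .scanPrefix(tag + SEP)
    .map((id) => emails.get(id))
    .filter((e): e is Email => e !== undefined);
}

putEmail({ id: "m1", subject: "Hello", tags: ["inbox", "w3c"] });
putEmail({ id: "m2", subject: "Re: Hello", tags: ["inbox"] });
```

With this layout an inverted index is just more `put` calls against a second store, which is why I'd expect the cost to be comparable to a separate indexes table in WebSQL.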
And yes, in combination with some fundamental relational operations (as you already proposed), and some stats, you could easily implement a relational db. And I don't see anything wrong with that. You could implement all kinds of databases -- and isn't that the point of this API?

Is there any real limitation that I'm missing to making objectStores suitable as indexes -- performance or otherwise? Namespace collisions come to mind (because of all the new objectStores that would have to be managed) but conventions seem to have served the world of SQL just fine for this.

> 2. It's a waste of space.
>
> Using a separate object store to do manual indexing may work in theory but
> it does not work in practice. I do not think it can even be remotely
> suggested as a panacea, however temporary it may be.

Why? You wouldn't necessarily have to store the whole object in each index, just the index key, a value and some pointer to the original source object. Something to resolve this pointer to the source would need to be spec'd (a la CouchDB's include_docs), but that's simple. Even better, say it were possible to define a link relation on an object store that can resolve to its source object -- you could define a source link relation and the property to use -- and this would have the added bonus of being more broadly applicable than just linking an index record to its source instance.

> We can fix all of this right now very simply:
>
> 1. Enable objectStore.put and objectStore.delete to accept a setIndexes
> option and an unsetIndexes option. The value passed for either option would
> be an array (string list) of index references.

This would only work for indexes on arrays of strings, right? Things can get much more complicated than that, and when they do you'd have to use an objectStore to do your indexing anyway, right?

> 2. The object would first be removed as a member from any indexes
> referenced by the unsetIndexes option.
> Any referenced indexes which would be
> empty thereafter would be removed.
>
> 3. The object would then be added as a member to any indexes referenced by
> the setIndexes option. Any referenced indexes which do not yet exist would
> be created.
>
> This would provide the much-needed indexing capabilities presently lacking
> in IDB without sacrificing performance.

Why is it more theoretically performant than using objectStores in the raw?

> It would also enable developers to use IDB statefully (MySQL-like
> pre-defined schemas with the DB taking on the complexities of schema
> migration and data migration) or statelessly (See Berkeley DB with the
> application responsible for the complexities of data maintenance) rather
> than enforcing an assumption at such an early stage.

I don't necessarily understand the stateful vs. stateless distinction here. I don't see how your proposed solution removes the requirement for IDB to enforce constraints when certain indexes are present. Developers would already be able to use IDB statefully (with predefined schemas) -- they'd just use a library that has a schema mechanism. I doubt such a library for IDB already exists, but it'd be quite easy to port perstore, for instance, which is derived from the IDB API and already has this functionality using json-schema. There will no doubt be many ORM-like libraries that will pop up as soon as IDB starts to stabilize (or as soon as it gets a Node.js implementation).

ISTM giving library authors the freedom and flexibility to control their own indexes would be a huge win. They already have much of what they need for this (though there are still a few gaps) but complicating the indexing without actually solving the problems would only serve to hamper users. If it's easy to implement, great, but I'm still left wondering why maintaining your own indexes is so slow -- this seems like *the* use case for IDB to really nail.
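For reference, the semantics in steps 1-3 of the quoted proposal are small enough to model in library code. The following is only a sketch under assumptions: `IndexMap`, `PutOptions` and `applyIndexOptions` are hypothetical names, the indexes live in memory, and nothing here is an existing or proposed IDB signature beyond the `setIndexes` and `unsetIndexes` option names taken from the quote.

```typescript
// Hypothetical model of the quoted proposal's semantics; not an IndexedDB API.
// Each index reference maps to the set of object keys that are its members.
type IndexMap = Map<string, Set<string>>;

interface PutOptions {
  setIndexes?: string[];   // index references to add the object to
  unsetIndexes?: string[]; // index references to remove the object from
}

function applyIndexOptions(
  indexes: IndexMap,
  objectKey: string,
  options: PutOptions
): void {
  // Step 2: remove the object from each unset index, dropping indexes left empty.
  for (const ref of options.unsetIndexes || []) {
    const members = indexes.get(ref);
    if (members === undefined) continue;
    members.delete(objectKey);
    if (members.size === 0) indexes.delete(ref);
  }

  // Step 3: add the object to each set index, creating indexes that do not exist.
  for (const ref of options.setIndexes || []) {
    let members = indexes.get(ref);
    if (members === undefined) {
      members = new Set<string>();
      indexes.set(ref, members);
    }
    members.add(objectKey);
  }
}

const indexes: IndexMap = new Map();
applyIndexOptions(indexes, "m1", { setIndexes: ["tag:inbox", "from:jonas"] });
applyIndexOptions(indexes, "m1", {
  unsetIndexes: ["tag:inbox"],
  setIndexes: ["tag:archive"],
});
```

After the second call, "tag:inbox" no longer exists (it became empty), while "from:jonas" and "tag:archive" each contain "m1". Whether a library doing this over raw objectStores can be made fast enough is exactly the open question above.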
Received on Tuesday, 8 March 2011 05:24:24 UTC