Re: IndexedDB and RegEx search

From: Alec Flett <alecflett@google.com>
Date: Thu, 9 Aug 2012 11:01:12 -0700
Message-ID: <CAHWpXeaRGDuzMHBnGA8AaaLhU1R7=2adyQUtx=M0_G8d-JNYmQ@mail.gmail.com>
To: Robin Berjon <robin@berjon.com>
Cc: Jonas Sicking <jonas@sicking.cc>, Yuval Sadan <sadan.yuval@gmail.com>, Michael Brooks <firealwaysworks@gmail.com>, public-webapps@w3.org
> > This is somewhat similar to [1] and something we decided was
> > out-of-scope for v1. But for v2 I definitely think we should look at
> > mechanisms for using JS code to filter/sort/index data in such a way
> > that the JS code is run on the IO thread.
> >
> > [1] https://www.w3.org/Bugs/Public/show_bug.cgi?id=10000
>
> There's a lot of excellent prior art in CouchDB for what you're describing
> in that bug (or at least parts thereof). I think it's well worth looking at.
>
>
If I understand CouchDB correctly, much like the suggestion elsewhere in
this thread, CouchDB's views are really indexing primitives that support
callbacks - the callbacks are (more or less) run at most once per document
as the document is stored (or soon after) - rather than every time a cursor
is created/iterated. This means that cursor iteration can still be very
fast.

I could be wrong, but I theorize that MOST of the use cases for filters are
more or less static/stateless, and that if you want to iterate once using a
specific stateless callback/filter, then you're probably going to want to
iterate it again, many times. That particular use case just begs for an
index. Meaning, you probably have code something like:

objectStore.openCursor(function(value) {
    return value.foo > value.bar;
}).onsuccess = ...

This could be done with a callback-based index:

objectStore.createIndex("foobigger", function(value) {
    return value.foo > value.bar;
});
objectStore.index("foobigger").openCursor(IDBKeyRange.only(true));
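
To make the cost argument concrete, here is a minimal in-memory sketch of the idea, not real IndexedDB. SketchStore, its methods, and callbackRuns are hypothetical names invented for illustration. The point is only that the callback runs once per stored value, while repeated lookups reuse the computed keys. (As an aside, booleans are not valid IndexedDB keys today, so a real version would probably have the callback return something like 0/1.)

```javascript
// Hypothetical stand-in for an object store with callback-based indexes.
// Nothing here is part of the IndexedDB API.
class SketchStore {
  constructor() {
    this.records = new Map();  // primary key -> stored value
    this.indexes = new Map();  // index name -> { keyFn, keyOf }
    this.callbackRuns = 0;     // how many times any index callback has run
  }

  createIndex(name, keyFn) {
    const index = { keyFn, keyOf: new Map() }; // keyOf: primary key -> index key
    this.indexes.set(name, index);
    // Index whatever is already stored.
    for (const [primaryKey, value] of this.records) {
      this._compute(index, primaryKey, value);
    }
  }

  put(primaryKey, value) {
    this.records.set(primaryKey, value);
    // The callback runs here, at write time, once per index.
    for (const index of this.indexes.values()) {
      this._compute(index, primaryKey, value);
    }
  }

  _compute(index, primaryKey, value) {
    this.callbackRuns += 1;
    index.keyOf.set(primaryKey, index.keyFn(value));
  }

  // Rough analogue of openCursor(IDBKeyRange.only(wanted)). It scans the
  // precomputed keys linearly; a real index would use a sorted structure.
  only(name, wanted) {
    const result = [];
    for (const [primaryKey, key] of this.indexes.get(name).keyOf) {
      if (key === wanted) result.push(this.records.get(primaryKey));
    }
    return result;
  }
}

const store = new SketchStore();
store.createIndex('foobigger', (value) => value.foo > value.bar);
store.put(1, { foo: 5, bar: 2 });
store.put(2, { foo: 1, bar: 9 });
store.put(3, { foo: 7, bar: 7 });

// Two "iterations" over the same filter; neither re-runs the callback.
const matches = store.only('foobigger', true);
const matchesAgain = store.only('foobigger', true);
```

After the three puts, callbackRuns stays at 3 no matter how many times the index is queried, which is the property that keeps cursor iteration cheap.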

The next use case is for some kind of semi-static cursor, where the
function isn't stateless, but it's parameterized by another value:

var maxDifference = calculateMaxDifference();
objectStore.openCursor(function(value) {
    return (value.foo - value.bar) < maxDifference;
}).onsuccess = ...;

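This too can be implemented/expressed with a callback-based index (say, one
named "difference" whose callback returns value.foo - value.bar), such that
the check for "< maxDifference" becomes a range query. The second argument
to upperBound makes the bound exclusive, matching "<":

objectStore.index("difference").openCursor(IDBKeyRange.upperBound(maxDifference, true))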
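
Here is a rough sketch of why the parameterized case reduces to a range query: the computed difference is stored in sorted order, so any later maxDifference is just a cut point found by binary search. SortedSketchIndex and its methods are hypothetical names, not IndexedDB API. The open flag mirrors the optional second argument of IDBKeyRange.upperBound, which is inclusive by default.

```javascript
// Hypothetical sorted index over a computed key. Not part of IndexedDB.
class SortedSketchIndex {
  constructor(keyFn) {
    this.keyFn = keyFn;
    this.entries = []; // { key, value } objects, ascending by key
  }

  // Binary search for the number of entries whose key is inside the bound:
  // key < bound when open is true, key <= bound otherwise.
  _cut(bound, open) {
    let lo = 0;
    let hi = this.entries.length;
    while (lo < hi) {
      const mid = (lo + hi) >> 1;
      const key = this.entries[mid].key;
      const inside = open ? key < bound : key <= bound;
      if (inside) lo = mid + 1; else hi = mid;
    }
    return lo;
  }

  // The callback runs once here, when the value is added.
  add(value) {
    const key = this.keyFn(value);
    this.entries.splice(this._cut(key, false), 0, { key, value });
  }

  // Rough analogue of openCursor(IDBKeyRange.upperBound(bound, open)).
  upperBound(bound, open = false) {
    return this.entries.slice(0, this._cut(bound, open)).map((e) => e.value);
  }
}

const difference = new SortedSketchIndex((value) => value.foo - value.bar);
difference.add({ foo: 10, bar: 7 }); // difference 3
difference.add({ foo: 4, bar: 4 });  // difference 0
difference.add({ foo: 9, bar: 1 });  // difference 8
difference.add({ foo: 6, bar: 1 });  // difference 5

const maxDifference = 5;
const below = difference.upperBound(maxDifference, true); // difference < 5
const atMost = difference.upperBound(maxDifference);      // difference <= 5
```

The parameter can change on every query without touching the stored keys, which is what separates this case from the truly stateful one.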

The final case I see is something where the callback really is stateful:

objectStore.openCursor(function(value) {
    return model.validate(value);
}).onsuccess = ...;

This assumes model is fairly dynamic and well out of scope of indexing (i.e.
validation can't be expressed on some linear scale that can be
range-queried with IDBKeyRange).

This is a MUCH harder problem that has all sorts of security issues that
would need to be thought through... but the other use cases could still be
addressed by indexes.

I think part of the overall problem is that it's really rather cumbersome
to create/remove indexes in IndexedDB - you need to change the database
version to trigger a versionchange event, etc... it would be much nicer if
there were ways to dynamically create them on the fly, or add them as
needed. This has been brought up here in other contexts...

I wonder if in IndexedDB v2 we could support creating indexes on the fly.
I think IndexedDB is trying too hard to enforce some kind of schema
versioning that is tied to indexes. That handles a very strict use case of
lock-step schema changes, but I'm not sure everyone really needs that. I
think that's a burden we should leave to consumers of the API.

I'd much rather be able to say, in any transaction:

if (!objectStore.indexNames.contains('myindex')) {
    objectStore.createIndex('myindex',....);
}

 Anyway, that's fodder for another thread :)

Alec


 --
> Robin Berjon - http://berjon.com/ - @robinberjon
>
>
Received on Thursday, 9 August 2012 18:02:05 GMT