Re: IndexedDB and RegEx search

From: Jonas Sicking <jonas@sicking.cc>
Date: Wed, 8 Aug 2012 16:39:40 -0700
Message-ID: <CA+c2ei9yibKvDRhhQpuHUCXpJ4SECxviCOtwFzyPr=VcXEBwkA@mail.gmail.com>
To: Yuval Sadan <sadan.yuval@gmail.com>
Cc: Alec Flett <alecflett@google.com>, Michael Brooks <firealwaysworks@gmail.com>, public-webapps@w3.org
On Wed, Aug 8, 2012 at 1:33 AM, Yuval Sadan <sadan.yuval@gmail.com> wrote:
>
> On Tue, Aug 7, 2012 at 8:36 PM, Alec Flett <alecflett@google.com> wrote:
>>
>> FWIW it's fairly hard for a database to index arbitrary content for
>> regexes, to the point where it's going to be hard to do MUCH better than
>> simply filtering based on regex.
>
> Perhaps it shouldn't be a full-text *index* but simply a search feature.
> Though I'm unfamiliar with specific implementations, I gather that filtering
> records in native code would save (possibly lots of) redundant JS object
> construction (time and memory = money :)), and doing so with a pre-compiled
> regex might improve over certain JS implementations or non-optimizable
> practices, e.g.
> function search(field, s) {
>   someCallToIndexedDb(function filter(record) {
>     // The regex is recompiled for every record.
>     var re = new RegExp(s);
>     return !re.test(record[field]);
>   });
> }
>
> Plus it saves some code jumbling for a rather common practice.

The main thing you'd save is the round trip between threads for
each record. I think a more general, and more interesting, feature
would be the ability to iterate an index or objectStore using a
cursor and, when constructing the cursor, provide a JavaScript
function that is used to filter the data. Unfortunately JavaScript
doesn't have a good way of executing a function without pulling in
a lot of context, but it's possible to hack around this, for example
by passing a string which contains the JavaScript code.
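
To make the string-passing idea concrete, here is a rough sketch. It does
not use any real IndexedDB API: compileFilter and filteredCursor are
hypothetical names, and a plain array stands in for an objectStore. It only
shows that a filter built from source text with the Function constructor is
created in the global scope, so it carries none of the caller's local
context along with it.

```javascript
// Hypothetical helper: turn filter source text into a function.
// Functions made with `new Function` are created in the global scope,
// so they do not close over the caller's local variables.
function compileFilter(source) {
  return new Function("record", source);
}

// Stand-in for a cursor walk over an objectStore (here just an array).
// Only records accepted by the filter are yielded. In a real engine this
// loop is the part that could run on the database side.
function* filteredCursor(records, filterSource) {
  const accept = compileFilter(filterSource);
  for (const record of records) {
    if (accept(record)) {
      yield record;
    }
  }
}

// Sample data, made up for illustration.
const store = [
  { id: 1, title: "IndexedDB basics" },
  { id: 2, title: "Regex search" },
  { id: 3, title: "Cursor iteration" },
];

// The filter travels as a string, so it can be handed to another thread.
const filterSource = "return /regex|cursor/i.test(record.title);";
const matches = [...filteredCursor(store, filterSource)];
console.log(matches.map((r) => r.id));
```

Since the filter is only text, anything it needs (such as the pattern) has
to be written into the string itself or passed in as an explicit argument.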

This is somewhat similar to [1] and something we decided was
out-of-scope for v1. But for v2 I definitely think we should look at
mechanisms for using JS code to filter/sort/index data in such a way
that the JS code is run on the IO thread.

[1] https://www.w3.org/Bugs/Public/show_bug.cgi?id=10000

/ Jonas
Received on Wednesday, 8 August 2012 23:40:39 GMT