Re: IndexedDB and RegEx search from Alec Flett on 2012-08-07 (public-webapps@w3.org from July to September 2012)

From: Alec Flett <alecflett@google.com>
Date: Tue, 7 Aug 2012 10:36:19 -0700
To: Michael Brooks <firealwaysworks@gmail.com>
Cc: public-webapps@w3.org
Message-ID: <CAHWpXeZCFA49YFjqvMJ=yDRBxJycq3NyopmLdc0qRRjmXyemsw@mail.gmail.com>

FWIW it's fairly hard to for a database to index arbitrary content for
regexes, to the point where it's going to be hard to do MUCH better than
simply filtering based on regex. Out of curiosity, would the "free text
search" feature on that wiki page that Arthur provided meet your needs?
it's more along the lines of what SQL 'LIKE' provides.

On Tue, Jul 31, 2012 at 11:17 AM, Michael Brooks
<firealwaysworks@gmail.com>wrote:

> I like IndexedDB and non-relational databases.  One feature that is very
> useful is the ability to search by regular expression:
>
> http://www.mongodb.org/display/DOCS/Advanced+Queries#AdvancedQueries-RegularExpressions
>
>
from that page:

An index on the field queried by regexp can increase performance
significantly, as follows:

   - Simple prefix queries (also called rooted regexps) like /^prefix/ will
   make efficient use of the index (much like most SQL databases that use
   indexes for a LIKE 'prefix%' expression). This only works if the
   expression is left-rooted and the i (case-insensitivity) flag is not
   used.

   - All other queries will not make an efficient use of the index: all
   values in the index will be scanned and tested against the regular
   expression.

 While /^a/, /^a.*/, and /^a.*$/ are equivalent, they will have different
performance characteristics. The latter two will be slower as they have to
scan the whole string. The first format can stop scanning after the prefix
is matched.

> By not having this feature,  I can't port my application to IndexedDB from
> WebSQL because it relies upon LIKE statements for text search.  Text search
> could easily be replaced by a RegEx search,  however this feature is
> unfortunately not apart of IndexedDB.  It seems like this feature was just
> forgotten when writing the spec of IndexedDB.
>
>
One way to do this is to create a reverse index on the contents of the
table that you would normally use 'like' for - this is basically what a
freetext search would do. Create an objectStore on the side that maps
tokens (i.e. what "LIKE" would tokenize, words) to keys in another table.
This is the classic NoSQL way to solve this, and it's what SQL does under
the hood. (But I agree it would be nice for IndexedDB to just do this for
you!)

Alec

Received on Tuesday, 7 August 2012 17:37:11 UTC