[IndexedDB] Full text indexing

Hi all.

Disclaimer: last time I posted to this mailing list someone correctly
pointed out that I'd not read the spec properly. Apologies if I've
done the same again. I'm very enthusiastic about the whole offline web
app thing, and as this is a public forum I thought I may as well fire
off a couple of questions. They might be daft, so I apologise in
advance if so.

Onwards.

At present I can't see any reference to full text indexing via the
IndexedDB API. Is this something that is specifically out-of-scope, or
is it included in the way indexes work but not explicitly stated?

I can think of a couple of scenarios in which this would be useful,
all of which fall under the heading of "offline search". A few examples
follow:

-- Reference material:
If users can take parts of a website offline, at some point someone
will want to search that data. If I build an offline application which
takes a stack of reference material offline, I'd also like to build a
database containing text from those pages. I can then use full-text
search to retrieve URLs of the offline pages and direct the user to
them.

-- Emails
Let's say a user takes their mailbox offline. Now they want to search
it for a particular phrase or subject. What feature of IndexedDB would
we expect developers to leverage to implement this?

While full-text search would be possible with a regular index of
single keywords, this approach isn't as elegant as full-text indexing:

  * Searching for multiple keywords would probably need a second or third
query plus a join, which would be slow
  * Initially populating the database with individual keywords would
require the user to download a lot of data, whereas populating a
full-text index with a sentence would be more efficient (in some
(most?) scenarios).
  * A full-text index could expose more advanced functionality such as
searching for quoted terms, and other conditional operators (see the
Gears implementation of full-text search
[http://code.google.com/apis/gears/api_database.html - bottom of the
page]).
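
To make the keyword-index approach concrete, here is a rough sketch of
what I mean. It uses an in-memory Map as a stand-in for an IndexedDB
index, and every name in it (tokenize, KeywordIndex, add, search) is
made up for illustration rather than taken from the spec. The point is
only to show the one-lookup-per-keyword plus intersection ("join")
pattern described in the first bullet.

```typescript
// Hypothetical in-memory stand-in for an object store with a keyword
// index. None of these names come from the IndexedDB spec.
type DocId = string;

// Split text into lowercase keywords (a deliberately naive tokenizer).
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((word) => word.length > 0);
}

class KeywordIndex {
  // keyword -> set of ids of the documents that contain it
  private postings = new Map<string, Set<DocId>>();

  // Index a document: every keyword becomes its own index entry.
  add(id: DocId, text: string): void {
    for (const word of tokenize(text)) {
      let ids = this.postings.get(word);
      if (!ids) {
        ids = new Set<DocId>();
        this.postings.set(word, ids);
      }
      ids.add(id);
    }
  }

  // One lookup per keyword, then an intersection of the results.
  // This is the "second/third query + join" step.
  search(query: string): DocId[] {
    let result: DocId[] | null = null;
    for (const word of tokenize(query)) {
      const ids = this.postings.get(word);
      if (!ids) return []; // a keyword with no entries rules out every document
      result = result === null
        ? Array.from(ids)
        : result.filter((id) => ids.has(id));
    }
    return result === null ? [] : result.sort();
  }
}

const mailbox = new KeywordIndex();
mailbox.add("msg-1", "Offline web apps");
mailbox.add("msg-2", "Offline search for web mail");
mailbox.add("msg-3", "Mail search");
console.log(mailbox.search("offline web"));
```

Note that this only answers "contains all of these words". Quoted
phrases or other operators would need extra data (word positions, for
instance), which is part of why a built-in full-text index appeals.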

Unless this has been considered already, might I suggest either
extending KeyRange to include a "Match" property, or perhaps introducing
a level of abstraction to KeyRanges along the lines of:

IRange (internal)
  - bool IncludeInResult( itemInIndex );

KeyRange : inherits IRange
  // properties as per spec

TextRange : inherits IRange
  - DOMString Match

I'm sure you can think of something more appropriate, but that explains
what I'd like to accomplish.
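
For what it's worth, the outline above might look roughly like this if
written out. This is only a sketch under my own assumptions:
SimpleKeyRange is a cut-down stand-in for the spec's KeyRange (string
keys, lower and upper bounds only), TextRange does a plain
case-insensitive substring test, and filterIndex is a made-up helper
that plays the part of the index walking its entries.

```typescript
// Sketch of the proposed abstraction. IRange and TextRange mirror the
// names used in this message; they are not part of the IndexedDB spec.
interface IRange<T> {
  includeInResult(itemInIndex: T): boolean;
}

// Simplified, hypothetical stand-in for KeyRange: bounds on string keys.
class SimpleKeyRange implements IRange<string> {
  private lower: string;
  private upper: string;
  private lowerOpen: boolean;
  private upperOpen: boolean;

  constructor(lower: string, upper: string, lowerOpen = false, upperOpen = false) {
    this.lower = lower;
    this.upper = upper;
    this.lowerOpen = lowerOpen;
    this.upperOpen = upperOpen;
  }

  includeInResult(key: string): boolean {
    const aboveLower = this.lowerOpen ? key > this.lower : key >= this.lower;
    const belowUpper = this.upperOpen ? key < this.upper : key <= this.upper;
    return aboveLower && belowUpper;
  }
}

// Text matching: here just a case-insensitive substring test.
class TextRange implements IRange<string> {
  private match: string;

  constructor(match: string) {
    this.match = match.toLowerCase();
  }

  includeInResult(text: string): boolean {
    return text.toLowerCase().indexOf(this.match) !== -1;
  }
}

// Hypothetical helper: the index only needs to know about IRange.
function filterIndex<T>(items: T[], range: IRange<T>): T[] {
  return items.filter((item) => range.includeInResult(item));
}

const subjects = ["Offline search", "Inbox", "Search results"];
console.log(filterIndex(subjects, new TextRange("search")));
console.log(filterIndex(["apple", "banana", "cherry"], new SimpleKeyRange("b", "c")));
```

The idea is that whatever walks the index only calls includeInResult,
so a text-matching range could be added without changing how existing
key ranges behave.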

Is this something that can already be achieved via the IndexedDB spec?
If not, could it be included without too much effort?

Appreciate all your hard work.

Nathan

Received on Wednesday, 21 July 2010 11:08:35 UTC