
[IndexedDB] Posting lists/inverted indexes

From: Nikunj Mehta <nikunj@o-micron.com>
Date: Thu, 17 Jun 2010 07:56:16 -0700
Message-Id: <CFE336B5-5A69-45AE-8520-8846BD868D09@o-micron.com>
To: public-webapps WG <public-webapps@w3.org>

I would like to confirm the requirements for posting list and inverted index support in IndexedDB. To that end, here is a short list ordered by importance. Please let me know if I have missed anything important.

1. Store sorted runs of terms and their occurrences in documents along with a payload.
   a. Each occurrence is identified by a numeric value.
   b. The payload is an opaque string value.
2. Look up a term to obtain its occurrences.
   a. A lookup produces a cursor, each value of which is the ID of a document where the term occurs, together with the corresponding payload.
   b. Full power of cursors as available in IndexedDB is present, i.e., KeyRange and direction.
3. An inverted index could be linked to an object store, in which case, it is possible to look up objects using the inverted index.
4. When an object is removed from the object store linked to an inverted index, no automatic change management applies to the inverted index. In other words, the inverted index is application-managed.
5. Find co-occurrence of terms.
   a. This would bring back the join feature that was present in earlier versions of the spec [1], although in a different API form than earlier.
6. Store a lexicon for IDF-type statistics.
   a. Term-level statistics.
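
To make the list above concrete, here is a minimal in-memory sketch in TypeScript. It is not a proposed IndexedDB API: the names InvertedIndex, Posting, add, lookup, cooccur and documentFrequency are all hypothetical, occurrences are assumed to be numeric document IDs, and the range is assumed to be inclusive at both ends. It only illustrates sorted runs with payloads (1), range and direction on lookup (2), co-occurrence as an intersection of sorted runs (5), and a per-term document count of the kind an IDF-type statistic would start from (6).

```typescript
// Hypothetical sketch only; none of these names come from the IndexedDB spec.
type Posting = { docId: number; payload: string };

// Intersect two ascending lists of document IDs in a single linear pass.
function intersectSorted(a: number[], b: number[]): number[] {
  const out: number[] = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      out.push(a[i]);
      i++;
      j++;
    } else if (a[i] < b[j]) {
      i++;
    } else {
      j++;
    }
  }
  return out;
}

class InvertedIndex {
  // term -> postings kept sorted by ascending docId (requirement 1)
  private postings = new Map<string, Posting[]>();

  add(term: string, docId: number, payload: string): void {
    const list = this.postings.get(term) ?? [];
    // Binary search for the insertion point so the run stays sorted.
    let lo = 0;
    let hi = list.length;
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (list[mid].docId < docId) lo = mid + 1;
      else hi = mid;
    }
    if (lo < list.length && list[lo].docId === docId) {
      list[lo] = { docId, payload }; // same term and document: replace the payload
    } else {
      list.splice(lo, 0, { docId, payload });
    }
    this.postings.set(term, list);
  }

  // Requirement 2: postings for a term, restricted to an inclusive docId
  // range (assumption) and returned in ascending or descending order.
  lookup(term: string, lower = -Infinity, upper = Infinity, reverse = false): Posting[] {
    const hits = (this.postings.get(term) ?? []).filter(
      (p) => p.docId >= lower && p.docId <= upper
    );
    return reverse ? hits.reverse() : hits;
  }

  // Requirement 5: IDs of the documents in which every given term occurs.
  cooccur(...terms: string[]): number[] {
    if (terms.length === 0) return [];
    const runs = terms.map((t) => (this.postings.get(t) ?? []).map((p) => p.docId));
    return runs.reduce((acc, ids) => intersectSorted(acc, ids));
  }

  // Requirement 6: number of documents containing the term.
  documentFrequency(term: string): number {
    return this.postings.get(term)?.length ?? 0;
  }
}
```

In this sketch nothing ties the index to an object store, which matches point 4: removing a document elsewhere leaves its postings in place until the application deletes them itself.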

I am not sure whether there is any point in specifying performance and efficiency goals in the spec.

Nikunj

[1] http://www.w3.org/TR/2009/WD-WebSimpleDB-20090929/#entity-join
Received on Thursday, 17 June 2010 14:56:55 GMT
