Re: [IndexedDB] Posting lists/inverted indexes

Some quick comments.  I'll point some of our FTS experts at this as well.

On Thu, Jun 17, 2010 at 7:56 AM, Nikunj Mehta <> wrote:

> I would like to confirm the requirements for posting list and inverted
> index support in IndexedDB. To that end, here is a short list ordered by
> importance. Please let me know if I have missed anything important.
> 1. Store sorted runs of terms and their occurrences in documents along with
> a payload.
>   a. Each occurrence is identified as some numeric value.
>   b. The payload is an opaque string value.
> 2. Look up a term to obtain its occurrences.
>   a. Lookup produces a cursor, each value of which is the document ID
> where the term occurs and the corresponding payload.
>   b. Full power of cursors as available in IndexedDB is present, i.e.,
> KeyRange and direction.
> 3. An inverted index could be linked to an object store, in which case, it
> is possible to look up objects using the inverted index.
> 4. When an object is removed from the object store linked to an inverted
> index, no automatic change management applies to the inverted index. In
> other words, the inverted index is application-managed.

I'm still not sure I agree with having application-managed indexes in the
spec at all (see other threads).
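
To check that I'm reading items 1, 2 and 4 the way they were intended, here
is a rough in-memory model in Python. It is not the IndexedDB API and not a
proposal for one; the class and method names (InvertedIndex, add, remove,
lookup) and the lower/upper/reverse parameters are placeholders I made up to
stand in for KeyRange and cursor direction.

```python
import bisect
from collections import defaultdict


class InvertedIndex:
    """Hypothetical in-memory model of the requirements; not the IndexedDB API."""

    def __init__(self):
        # term -> run of (doc_id, payload) tuples kept sorted by doc_id
        self._postings = defaultdict(list)

    def add(self, term, doc_id, payload=""):
        # Item 1: numeric occurrence ID plus an opaque string payload.
        bisect.insort(self._postings[term], (doc_id, payload))

    def remove(self, term, doc_id):
        # Item 4: only the application ever calls this.
        run = self._postings.get(term, [])
        self._postings[term] = [p for p in run if p[0] != doc_id]

    def lookup(self, term, lower=None, upper=None, reverse=False):
        # Item 2: a cursor-like iterator with an inclusive range and a direction.
        run = self._postings.get(term, [])
        hits = [(d, p) for d, p in run
                if (lower is None or d >= lower)
                and (upper is None or d <= upper)]
        return iter(reversed(hits) if reverse else hits)


# A linked "object store", modelled as a plain dict for illustration.
store = {1: "red apple", 2: "green apple", 3: "red pear"}
index = InvertedIndex()
for doc_id, text in store.items():
    for position, term in enumerate(text.split()):
        index.add(term, doc_id, payload=str(position))

# Deleting the object leaves the posting for document 2 in place until the
# application calls index.remove("apple", 2) and index.remove("green", 2).
del store[2]
```

In this model a lookup of "apple" still yields document 2 after the delete,
which is the stale-posting case I'm worried about.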

> 5. Find co-occurrence of terms.
>   a. This would bring back the join feature that was present in earlier
> versions of the spec [1], although in a different API form than earlier.

Would it be practical to use inverted indexes without a join feature?  We
should probably try to be consistent.
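
For reference, without a join the application would have to do something like
the following itself. This is only a sketch in Python of a merge join over two
posting lists sorted by document ID; the function name co_occurrences is
hypothetical, and it assumes at most one posting per document in each list.

```python
def co_occurrences(run_a, run_b):
    """Merge-join two posting lists of (doc_id, payload), each sorted by doc_id.

    Assumes at most one posting per document in each list.
    """
    i = j = 0
    matches = []
    while i < len(run_a) and j < len(run_b):
        doc_a, doc_b = run_a[i][0], run_b[j][0]
        if doc_a == doc_b:
            # Both terms occur in this document; keep both payloads.
            matches.append((doc_a, run_a[i][1], run_b[j][1]))
            i += 1
            j += 1
        elif doc_a < doc_b:
            i += 1  # advance whichever cursor is behind
        else:
            j += 1
    return matches


red = [(1, "0"), (3, "0")]
apple = [(1, "1"), (2, "1")]
both = co_occurrences(red, apple)
```

Each list is walked once, so the cost is linear in the combined length of the
two lists, but it does need both cursors open at the same time.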

> 6. Store a lexicon for IDF-type statistics.
>   a. Term-level statistics.
>
> I am not sure if there is any point in specifying performance and
> efficiency goals in the spec.
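
On item 6, my understanding is that the lexicon would need at least a
per-term document count. A small Python sketch of that reading, using one
common form of IDF, log(N / df); the names build_lexicon and idf are
hypothetical, and returning 0.0 for an unseen term is just a choice made for
this example.

```python
import math
from collections import Counter


def build_lexicon(docs):
    """Map each term to the number of documents that contain it."""
    lexicon = Counter()
    for text in docs.values():
        # set() so a term repeated within one document is counted once
        lexicon.update(set(text.split()))
    return lexicon


def idf(term, lexicon, total_docs):
    """Inverse document frequency as log(N / df); 0.0 for unseen terms."""
    df = lexicon.get(term, 0)
    return math.log(total_docs / df) if df else 0.0


docs = {1: "red apple", 2: "green apple", 3: "red pear"}
lexicon = build_lexicon(docs)
pear_idf = idf("pear", lexicon, len(docs))    # rarer term, higher value
apple_idf = idf("apple", lexicon, len(docs))  # more common term, lower value
```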


> Nikunj
> [1]

Received on Thursday, 17 June 2010 16:50:28 UTC