
Re: Starting work on Indexed DB v2 spec - feedback wanted

From: Tim Caswell <tim@creationix.com>
Date: Thu, 17 Apr 2014 16:47:42 -0500
Message-ID: <CAGkHjAXdTb6jKaz3LMMo9zpi2yZj-AiRkm2P7FDAEXN+5U6NUw@mail.gmail.com>
To: Domenic Denicola <domenic@domenicdenicola.com>
Cc: Joshua Bell <jsbell@google.com>, "public-webapps@w3.org" <public-webapps@w3.org>, Ali Alabbas <alia@microsoft.com>

For my personal use case, I don't need much at all.  I need the ability to
store binary data by key and retrieve binary data by key.  It would be nice
if there were a quick existence check to see if a hash is already in the db,
since I'm using content-addressable structures.  But again, that's only for
performance reasons.  If it weren't there, I would just do a read and
ignore the body.
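For what it's worth, a cheap existence check can already be approximated on top of IndexedDB.  A minimal sketch, assuming a hypothetical object store named 'blobs' keyed by hash string, and using IDBObjectStore.getKey() (part of the v2 work), which looks up a key without reading the value:

```javascript
// Hypothetical sketch: has() over IndexedDB.  Store name 'blobs' is made up.
// getKey() resolves the matching key (or undefined) without reading the body,
// which is roughly the cheap existence check described above.
function has(db, hash) {
  return new Promise((resolve, reject) => {
    const req = db.transaction('blobs')   // read-only transaction
      .objectStore('blobs')
      .getKey(hash);                      // key lookup only, no value read
    req.onsuccess = () => resolve(req.result !== undefined);
    req.onerror = () => reject(req.error);
  });
}
```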

For the mutable half of a git db, I need basic string keys with string
values and some iteration over subsets.  For example, I want to find all
refs that start with "refs/heads/" to get the list of branches in a repo.
But this metadata is tiny in comparison to everything else.  A large git db
may have hundreds of megabytes of git objects and fewer than 1000 refs for
all the branches, tags, pull-requests, etc.  The value of each ref is a
string, usually 41 bytes long.

If the db didn't provide sorted, iterable keys, I would just store all the
metadata in a single object and replace it whenever I needed to change any
part of it.  Updates to this data are infrequent in my case.
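A sketch of that fallback, assuming hypothetical async load/save string primitives (thin wrappers over whatever store is available); all refs live in one JSON blob that gets rewritten wholesale on each update:

```javascript
// Hypothetical fallback: keep every ref in a single JSON value and replace
// the whole value on each (infrequent) update.  `load` and `save` are
// assumed async string key/value primitives, not a real API.
async function updateRef(load, save, name, hash) {
  const refs = JSON.parse((await load('refs')) || '{}'); // read the whole map
  refs[name] = hash;                                     // change one entry
  await save('refs', JSON.stringify(refs));              // write it all back
}
```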

Ideally I'd have two databases.  One for immutable binary values with
fixed-length 20-byte binary (or 40-byte string) keys.  This one wouldn't
need to be atomic at all.
It would have:

  get(hash) => binary
  set(hash, binary)
  has(hash) => boolean
  del(hash)
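That four-method surface maps fairly directly onto IndexedDB today.  A hedged sketch (the store name 'objects' is made up, error handling is minimal, and getKey() is from the v2 work), with each method returning a promise:

```javascript
// Hypothetical wrapper giving the immutable store's get/set/has/del API
// on top of a single IndexedDB object store named 'objects'.
function objectStore(db) {
  // Run one operation in its own transaction and promisify the request.
  const run = (mode, op) => new Promise((resolve, reject) => {
    const req = op(db.transaction('objects', mode).objectStore('objects'));
    req.onsuccess = () => resolve(req.result);
    req.onerror = () => reject(req.error);
  });
  return {
    get: (hash) => run('readonly', (s) => s.get(hash)),            // => binary
    set: (hash, binary) => run('readwrite', (s) => s.put(binary, hash)),
    has: (hash) =>
      run('readonly', (s) => s.getKey(hash)).then((k) => k !== undefined),
    del: (hash) => run('readwrite', (s) => s.delete(hash)),
  };
}
```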

And one for the mutable metadata where the keys are strings and the values
are strings.

  get(key) => string
  set(key, string)
  getRange(prefix) => object of key/value pairs that match the prefix
  del(key)
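The getRange() piece is the only part IndexedDB doesn't hand you directly, but for string keys a prefix scan can be approximated with a key range: every key starting with `prefix` sorts between the prefix itself and prefix + '\uffff'.  A sketch, assuming a hypothetical store named 'refs' (and ignoring keys that themselves contain U+FFFF):

```javascript
// Hypothetical prefix scan over IndexedDB string keys.
// Keys with the given prefix sort in [prefix, prefix + '\uffff'].
function prefixBounds(prefix) {
  return [prefix, prefix + '\uffff'];         // [lower, upper] for the scan
}

function getRange(db, prefix) {
  const [lower, upper] = prefixBounds(prefix);
  return new Promise((resolve, reject) => {
    const result = {};
    const req = db.transaction('refs')
      .objectStore('refs')
      .openCursor(IDBKeyRange.bound(lower, upper));
    req.onsuccess = () => {
      const cursor = req.result;
      if (!cursor) return resolve(result);    // range exhausted
      result[cursor.key] = cursor.value;      // collect key/value pair
      cursor.continue();
    };
    req.onerror = () => reject(req.error);
  });
}
```

So listing branches would be getRange(db, 'refs/heads/').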

This would all be async using ES6 promises or node-style callbacks or
whatever.

The reason I suggested LevelDB is because it's a nice medium between my
minimal needs and what's needed to implement a more complex database in
pure JS.


On Thu, Apr 17, 2014 at 4:04 PM, Domenic Denicola <
domenic@domenicdenicola.com> wrote:

>  *From:* Joshua Bell <jsbell@google.com>
>
>    > How much of leveldb's API you consider part of the minimum set -
> write batches? iterators? snapshots? custom comparators? multiple instances
> per application? And are IDB-style keys / serialized script values
> appropriate, or is that extra overhead over e.g. just strings?
>
> This is my question for Tim as well. My personal hope has always been for
> something along the lines of async local storage [1], but that omits a lot
> of LevelDB's API and complexity, so presumably it goes too far for LevelDB
> proponents.
>
> [1]: https://github.com/slightlyoff/async-local-storage
>
Received on Thursday, 17 April 2014 21:48:16 UTC
