Re: Points of order on this WG

On Jun 26, 2009, at 6:07 PM, Maciej Stachowiak wrote:

>
> On Jun 26, 2009, at 3:33 PM, Nikunj R. Mehta wrote:
>
>> I have a tutorial available to understand how one can use Berkeley  
>> DB to store data with multiple fields [1]. If you are only  
>> interested in understanding how to do look up by one or more of  
>> them, please skip to slide 51.
>>
>> If this doesn't help, I can write up another explanation for the  
>> issues that are outstanding.
>
> It sounds like the answer is to make multiple tables with additional  
> tables allowing secondary keys to map to the master key. Did I  
> understand that correctly? (I'm not sure I got the right idea from  
> the pictures).

That is correct. Any field that you want to use for fast lookup will
need an index. In Berkeley DB, an index is a secondary database, i.e.,
one whose entries are updated atomically with the values stored in the
primary database. If a record can be looked up by multiple fields,
then each of those fields would have its own secondary database.
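
As a rough illustration of keeping an index in step with the main
data, here is a minimal in-memory sketch. It is not Berkeley DB code:
the class IndexedStore and its methods are hypothetical names, and a
plain Map stands in for each database.

```javascript
// Hypothetical in-memory stand-in: a primary map plus one secondary index.
class IndexedStore {
  constructor(keyCreator) {
    this.primary = new Map();   // primary key -> record
    this.secondary = new Map(); // secondary key -> Set of primary keys
    this.keyCreator = keyCreator;
  }

  // Update the primary record and its index entry in one step.
  put(key, record) {
    this.delete(key); // drop any stale index entry first
    this.primary.set(key, record);
    const secKey = this.keyCreator(record);
    if (!this.secondary.has(secKey)) this.secondary.set(secKey, new Set());
    this.secondary.get(secKey).add(key);
  }

  // Remove the record and its index entry together.
  delete(key) {
    if (!this.primary.has(key)) return false;
    const secKey = this.keyCreator(this.primary.get(key));
    const keys = this.secondary.get(secKey);
    keys.delete(key);
    if (keys.size === 0) this.secondary.delete(secKey);
    this.primary.delete(key);
    return true;
  }

  // Look up records by the secondary key.
  lookup(secKey) {
    const keys = this.secondary.get(secKey) || new Set();
    return [...keys].map((k) => this.primary.get(k));
  }
}

const people = new IndexedStore((record) => record.city);
people.put("p1", { name: "Ann", city: "Oslo" });
people.put("p2", { name: "Bob", city: "Oslo" });
people.put("p1", { name: "Ann", city: "Rome" }); // index follows the update
```

After the last call, a lookup by "Oslo" finds only Bob, because the
old index entry for p1 was removed in the same put.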

>
> Can you clarify how a Berkeley DB style API would differ from
> LocalStorage in interface or capabilities? What would it be able to  
> do that LocalStorage can't?

There are two styles we could use for this API: a lower-level B-tree
API, or a higher-level object persistence API built on top of the
B-tree API. The former would be powerful but tedious for JavaScript
developers who want to manage lots of different key fields in
objects. However, here I am focusing solely on the lower-level API.

There are several differences between LocalStorage and such an API:

1. LocalStorage doesn't allow key searching
2. LocalStorage doesn't allow duplicate values for a key
3. LocalStorage doesn't allow lookup by one or more parts of the value
4. LocalStorage doesn't support transactions

To see details of the difference, let's start with looking up an item,  
by exact key match or by key prefix match:

value = database.get(key)
key_value = database.search(key_prefix)

Here key_value would be of the form:

{ key: "some key value", value: some_object_or_null }
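
To make the two lookups concrete, here is a minimal in-memory sketch.
The class SketchDatabase is a hypothetical stand-in, and it assumes
that search returns the first entry in key order whose key starts with
the prefix, or null when nothing matches.

```javascript
// Hypothetical in-memory stand-in for the proposed database object.
class SketchDatabase {
  constructor() {
    this.entries = []; // [{ key, value }], kept sorted by key
  }

  // Index of the first entry whose key is >= the given key (binary search).
  lowerBound(key) {
    let lo = 0;
    let hi = this.entries.length;
    while (lo < hi) {
      const mid = (lo + hi) >> 1;
      if (this.entries[mid].key < key) lo = mid + 1;
      else hi = mid;
    }
    return lo;
  }

  put(key, value) {
    const i = this.lowerBound(key);
    if (i < this.entries.length && this.entries[i].key === key) {
      this.entries[i].value = value; // overwrite the existing key
    } else {
      this.entries.splice(i, 0, { key, value });
    }
  }

  // Exact match: returns the value, or null when the key is absent.
  get(key) {
    const i = this.lowerBound(key);
    const hit = i < this.entries.length && this.entries[i].key === key;
    return hit ? this.entries[i].value : null;
  }

  // Prefix match: returns the first { key, value } whose key starts with
  // the prefix, or null when there is none.
  search(prefix) {
    const i = this.lowerBound(prefix);
    const hit =
      i < this.entries.length && this.entries[i].key.startsWith(prefix);
    return hit ? { ...this.entries[i] } : null;
  }
}

const db = new SketchDatabase();
db.put("apple", 1);
db.put("apricot", 2);
db.put("banana", 3);
const exact = db.get("banana");
const byPrefix = db.search("ap");
const missing = db.search("c");
```

Because the keys are kept sorted, a prefix search costs no more than
an exact lookup, which is the property a B-tree provides on disk.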

Alternatively, if I want to retrieve multiple items, I would obtain a
Cursor:

cursor = database.getCursor()

Once the cursor is available, I can position it at a starting point
in one of several ways:

key_value = cursor.searchKey(key)
key_value = cursor.searchKeyRange(key_prefix)
key_value = cursor.searchBoth(key, value)
key_value = cursor.searchBothRange(key, value_prefix)

I can step through the cursor with the following set of mechanisms:

key_value = cursor.getPrev()
key_value = cursor.getNext()
key_value = cursor.getFirst()
key_value = cursor.getLast()
key_value = cursor.getCurrent()
key_value = cursor.getNextDup()
key_value = cursor.getNextNoDup()
key_value = cursor.getPrevDup()
key_value = cursor.getPrevNoDup()
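
The difference between the plain, Dup, and NoDup steps is easiest to
see on a small example. The sketch below is a hypothetical in-memory
cursor, not the proposed implementation. It assumes that records are
sorted by key with duplicates adjacent, and that a step which cannot
move returns null and leaves the position unchanged.

```javascript
// Hypothetical cursor over an array of { key, value } records that is
// already sorted by key, with duplicate keys adjacent to each other.
class SketchCursor {
  constructor(records) {
    this.records = records;
    this.pos = -1; // not yet positioned
  }

  // Move to index i if it is in range; otherwise stay put and return null.
  at(i) {
    if (i < 0 || i >= this.records.length) return null;
    this.pos = i;
    return this.records[i];
  }

  getFirst() { return this.at(0); }
  getLast() { return this.at(this.records.length - 1); }
  getCurrent() { return this.pos < 0 ? null : this.records[this.pos]; }
  getNext() { return this.at(this.pos + 1); }
  getPrev() { return this.pos < 0 ? null : this.at(this.pos - 1); }

  // Step forward only if the next record has the same key as the current one.
  getNextDup() {
    const cur = this.getCurrent();
    const next = this.records[this.pos + 1];
    return cur && next && next.key === cur.key ? this.at(this.pos + 1) : null;
  }

  // Skip the remaining duplicates and move to the next distinct key.
  getNextNoDup() {
    const cur = this.getCurrent();
    if (!cur) return this.getFirst();
    let i = this.pos + 1;
    while (i < this.records.length && this.records[i].key === cur.key) i++;
    return this.at(i);
  }
}

const cursor = new SketchCursor([
  { key: "a", value: 1 },
  { key: "a", value: 2 },
  { key: "b", value: 3 },
]);
const first = cursor.getFirst();
const dup = cursor.getNextDup();
const noMoreDups = cursor.getNextDup();
const nextKey = cursor.getNextNoDup();
```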

A fast count of the records in the database can be obtained via

count = cursor.count

Multiple cursors can be joined to AND multiple search criteria using a
JoinCursor:

cursor = database.join(cursors)

Or I can obtain a sorted cursor as:

cursor = database.join(cursors, true)

Only two operations are allowed on a JoinCursor:

key_value = cursor.getNext()
key = cursor.getNextKey()
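
Conceptually, a join of this kind is an intersection of the primary
keys matched by each secondary lookup. The sketch below shows only
that idea on plain arrays. The function joinKeys and its sorted flag
are hypothetical and merely mirror the shape of the calls above.

```javascript
// Hypothetical join: each input is the list of primary keys matched by one
// secondary-index lookup; the join keeps keys present in every list.
function joinKeys(keyLists, sorted = false) {
  if (keyLists.length === 0) return [];
  const [first, ...rest] = keyLists;
  const restSets = rest.map((keys) => new Set(keys));
  const joined = first.filter((key) => restSets.every((s) => s.has(key)));
  // The default Array sort orders strings lexicographically.
  return sorted ? [...joined].sort() : joined;
}

const inOslo = ["p7", "p5", "p2"];    // e.g. records whose city is "Oslo"
const engineers = ["p5", "p9", "p2"]; // e.g. records whose job is "engineer"
const unsortedResult = joinKeys([inOslo, engineers]);
const sortedResult = joinKeys([inOslo, engineers], true);
```

Without the sorted flag, the result follows the order of the first
input; with it, the matching keys come back in key order.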

Mutation operations can be performed much as in LocalStorage:

status = database.put(key, value)
status = database.delete(key)

Additionally, I can control how duplicates are handled:

status = database.putNoDupData(key, value)
status = database.putNoOverwrite(key, value)
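
A small in-memory sketch of how these variants could differ. It
assumes the usual Berkeley DB meaning: putNoOverwrite refuses to store
when the key already exists, and putNoDupData refuses only when the
exact key/value pair already exists. The class DupStore and the status
strings are hypothetical.

```javascript
// Hypothetical status codes and in-memory store; a key maps to a list of
// values so that duplicates are possible.
const OK = "OK";
const KEY_EXISTS = "KEY_EXISTS";

class DupStore {
  constructor() {
    this.data = new Map(); // key -> array of values
  }

  // Always stores the pair, adding a duplicate if the key already exists.
  put(key, value) {
    if (!this.data.has(key)) this.data.set(key, []);
    this.data.get(key).push(value);
    return OK;
  }

  // Stores the pair only if the key is not present at all.
  putNoOverwrite(key, value) {
    if (this.data.has(key)) return KEY_EXISTS;
    return this.put(key, value);
  }

  // Stores the pair only if this exact key/value pair is not present.
  putNoDupData(key, value) {
    const values = this.data.get(key) || [];
    if (values.includes(value)) return KEY_EXISTS;
    return this.put(key, value);
  }
}

const colors = new DupStore();
const s1 = colors.put("color", "red");
const s2 = colors.putNoOverwrite("color", "blue");
const s3 = colors.putNoDupData("color", "red");
const s4 = colors.putNoDupData("color", "blue");
```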

Databases also provide sequences for generating monotonically
increasing values:

sequence = database.getSequence()
sequence = database.getSequence(name)

A sequence can generate values by:

value = sequence.get(delta)
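
A minimal sketch of one plausible meaning for delta, assuming that
get returns the current value and then advances the sequence by delta,
so the caller reserves a block of delta values. The class
SketchSequence is hypothetical.

```javascript
// Hypothetical sequence: get(delta) returns the current value and then
// advances the sequence by delta.
class SketchSequence {
  constructor(initial = 0) {
    this.next = initial;
  }

  get(delta = 1) {
    if (!Number.isInteger(delta) || delta <= 0) {
      throw new RangeError("delta must be a positive integer");
    }
    const value = this.next;
    this.next += delta;
    return value;
  }
}

const ids = new SketchSequence();
const a = ids.get(1);  // reserves a single value
const b = ids.get(10); // reserves a block of ten values starting at b
const c = ids.get(1);  // starts after the reserved block
```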

A database can be obtained using a DOM mechanism such as:

environment = window.storageEnvironment(name)
database = environment.database(name)

Optionally, one or more supported options may be provided on this
call. For example:

database = environment.database(name, { read_only: false, sorted_dups: true })

A secondary database can also be created from an environment as:

secondary = environment.secondaryDatabase(name, primary, keyCreator)

The keyCreator itself is a JS function of the following kind:

function secKeyCreator(secondary, key, data) {
    return result;
}
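
For instance, a key creator that indexes records by an email field
might look like the sketch below. The field name, the lower-casing,
and the convention that returning null means "no index entry for this
record" are all assumptions made for illustration.

```javascript
// Hypothetical key creator: index records by their "email" field.
// Returning null is assumed here to mean "do not index this record".
function emailKeyCreator(secondary, key, data) {
  if (data === null || typeof data !== "object") return null;
  return typeof data.email === "string" ? data.email.toLowerCase() : null;
}

const indexed = emailKeyCreator(null, "p1", { email: "Ann@Example.org" });
const skipped = emailKeyCreator(null, "p2", { name: "Bob" });
```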

More options can also be specified when creating a secondary database:

secondary = environment.secondaryDatabase(name, primary, keyCreator, {fk_delete_action: 0, null_value: null})

A transaction may be used in conjunction with certain operations:

value = database.get(key, txn)
sequence = database.getSequence(txn)
value = sequence.get(delta, txn)
status = database.put(key, value, txn)

To obtain a transaction, we start from the environment:

txn = environment.beginTransaction()

or, given a parent transaction,

txn = environment.beginTransaction(parent)

Once the work in a transaction is done, it can be ended in one of two ways:

txn.abort()
txn.commit()
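
The intended effect of commit and abort can be sketched in memory as
follows. This is a hypothetical stand-in that folds the environment
and a single database into one object (SketchStore). It buffers writes
until commit and does no locking or conflict detection.

```javascript
// Hypothetical stand-in: writes made under a transaction are buffered and
// only reach the committed map when the transaction commits.
class SketchStore {
  constructor() {
    this.committed = new Map();
  }

  beginTransaction() {
    const store = this;
    return {
      writes: new Map(), // pending key -> value
      active: true,
      commit() {
        if (!this.active) throw new Error("transaction already finished");
        for (const [k, v] of this.writes) store.committed.set(k, v);
        this.active = false;
      },
      abort() {
        if (!this.active) throw new Error("transaction already finished");
        this.writes.clear(); // discard everything that was pending
        this.active = false;
      },
    };
  }

  put(key, value, txn) {
    if (txn && !txn.active) throw new Error("transaction already finished");
    if (txn) txn.writes.set(key, value);
    else this.committed.set(key, value);
    return "OK";
  }

  // A transactional read sees that transaction's own pending writes first.
  get(key, txn) {
    if (txn && txn.writes.has(key)) return txn.writes.get(key);
    return this.committed.has(key) ? this.committed.get(key) : null;
  }
}

const store = new SketchStore();
const t1 = store.beginTransaction();
store.put("balance", 100, t1);
const insideT1 = store.get("balance", t1); // pending write, visible in t1
const outsideT1 = store.get("balance");    // not committed yet
t1.commit();
const afterCommit = store.get("balance");

const t2 = store.beginTransaction();
store.put("balance", 0, t2);
t2.abort();
const afterAbort = store.get("balance");   // the aborted write is gone
```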

We may have to explore an asynchronous callback for this API, since
the operation can take time.

Certain operations above would also support optional lock modes. So, I  
could also have

value = database.get(key, txn, lock_mode)

cursor = database.getCursor(txn)
cursor = database.getCursor(txn, lock_mode)

Each of the cursor positioning and step calls could take a lock mode. For example,

key_value = cursor.searchKey(key, lock_mode)

Keys and values are compared lexicographically. We can explore a more
complex API that allows other comparison orders.
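
One practical consequence of lexicographic comparison is that numeric
keys stored as strings do not sort in numeric order. The sketch below
shows the effect with JavaScript's default string sort, along with
zero-padding as one common workaround. The pad width of 5 is an
arbitrary choice for the example.

```javascript
const keys = ["9", "10", "100", "2"];

// The default Array sort compares strings lexicographically.
const lexicographic = [...keys].sort();

// Zero-padding to a fixed width makes lexicographic order match numeric order.
const padded = keys.map((k) => k.padStart(5, "0")).sort();
```

Here lexicographic is ["10", "100", "2", "9"], while padded is
["00002", "00009", "00010", "00100"].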

Received on Saturday, 27 June 2009 17:01:33 UTC