WebSimpleDB Issues from Kris Zyp on 2009-12-02 (public-webapps@w3.org from October to December 2009)

From: Kris Zyp <kris@sitepen.com>
Date: Tue, 01 Dec 2009 23:33:40 -0700
To: "public-webapps@w3.org" <public-webapps@w3.org>
Message-ID: <4B160A44.8030806@sitepen.com>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
 
I had few thoughts/questions/issues with the WebSimpleDB proposal:

* No O(log n) access to position/counts in index sequences - If you
want find all the entities that have a price less than 10, it is quite
easy (assuming there is an index on that property) with the
WebSimpleDB to access the price index, iterate through the index until
you hit 10 (or vice versa). However, if this is a large set of data,
the vast majority of applications I have built involve providing a
subset or page of data at a time to keep constant or logarithmic time
access to data, *and* an indication of how many rows/items/entities
are available. Limiting a "query" to a certain number of items is easy
enough with WebSimple, you just only iterate so far, but determining
how many items are less than 10 is no longer a logarithmic complexity
problem, but is linear with the number of items that are less 10,
because you have to iterate over all of them to count them. If a
cursor were to indicate the numeric position within an index (even if
it was an estimate, without transactional strictness), one could
easily determine the count of these type of queries in O(log n) time.
This would presumably entail B-tree nodes keeping track of their
number of children.

* Asynchronicity is not well-aligned with the expensive operations -
The asynchronous actions are starting and committing transactions. It
makes sense that committing transactions would be expensive, but why
do we make the start of a transaction asynchronous? Is there an
expectation that the a global lock will be sought on the entire DB
when the transaction is started? That certainly doesn't seem
desirable. Normally a DB would create locks as data is accessed,
correct? If anything a "get" operation would be more costly than
starting a transaction.

* Hanging EntityStores off of transactions creates unnecessary
complexity in referencing stores - A typical pattern in applications
is to provide a reference to a store to a widget that will use it.
However, with the WebSimpleDB, you can't really just hand off a
reference to an EntityStore, since each store object is
transaction-specific. You would either need to pass the name of store
to a widget, and have it generate its own transaction to get a store
(which seems like very bad form from object capability perspective),
or pass in a store for every action, which may not be viable in many
situations.

Would it be reasonable (based on the last two points) to have
getEntityStore be a method on database objects, rather than
transaction objects? Actions would just take place in the current
transaction for the context. With the single-threaded nature of JS
contexts, having a single active transaction at a time doesn't not
seem like a hardship, but rather makes things a lot simpler to work
with. Also, if an impl wanted to support auto-commit, it would be very
intuitive, devs just wouldn't be required to start a transaction prior
performing actions on a store.
Thanks,

- --
Kris Zyp
SitePen
(503) 806-1841
http://sitepen.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
 
iEYEARECAAYFAksWCkMACgkQ9VpNnHc4zAyheACfY53gDNjZ4gqud8rqCPANk+O7
oJsAoIRaYfCjQK9gwaKCejPo76OBWbbE
=Jcp0
-----END PGP SIGNATURE-----
Received on Wednesday, 2 December 2009 06:34:49 UTC