Re: [WebSimpleDB] Allowing schema operations anywhere from Nikunj R. Mehta on 2009-12-08 (public-webapps@w3.org from October to December 2009)

From: Nikunj R. Mehta <nikunj.mehta@oracle.com>
Date: Tue, 8 Dec 2009 10:14:53 -0800
To: Pablo Castro <Pablo.Castro@microsoft.com>
Cc: "public-webapps@w3.org" <public-webapps@w3.org>
Message-Id: <3F51976B-005E-4C35-8424-D0598187191F@oracle.com>

Hi Pablo,

Sorry for the long delay in responding to your comments. Hopefully, we  
can continue the discussion now.

Schema changes interact with the locking model of the database. As I  
see it, here are several ways in which the API could be designed and  
the consequences of doing so:

A. Allow schema changes inside a metadata transaction which can only  
be performed at connection time
B. Allow schema changes inside a data transaction, which can be  
performed any time a connection is open
C. Allow schema changes inside a metadata transaction, which can be  
performed any time a connection is open

Option A's disadvantages are that metadata manipulation cannot be  
combined with data changes. Moreover, version numbers are no longer  
issued by the application but rather by a user agent.

Option A's advantages are that resource acquisition is simplified and  
deadlocks can be avoided considering that a connection acquires and  
releases the metadata resource in a consistent sequence. Another  
upside is that version number maintenance is automated.

Option B's main disadvantage is that there is no real notion of  
version that can be managed by the user agent. Another is that  
deadlocks could occur because there is no a priori declaration of  
intent about metadata modification. This could be remedied by  
including the database itself in the list of objects that are intended  
to be modified in the transaction.

Option B's advantages are closer interleaving of and atomic metadata  
changes with data changes, and application controlled version numbers  
used for the database.

Option C's disadvantage is that data and metadata changes cannot be  
interleaved atomically.

Option C's advantages are that deadlocks can be avoided and version  
number management can be performed  by an application.

Overall, I think version management and metadata changes are exclusive  
in some sense. IOW, if we want Option B and Option C, then we have to  
remove the connection time version check.

Hope that helps. Please feel free to add if I missed anything.

Nikunj

On Nov 22, 2009, at 3:14 PM, Pablo Castro wrote:

> We are finding a number of reasons for wanting to create tables on  
> the fly, and without bumping up the database version. A few examples:
> - Packaged components that create side tables to maintain its own  
> state
> - Query processors often need to "spill to disk" during query  
> execution. For example, sorting large sets requires storing  
> temporary sets of rows on disk to be merged later.
>
> So we're thinking it would be better to have these methods directly  
> in the DatabaseSync/DatabaseAsync objects (with proper corresponding  
> patterns), instead of their current location in the Upgrade interface.
>
> For the common case where several schema changes need to be done  
> atomically, developers can simply wrap the calls in a transaction,  
> and they would do for regular data manipulation.
>
> We would need an extra method to bump up the version explicitly, as  
> that would no longer be in the upgrade callback.
>
> Does this seem reasonable?
>
> Regards,
> -pablo
>
>

Nikunj
http://o-micron.blogspot.com

Received on Tuesday, 8 December 2009 18:15:59 UTC