
RE: [WebSimpleDB] Allowing schema operations anywhere

From: Pablo Castro <Pablo.Castro@microsoft.com>
Date: Tue, 22 Dec 2009 23:52:37 +0000
To: "Nikunj R. Mehta" <nikunj.mehta@oracle.com>, "public-webapps@w3.org WG" <public-webapps@w3.org>
Message-ID: <F753B2C401114141B426DB383C8885E0229050FF@TK5EX14MBXC124.redmond.corp.microsoft.com>
My apologies for the late reply; I've been out for a while.

> -----Original Message-----
> From: Nikunj R. Mehta [mailto:nikunj.mehta@oracle.com]
> Sent: Friday, December 11, 2009 10:47 AM
> To: public-webapps@w3.org WG
> Cc: Pablo Castro
> Subject: Re: [WebSimpleDB] Allowing schema operations anywhere
> 
> I have gone ahead and updated the spec to allow option B (only).
> Please take a look.

Option B makes sense. Without it, a class of algorithms would be impossible or quite difficult to implement (e.g. a "sort" construct that a query language might want to support wouldn't be possible without a backing index).
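
To make the sort example concrete, here is a minimal Python sketch of a sort that spills sorted runs into temporary tables created on the fly and then merges them. It is only an in-memory illustration: ToyDatabase, create_table, drop_table and spill_sort are hypothetical names chosen for this sketch, not anything from the WebSimpleDB draft, and a real implementation would keep the runs on disk.

```python
import heapq
from typing import Dict, Iterable, Iterator, List


class ToyDatabase:
    """Hypothetical in-memory stand-in for a database (not the WebSimpleDB API)."""

    def __init__(self) -> None:
        self.version = "1.0"
        self.tables: Dict[str, List[int]] = {}

    def create_table(self, name: str) -> List[int]:
        # Schema operation allowed at any time; it never touches the version.
        if name in self.tables:
            raise ValueError(f"table {name!r} already exists")
        self.tables[name] = []
        return self.tables[name]

    def drop_table(self, name: str) -> None:
        del self.tables[name]


def spill_sort(db: ToyDatabase, rows: Iterable[int], run_size: int) -> Iterator[int]:
    """Sort rows by spilling sorted runs to temporary tables, then merging them."""
    run_names: List[str] = []
    buffer: List[int] = []

    def flush() -> None:
        # Each full buffer becomes one sorted run in its own temporary table.
        name = f"__sort_run_{len(run_names)}"
        db.create_table(name).extend(sorted(buffer))
        run_names.append(name)
        buffer.clear()

    for row in rows:
        buffer.append(row)
        if len(buffer) == run_size:
            flush()
    if buffer:
        flush()

    try:
        # heapq.merge lazily merges the already-sorted runs.
        yield from heapq.merge(*(db.tables[name] for name in run_names))
    finally:
        # The temporary tables disappear once the sort is consumed.
        for name in run_names:
            db.drop_table(name)
```

Consuming list(spill_sort(db, rows, run_size)) creates and drops the run tables as a side effect of the query, while db.version stays whatever the application last set it to.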

This certainly means versioning becomes the responsibility of the app/library and not the user agent. This makes sense to me, given that not all schema changes are really version changes (e.g. creation of a spill-to-disk table shouldn't bump up the database version).
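
As a rough illustration of what application-managed versioning could look like under option B, the Python sketch below models a transaction in which schema changes, data changes and an explicit version bump commit or roll back together. Everything in it (SketchDatabase, transaction, create_table, put, set_version) is an assumed, hypothetical name for this sketch; the draft's actual interfaces may look quite different.

```python
import copy
from contextlib import contextmanager
from typing import Dict, Iterator


class SketchDatabase:
    """Hypothetical in-memory model; names are not from the WebSimpleDB spec."""

    def __init__(self, version: str) -> None:
        self.version = version
        self.tables: Dict[str, Dict[str, object]] = {}

    @contextmanager
    def transaction(self) -> Iterator["SketchDatabase"]:
        # Snapshot schema, data and version so all three roll back together.
        saved_tables = copy.deepcopy(self.tables)
        saved_version = self.version
        try:
            yield self
        except Exception:
            self.tables = saved_tables
            self.version = saved_version
            raise

    def create_table(self, name: str) -> None:
        if name in self.tables:
            raise ValueError(f"table {name!r} already exists")
        self.tables[name] = {}

    def put(self, table: str, key: str, value: object) -> None:
        self.tables[table][key] = value  # KeyError if the table is missing

    def set_version(self, version: str) -> None:
        # The application, not the user agent, decides when the version changes.
        self.version = version


db = SketchDatabase("1.0")

# A real schema upgrade: the application bumps the version itself.
with db.transaction():
    db.create_table("contacts")
    db.put("contacts", "pablo", {"team": "data"})
    db.set_version("2.0")

# A scratch table: a schema change, but not a version change.
with db.transaction():
    db.create_table("scratch")

# A failed transaction undoes schema, data and version changes together.
try:
    with db.transaction():
        db.create_table("orders")
        db.set_version("3.0")
        db.put("missing", "k", "v")  # raises KeyError: no such table
except KeyError:
    pass
```

The point of the sketch is only that the version bump is one more operation inside the transaction, so the app or library chooses which schema changes count as version changes.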

Thanks
-pablo

> 
> Nikunj
> On Dec 8, 2009, at 10:14 AM, Nikunj R. Mehta wrote:
> 
> > Hi Pablo,
> >
> > Sorry for the long delay in responding to your comments. Hopefully, we
> > can continue the discussion now.
> >
> > Schema changes interact with the locking model of the database. As I
> > see it, here are several ways in which the API could be designed and
> > the consequences of doing so:
> >
> > A. Allow schema changes inside a metadata transaction which can only
> > be performed at connection time
> > B. Allow schema changes inside a data transaction, which can be
> > performed any time a connection is open
> > C. Allow schema changes inside a metadata transaction, which can be
> > performed any time a connection is open
> >
> > Option A's disadvantages are that metadata manipulation cannot be
> > combined with data changes. Moreover, version numbers are no longer
> > issued by the application but rather by a user agent.
> >
> > Option A's advantages are that resource acquisition is simplified and
> > deadlocks can be avoided considering that a connection acquires and
> > releases the metadata resource in a consistent sequence. Another
> > upside is that version number maintenance is automated.
> >
> > Option B's main disadvantage is that there is no real notion of
> > version that can be managed by the user agent. Another is that
> > deadlocks could occur because there is no a priori declaration of
> > intent about metadata modification. This could be remedied by
> > including the database itself in the list of objects that are intended
> > to be modified in the transaction.
> >
> > Option B's advantages are closer, atomic interleaving of metadata
> > changes with data changes, and application-controlled version numbers
> > for the database.
> >
> > Option C's disadvantage is that data and metadata changes cannot be
> > interleaved atomically.
> >
> > Option C's advantages are that deadlocks can be avoided and version
> > number management can be performed by an application.
> >
> > Overall, I think version management and metadata changes are exclusive
> > in some sense. IOW, if we want Option B and Option C, then we have to
> > remove the connection time version check.
> >
> > Hope that helps. Please feel free to add if I missed anything.
> >
> > Nikunj
> >
> > On Nov 22, 2009, at 3:14 PM, Pablo Castro wrote:
> >
> >> We are finding a number of reasons for wanting to create tables on
> >> the fly, and without bumping up the database version. A few examples:
> >> - Packaged components that create side tables to maintain their own
> >> state
> >> - Query processors often need to "spill to disk" during query
> >> execution. For example, sorting large sets requires storing temporary
> >> sets of rows on disk to be merged later.
> >>
> >> So we're thinking it would be better to have these methods directly
> >> in the DatabaseSync/DatabaseAsync objects (with proper corresponding
> >> patterns), instead of their current location in the Upgrade
> >> interface.
> >>
> >> For the common case where several schema changes need to be done
> >> atomically, developers can simply wrap the calls in a transaction,
> >> as they would do for regular data manipulation.
> >>
> >> We would need an extra method to bump up the version explicitly, as
> >> that would no longer be in the upgrade callback.
> >>
> >> Does this seem reasonable?
> >>
> >> Regards,
> >> -pablo
> >>
> >>
> >
> > Nikunj
> > http://o-micron.blogspot.com
> >
> >
> >
> >
> 
> Nikunj
> http://o-micron.blogspot.com
> 
> 
> 
Received on Tuesday, 22 December 2009 23:53:15 GMT
