- From: Nikunj R. Mehta <nikunj.mehta@oracle.com>
- Date: Fri, 24 Jul 2009 13:54:27 -0700
- To: Ian Hickson <ian@hixie.ch>
- Cc: public-webapps WG <public-webapps@w3.org>, Laxmi Narsimha Rao Oruganti <Laxmi.Oruganti@microsoft.com>
On Jul 23, 2009, at 6:24 PM, Ian Hickson wrote:

> On Thu, 16 Jul 2009, Nikunj R. Mehta wrote:
>>
>> The spec should not restrict implementations to any one level of
>> concurrency unless there are specific undesirable effects.
>>
>> Restricting the database to a single writer means that if there are
>> separate workers or background threads working to update
>> non-overlapping portions, then they have to wait for the lone
>> current writer. Implementations can certainly compete to produce the
>> level of concurrency that developers need. Specifically, I propose
>> that the following text
>>
>> [[
>> If the mode is read/write, the transaction must have an exclusive
>> write lock over the entire database. If the mode is read-only, the
>> transaction must have a shared read lock over the entire database.
>> The user agent should wait for an appropriate lock to be available.
>> ]]
>>
>> be replaced with the following text
>>
>> [[
>> Multiple read-only transactions may share the same data as long as
>> there is no transaction attempting to write the data being read. The
>> user agent must wait for transactions that are reading some data
>> before allowing a read/write transaction on the same data to
>> continue.
>> ]]
>
> Since there's no way for the author to say ahead of time which rows
> or cells the transactions are going to use, how can you do the above
> without ending up with some transactions failing unpredictably?

Database developers (whether experienced DBAs or newcomer WebApp
programmers) identify the data set they are using through the
statements they execute (within or outside transactions). It is the
database's job to find out which records are being used. The concepts
of transaction processing apply no matter the granularity of a data
item, whether it is a record, a disk block, or a whole file. There are
many kinds of failures (and yes, failures are always unpredictable) [1].
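[Editorial note: the proposed wording amounts to per-item shared/exclusive (reader-writer) locking rather than one lock over the whole database. The sketch below is purely illustrative and not from the thread; the names SharedExclusiveLock and LockTable are hypothetical. It shows how two readers can share one record while a writer proceeds on a non-overlapping record.]

```python
import threading
from collections import defaultdict

class SharedExclusiveLock:
    """Minimal reader-writer lock: many concurrent readers, one writer.
    Writers wait for active readers; readers wait for an active writer."""
    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0     # number of active read-only holders
        self._writer = False  # True while a read/write holder is active

    def acquire_read(self, timeout=None):
        with self._cond:
            ok = self._cond.wait_for(lambda: not self._writer, timeout)
            if ok:
                self._readers += 1
            return ok

    def release_read(self):
        with self._cond:
            self._readers -= 1
            self._cond.notify_all()

    def acquire_write(self, timeout=None):
        with self._cond:
            ok = self._cond.wait_for(
                lambda: not self._writer and self._readers == 0, timeout)
            if ok:
                self._writer = True
            return ok

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()

class LockTable:
    """Hypothetical per-record lock table: each transaction locks only
    the records it actually touches, not the entire database."""
    def __init__(self):
        self._locks = defaultdict(SharedExclusiveLock)

    def lock_for(self, key):
        return self._locks[key]
```

With this structure, a read/write transaction on "row2" is never delayed by readers of "row1" -- the concurrency the original mail argues the spec text should permit.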
Let's focus on failures arising from concurrency control enforcement,
which is probably the one most people worry about from a programming
perspective. In the following discussion, I use the term locking, even
though other protocols have been developed and are in use, to guarantee
serializability, i.e., correct interleaving of concurrent transactions.

A knowledgeable database programmer would read the smallest set of data
in a transaction so as to avoid locking the entire database for
concurrent operations. Moreover, this approach also minimizes
starvation, i.e., the amount of time a program must wait to obtain
permission to exclusively access data.

Transactions can fail even if locking occurs at the whole-database
level. As an example, consider these situations:

1. A read-only transaction times out because some read-write
   transaction went on for too long.
2. A read-write transaction times out because some read-only
   transaction went on for too long.
3. A read-write transaction nested inside a read-only transaction runs
   concurrently with other read-only transactions. Those read-only
   transactions do not complete before the read-write transaction's
   timeout expires, and the read-write transaction fails.

Experience has shown that there is no easy way out when dealing with
transactions, and locking at the whole-database level is no solution to
failures.

I hope that answers your question.

Nikunj
http://o-micron.blogspot.com

[1] Fundamentals of Database Systems, 4th ed., Elmasri and Navathe,
Section 17.1.4, 2003
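[Editorial note: the first timeout scenario above can be sketched minimally in Python, outside the thread itself. A single threading.Lock stands in for "an exclusive write lock over the entire database"; the sleep durations are arbitrary illustrative values. Even with whole-database locking, the read-only transaction fails with a timeout.]

```python
import threading
import time

# One lock standing in for "exclusive lock over the entire database".
db_lock = threading.Lock()

def long_read_write_transaction():
    # Scenario 1: a read/write transaction holds the whole-database
    # lock for longer than other transactions will wait.
    with db_lock:
        time.sleep(0.5)  # simulated long-running update

writer = threading.Thread(target=long_read_write_transaction)
writer.start()
time.sleep(0.1)  # let the writer grab the lock first

# A read-only transaction times out waiting for the database lock --
# a failure mode the application must still be prepared to handle.
read_succeeded = db_lock.acquire(timeout=0.1)
if read_succeeded:
    db_lock.release()
writer.join()
print("read-only transaction succeeded:", read_succeeded)
```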
Received on Friday, 24 July 2009 20:56:53 UTC