[whatwg] Gigantoredesignorrific changes to the Database API from Ian Hickson on 2007-10-24 (public-whatwg-archive@w3.org from October 2007)

From: Ian Hickson <ian@hixie.ch>
Date: Wed, 24 Oct 2007 10:56:41 +0000 (UTC)
Message-ID: <Pine.LNX.4.62.0710240823380.16360@hixie.dreamhostps.com>
Please excercise caution when replying to this e-mail, as it has been 
cross-posted to both the WHATWG and the public-html mailing lists. To 
avoid difficulties, please reply just to one or the other.

I have heavily modified the database API to address a number of 
fundamental issues people have raised.

I have not supplied a way to do a statement with a transaction in this 
version of the API, though the new API is much more able to be extended to 
support that without ending up with weird method names.

I haven't supplied a way in this version of the API to rollback or commit 
a transaction without executing the whole callback (i.e. there's no 
commit() or rollback() method), but the changes again make it much easier 
for us to extend the API later to allow this.

I also made no attempt to address the concern that a callback that did a 
lot of work (a common scenario if you consider building up a table from a 
SELECT's result set to be a lot of work) would hold up the end of a 
transaction.

More details below.

You can find the new spec at:

   http://www.whatwg.org/specs/web-apps/current-work/multipage/section-sql.html


On Tue, 16 Oct 2007, Scott Hess wrote:
> > >
> > > Whoa!  I just realized that there's another group, constraint 
> > > failures. These are statements which will sometimes succeed, 
> > > sometimes fail.  As currently spec'ed, it looks like a constraint 
> > > failure will cause rollback.  This is probably not appropriate, 
> > > constraint failures are generally expected to be benign and 
> > > detectable.  Arguing against myself, I've seen plenty of code which 
> > > just treats any errors from something which allows for a constraint 
> > > failure as a constraint failure, so maybe this also can be lumped 
> > > under a single big error code.
> >
> > Could you elaborate on this? What would code running into this 
> > situation look like? How should we expose it?
> 
> A common case is wanting to update a row, if present, or insert it, if 
> not.  You can handle this in three ways.  You can use a transaction to 
> keep things consistent:
> 
>   BEGIN;
>     SELECT COUNT(*) FROM t WHERE id = ?;
>     -- if == 0
>       INSERT INTO t VALUES (?, ...);
>     -- if == 1
>       UPDATE t SET c = ?, ... WHERE id = ?;
>   END;

With the new API:

   var db = openDatabase('test', '');
   db.transaction(function (tx) {
     tx.executeSql('SELECT COUNT(*) AS count FROM t WHERE id = ?', [id],
       function (tx, results) {
         if (results.rows[0].count == 0)
           tx.executeSql('INSERT INTO t (id, c) VALUES (?, ?)', [id, data]);
         else
           tx.executeSql('UPDATE t SET c = ? WHERE id = ?', [data, id]);
       });
   });


> This style is generally avoided, because in a server environment, you
> have four round trips from when the transaction is opened to when it's
> closed, plus whatever contention for CPU is present at both ends, so
> it is not great for concurrency.  Instead, you can just try the insert
> and rely on a unique or primary key to cause a constraint violation:
> 
>   INSERT INTO t VALUES (?, ...);
>   -- if constraint violation on id
>     UPDATE t SET c = ?, ... WHERE id = ?;

With the new API:

   var db = openDatabase('test', '');
   db.transaction(function (tx) {
     tx.executeSql('INSERT INTO t (id, c) VALUES (?, ?)', [id, data],
       function (tx, results) {},
       function (tx, error) {
         if (error.code == 6) {
           tx.executeSql('UPDATE t SET c = ? WHERE id = ?', [data, id]);
           return false;
         }
         throw error;
       });
   });


> That variant is best if you expect to usually succeed, and sometimes
> fall through to the update (for instance when inserting a new user
> record).  Otherwise, you could do:
> 
>   UPDATE t SET c = ?, ... WHERE id = ?;
>   -- if no rows affected
>     INSERT INTO t VALUES (?, ...);

With the new API:

   var db = openDatabase('test', '');
   db.transaction(function (tx) {
     tx.executeSql('UPDATE t SET c = ? WHERE id = ?', [data, id]);
       function (tx, results) {
         if (results.rowsAffected == 0)
           tx.executeSql('INSERT INTO t (id, c) VALUES (?, ?)', [id, data],
       });
   });


On Wed, 17 Oct 2007, Maciej Stachowiak wrote:
> 
> I can think of two reasons you might not want to open a transaction in a 
> particular case.
> 
> (1) Behavior - you may honestly want to start a completely independent 
> statement from the callback for another. This case seems to be handled 
> ok by closeTransaction(), though perhaps a little inconveniently if this 
> turns out to be the common case.

In the new spec you can start a new transaction just by calling 
transaction(), so that's catered for. (It doesn't start a nested 
transaction, it starts an independent one.)


> (2) Performance for single statements. We should gain implementation 
> experience to determine if, in likely implementations, it is a 
> significant performance improvement for single statements to be executed 
> without opening a transaction at all.

I have, for now, not included an API for this. Please let me know when you 
have more experience with this in the wild if such an API would help. The 
current API is easily extended to allow for this.


> (3) Performance in the face of concurrency by not holding locks as long. 
> But again closeTranscation() can cover this case if (2) is not an issue.

With the removal of closeTransaction() the spec actually makes this harder 
now. However, you can work around it with setTimeout(..., 0) if it is 
critical. It would be relatively easy to address this in future versions 
by adding a commit() method to the SQLTransaction object.


On Wed, 17 Oct 2007, Brady Eidson wrote:
>
> I imagine that *any* potential back end will have a hard time optimizing 
> the "1 statement in a transaction" case to be as fast as the statement 
> by itself.
> 
> Is this mandatory performance hit acceptable?

I don't know.

A number of other people made detailed comments on the performance 
characteristics of various implementation strategies and API designs, 
which I have elided here. It seems that the issue can be argued all three 
ways (transactions are a win, raw statements without transactions are a 
win, the difference is insignificant), and so I have mostly let other 
concerns drive the design of the API. Do let me know if you think I have 
not addressed something that you think is important.


On Wed, 17 Oct 2007, Adam Roben wrote:
> 
> 1. Should single SQL statements be wrapped in a transaction?
>
> 2. Should the SQL API support explicit transactions (and therefore a way 
> to not fall into implicit transactions)?

The new spec makes all transactions explicit, which makes #2 somewhat 
moot. There's no way to avoid #1 at the moment but we can address that in 
a future version relatively easily. As previously mentioned, I'd like to 
keep this API as thin as possible in the first version.


On Wed, 17 Oct 2007, Scott Hess wrote:
> 
> A) As things currently stand, the developer simply can't roll their own 
> transaction structure.  Passing BEGIN, COMMIT, or ROLLBACK to 
> executeSql() doesn't do anything sensible.  It's possible you could 
> somehow do something using temporary tables, but that's going to be 
> really dependent on your underlying SQL implementation's capabilities.

This is still the case.


> B) Transactions are a meaningful concept in SQL, and we're entangling 
> that concept with an implementation detail of how the API is 
> implemented.

Indeed.


> I'm gradually getting to where I don't feel strongly about B.  If you 
> don't want a transaction, you can have serial calls to executeSql(). If 
> you want serial calls that are dependent, you can call 
> closeTransaction() before making the next executeSql().  I agree that 
> the following might be more self-documenting:
> 
>   db.transaction(function () {
>     db.executeSql(...);
>   });

That's basically what we have now (though executeSql is on a transaction 
object passed to the callback).


> On A, I'm still nervous.

Can you elaborate on your concerns?


On Wed, 17 Oct 2007, Brady Eidson wrote:
> 
> I'm really starting to fall in to your way of thinking on this: A is 
> what makes me nervous and I think it largely complicates both the 
> implicit transaction and changeVersion() issues.
> 
> "A future version of this specification may define the exact SQL subset 
> required in more detail" - perhaps this future version of the spec might 
> also specifically disallow begin/commit/rollback in executeSql(), which 
> would be okay if we make the built-in alternative clear.

I've disallowed BEGIN/COMMIT/ROLLBACK.


On Wed, 17 Oct 2007, Brady Eidson wrote:
> 
> Additionally, if we replaced closeTransaction() with commitTransaction() 
> and rollbackTransaction(), that would fit in with my idea of disallowing 
> BEGIN/COMMIT/ROLLBACK in executeSql() as the developer would still have 
> manual control over the implicit transaction.
> 
> I'm very interested to hear everyone's thoughts on this.

The spec does have a way of controlling what gets commited and what gets 
rolled back now (you throw to force a rollback; you return false from your 
transaction error handler to ignore the error and force a commit if 
possible).


On Wed, 17 Oct 2007, Aaron Boodman wrote:
> 
> In the current spec, the tx must stay open until callback returns, 
> right? That seems like the only reasonable way it could work. That's too 
> bad because since the callback is also overloaded as a way to get the 
> results of executeSql and do work, the callback might stay open for a 
> long time and do all sorts of other work, like starting httprequests, 
> starting new logical sql transactions, updating the UI, etc.

Indeed. Other than providing explicit commit() and rollback() methods on 
the SQLTransaction object (which I think we should do only in version 2), 
I don't see how to work around this.


> This is the first thing that makes me question the current implicit 
> transaction API :-(. Maybe adding a executeSqlInTransaction() would be 
> smarter. You can then separate the two meanings of SQLCallback:
> 
> - getting results of a sql call
> - linking sql calls together into transactions

Could you elaborate on how this would work? How would the UA know when to 
commit?


On Wed, 17 Oct 2007, Scott Hess wrote:
> 
> Since opening an SQL transaction is itself an SQL call which may fail, I 
> think it would be more reasonable to just have two calls, executeSql() 
> which simply passes the SQL statement down to the driver, and 
> transaction() which provides a transaction context in the callback.  If 
> executeSql() is called within the transaction() callback, that extends 
> the transaction into the newly registered callback.  If transaction() is 
> called within a callback, that opens a nested transaction (*) and 
> extends the containing transaction. Successfully completing the entire 
> tree of callbacks results in commit, while an uncaught exception causes 
> rollback (**).

This is basically what the spec does now, modulo the nested transactions.


> (*) This could either be a true nested transaction, if your SQL driver 
> supports that, or a fake transaction using a stack to track nesting 
> (with the consequent inability to rollback a nested transaction without 
> rolling back through the outermost transaction).
> 
> (**) I'm not sure how real nested transactions would work, here. Maybe 
> you have to catch exceptions from the nested transaction() call?

Nested transactions seemed like more effort than they were worth, 
especially given lack of support in common DB libraries like SQLite. I've 
made transaction() simply open a new transaction unrelated to any parent 
transaction.


On Wed, 17 Oct 2007, Scott Hess wrote:
> 
> It might, but I'd still wonder how far the spec wants to go down the 
> road of replicating/enforcing the SQL transaction model.  I think having 
> convenience functions is helpful to developers, it's the notion of 
> having _only_ the convenience functions which concerns me.  The more the 
> API adds things like implicit transactions and commit/rollback 
> functions, the more tightly tied it will be to a specific SQL 
> implementation's semantics.

The spec will eventually be completely tied to a specific SQL 
implementation's semantics, so I don't see why this is a problem.


> A thought experiment we've used on the Gears team is to ask whether 
> something can be composed from more basic components entirely in 
> JavaScript.  Posit a version of the API which instead implements 
> executeRawSql(), which does not have any transaction implications at 
> all.  The spec's executeSql() could clearly be implemented as a wrapper 
> around executeRawSql(), as could the current closeTransaction() and the 
> above commitTransaction() and rollbackTransaction().  executeRawSql() 
> also allows the developer to merrily shoot themselves in the foot, of 
> course.

I agree that transaction support could be implemented on top of raw calls 
and multiple databases. However:

> [I think where I'm going here might be "make easy things easy and hard 
> things possible".]

I think the above argues for us to take the API just that one level above 
the raw SQL up to having "first-class" transaction support in the API.


On Wed, 17 Oct 2007, Aaron Boodman wrote:
> 
> Yes this is an icky bit. I am almost ready to say that 
> BEGIN/COMMIT/ROLLBACK should be blacklisted, except for the fact that I 
> can see somebody wanting to programatically rollback in the space 
> between two sql statements and the only way without ROLLBACK is to throw 
> an error.

I haven't made this possible in this version. I have left us the option of 
providing a rollback() method in a future version that would make this 
easy, though.


On Wed, 17 Oct 2007, Brady Eidson wrote:
> 
> I've seen about 10 different interpretations of the implicit transaction 
> and the effect of closeTransaction() - I don't think it's clear yet ;)

How about now? :-)


> > What are the disadvantages of implicit transactions?
> 
> The above case with sql1-3 and sql4-6 - what would happen?  Would the 
> sql4-6 calls wait until the sql1-3's complete or would they error out? 
> The spec is ambiguous on this *because* of the implicit transaction.

I don't really see how it's the implicitness that makes it ambiguous. It's 
inherently a race condition, no?


> Also, the "implicit transaction" from the JS API's standpoint turns into 
> an "explicit transaction for every individual statement" in SQL reality.  
> This is almost certainly going to translate into a lock held for a 
> longer period of time or more code execution for a single statement.  
> If a common use-case turns out to be individual statements one at a 
> time, we're inflicting a mandatory penalty on the web developer.

Yes.


> Additionally, I sent an email a bit over a week about about the 
> complexities of changeVersion() along with the implicit transactions.  
> If there is one of these transactions open, how do we rectify 
> changeVersion() blocking the JS thread when a callback on the JS thread 
> would be needed to address closing out the mandatory implicit 
> transaction?

changeVersion() is now asynchronous (and actually invokes a transaction 
just like transaction() does).


> > That's a possibility, though we're trying to keep the API as thin as 
> > possible.
> 
> An admirable goal - one that I agree with.  Which is why I think the 
> wisdom of the implicit transaction is dubious.  Developers that will be 
> using SQL will know they can say "BEGIN TRANSACTION;" and "COMMIT;" or 
> "ROLLBACK;" so the utility of having transactions will not be lost.  
> Ditching it would help thin the API further, clearing up this confusion 
> and complexity.

Well, the spec now blacklists those, forcing the API to be used for 
transactions. In theory this makes things better, for example by making it 
much harder to forget to close a transaction.


On Wed, 17 Oct 2007, Maciej Stachowiak wrote:
> 
> Downsides to this approach:
> 
> - You could only have one transaction in flight at once, so you'd have 
> to do scheduling in the app code if a transaction-starting UI operation 
> happens while you already have a transaction in progress. Otherwise 
> multiple transactions would get scrambled. (Or else the API layer could 
> parse your statements and understand when you have opened a transaction 
> to still implicitly assign statements in your callbacks to the 
> transaction, but I am not sure this is a simplification overall.)

Indeed. This is addressed in the new API by having an explicit 
SQLTransaction object per transaction.


> - An author mistake (like doing something that causes an exception in 
> the callback) means a stuck lock, quite possibly ruining the whole 
> session.

Indeed, that's the main reason for having transactions at the API level, 
IMHO.


> With a synchronous API and threads this wouldn't be a problem, because 
> we could provide a wrapper function that would bracket your code with 
> BEGIN TRANSACTION and the appropriate of ROLLBACK or COMMIT depending on 
> whether you throw an exception, and each thread would be using a 
> different database connection. But with the async API, you create much 
> more opportunity for author error.

Indeed, when we introduce worker pool APIs I think we'll have to introduce 
a synchronous version of this API.


> I think the current model is not really as hard to understand as it 
> might seem from the spec, which has to be very precise for the sake of 
> implementations and does not make for a good tutorial.

Yeah, I'll add an intro section once we're settled on something.


> We should test whether the performance benefits of not using 
> transactions are significant. If we need to provide both I might suggest 
> startTransaction or startSqlTransaction that would act like the current 
> executeSql, and executeSql which acts as currently if there is a current 
> transaction, but doesn't start one if none is open.

We now have transaction() on the Database and executeSql() on the 
SQLTransaction -- we can easily add an executeRawSql() method on the 
Database object at some future point if we find it is needed.


On Wed, 17 Oct 2007, Aaron Boodman wrote:
> 
> Another problem is that developers actually don't realize they need to 
> use BEGIN and COMMIT and they end up writing extremely slow code and 
> wondering why. Making the transactions be automatic is a big win for 
> making the API performant by default.

Right, this is one reason that the API now requires use of transactions, 
at least in this version.


On Wed, 17 Oct 2007, Scott Hess wrote:
> 
> I agree with Aaron.  Having transactions explicitly in the API is 
> useful, because langauge features can be integrated (for instance, 
> unhandled JavaScript exceptions can result in a ROLLBACK), and because 
> it's very easy to make a mistake that locks things up.

Indeed.


> Here I'm mixed.  There are multiple reasons to use transactions.  The 
> most primary use is for correctness reasons.  Using them to improve 
> performance is an implementation detail, in a high-concurrency system a 
> bunch of bare statements is likely to allow more performance than the 
> same statements in a transaction.  I'm concerned that making 
> transactions implicit to address an implementation detail like 
> performance may cause unforseen correctness issues.

Do you like the new explicit model?


> How would the current system interact with nested transactions?  The 
> obvious answer is that each executeSql() would create a new nested 
> transaction, but if you logically want a nested transaction, and want to 
> do a ROLLBACK, you want to ROLLBACK the logical nested transaction, not 
> the one associated with the enclosing executeSql().  I was thinking you 
> could call closeTransaction() [which would COMMIT the immediately local 
> executeSql()], then throw an exception to cause a ROLLBACK, but as 
> currently spec'ed I think that will poison the entire transaction.

The current spec has no nested transaction support whatsoever.


> I think I generally agree with Maciej, but differ that I think the 
> conservative stance would be to not introduce implicit transactions and 
> later add them, rather than introduce them and later add a way to not 
> have them.  I do think real experience will be useful, because combining 
> transactions and asynchronous operation makes it hard to see through to 
> which issues are real.  It may be that convoluted nested-transaction 
> systems are just impossible to understand when phrased asynchronously, 
> so making the API work with them is not valuable.

If this turns out to be the case, we can add explicit SQL calls and 
downplay or deprecate the transaction API. It is harder to go the other 
way (trying to convince people not to use a raw API when there's a better 
one is hard, people will still use the other one).


On Wed, 17 Oct 2007, Brady Eidson wrote:
> 
> There is a problem that has been alluded too in this thread but not 
> explicitly called out.
> 
> For all the tracking of implicit transactions, having "active thread 
> global transactions" in an executeSql() chain, and have changeVersion() 
> wait until all transactions are closed... how do we handle web 
> developers who start their own transactions?

It's not possible in the new spec.


On Wed, 17 Oct 2007, Maciej Stachowiak wrote:
> 
> I think the remaining problem is that you can't make the version change 
> atomic with the transaction you use to actually upgrade the schema. This 
> could be fixed by making changeVersion() open a transaction which is the 
> current transaction during its callback, with the requirement that the 
> version is automatically rolled back if the transaction is. Then you can 
> do the actual schema upgrade from changeVersion()'s callback. I believe 
> this is reasonable to implement and would make database upgrades more 
> sound.

I have done this.


On Wed, 17 Oct 2007, Brady Eidson wrote:
> 
> I share Maciej's concern here.  Right now the only alternative is to 
> make sure you're not in the middle of any transactions, perform an 
> executeSql() chain in one transaction that changes the database schema, 
> and immediately follow up that chain with a changeVersion().  Problem is 
> that another browsing context can still butt-in between your 
> schema-mutating-transaction and your call to changeVersion().  What if 
> that other browsing context changes the schema in a different way, then 
> you change the version, but not to what you would expect?

In the previous model I expected people to change the version to a 
temporary version, do the work, then change it to the final version. This 
worked around the above problems, but was still very brittle -- what if 
the script dies while in the temporary version state? The database becomes 
unusable.

The new model fixes this.


On Wed, 17 Oct 2007, Aaron Boodman wrote:
>
> I'm still not convinced that closeTransaction() is worth the trouble.

It's gone.


> My point of view is that database errors are rare and unexpected and 
> will be handled with a separate code path from the success path. In 
> simple cases, they need not be handled at all, throwing a global error 
> (which goes to window.onerror, then the error console) is sufficient. In 
> more complex cases, the db transaction may have been part of a larger 
> operation, like synchronization, which needs to be aborted and handled 
> gracefully in the UI. Still, in that case, the code for handling errors 
> is totally different than the code for handling success, and it 
> definitely does not need to do anything inside the failed transaction.

I haven't yet made errors go to window.onerror, but other than that, I've 
done what you suggest -- transaction(), changeVersion(), and executeSql() 
all have two callbacks, one for errors and one for success.

I'm not sure exactly what to do with onerror. Should I raise an exception 
as if the transaction() call had failed? (i.e. what line number do I use? 
What exception?)


On Thu, 18 Oct 2007, Scott Hess wrote:
>
> If the statement to executeSql() is invalid, then an exception will be 
> raised immediately, which can be caught by wrapping the call to 
> executeSql() with an exception handler.  If there is an error in the 
> course of executing the statement, it will be exposed by the errorCode 
> in the callback.  If there is no such error, the callback will be 
> executed in the context of an implicit database transaction.

I've changed this so that all errors fire the error callback. executeSql() 
will no longer ever raise an exception. This should make error handling 
easier to manage.


> This would appear to leave no place from which to detect an error 
> committing the transaction.

I've added an error callback to transaction() to handle this case. It also 
gets called when an error happens in any of the executeSql() calls that 
isn't handled by those error handlers.


> There is also no place to hang code to execute when the transaction is 
> known to be successful.

I haven't added anything for that. Do we really want a third callback? 
What's the use case? (Updating UI? Can't you update the UI optimistically 
and then fix it up in case of error?)


On Thu, 18 Oct 2007, Aaron Boodman wrote:
> 
> Good one. Another reason I guess you need to separate the concept of 
> finishing a transaction from the concept of finishing a statement.

It's all kinda still implicit right now, I have to admit.


On Thu, 18 Oct 2007, Mihai Sucan wrote:
> > 
> > You also asked some questions about the globalStorage stuff and about 
> > file upload APIs. I'll answer those later when I address those areas.
> 
> No problem. I know I went off-topic, but I allowed myself to do that, 
> because ideas are most important.

Yeah; it was no problem at all, it's just that I'm not addressing those 
areas at the moment! Don't worry about mixing feedback, it's fine, so 
long as you can handle replies to different parts at different times!


> > > 1. Why complicate things with database versions? Any use cases?
> > 
> > It's for when you update your application and need to transition to a 
> > new database schema.
> 
> Can't this be done with a completely new database name?

This would require copying all the data over each time you update the 
schema.


> The spec definition of what a database 'version' is, is not very clear. 
> Let me see if I correctly understand now:
> 
> openDatabase('robodesign', 'v1')
> 
> ... this creates the database 'robodesign' with version 'v1'.
> 
> ... we can executeSql() as much as we want
> 
> openDatabase('robodesign', 'v2')
> 
> ... this fails (INVALID_STATE_ERR), because the database doesn't have 
> version 'v2'
> 
> openDatabase('robodesign', '')
> 
> ... this works - no expected version.
> 
> ... but still, we cannot executeSql() because the algorithm tells that 
> the expected version must be te same as the current version.

All right except that if you opened it with '' then you can do anything.


> db.changeVersion('v1', 'v2', callBack)
> 
> ... allows the author to implement the transition from older versions to 
> newer versions of the same database.

Right.


> Now, the question is: how are authors supposed to make the transition between
> versions?
> 
> a) within the callback of changeVersion()?
> 
> The author is not supposed to start running any commands which 
> transition the database to the newer version. Everything is supposed to 
> be done within the callback.

Right.


> If my understanding of what the database version is correct, here's my 
> try at a simpler "overview" definition:
> 
> "Any database can have only one version at a time - versions are not 
> concurrent. The database version only serves the purpose of easier 
> identification of the version for Web authors. They can write scripts 
> which automatically upgrade the database of the client to a newer 
> version. Without any version identifier, the authors would not be able 
> to trivially check which database schema is provided by the client."

I've added a note with the same spirit which might help.


> > > 2. Why openDatabase() instead of executeSql('SELECT databaseName') ?
> > 
> > Because you're not executing SQL, you're selecting the database. I 
> > don't really see why you'd use executeSql for this.
> 
> Erm, that's a mistake. I wanted executeSql('USE database_name'). 
> However, given the special purpose of openDatabase() ... the USE command 
> cannot be used.

I'm not sure why you'd want to use the USE command...

In any case, with the new API this becomes moot, since there is no way to 
execute SQL until after you have a transaction, which you can't do until 
you have a database.


> Further comments on the section:
> 
> 1. The changeVersion() definition does not tell what happens if the 
> author invokes the method with the first two arguments being the same. 
> It also does not tell what happens when the newVersion is the same as 
> the current version.

It doesn't mention these explicitly, but I believe the spec does define 
what should happen in those cases, no? The same rules apply as in any 
other case.


> 2. The changeVersion() method should also provide the Database object to 
> the callback function, as the first argument. The second argument should 
> be the boolean value determined by the algorithm defined for the 
> changeVersion() method.

changeVersion() now passes a SQLTransaction to the callback.


> 3. In step 5 of the executeSql() algorithm it is said:
> 
> "The user agent must then add the specified SQL statement to 
> transaction, and must execute it as soon as all the statements that were 
> added to that transaction before it have themselves successfully 
> executed. [SQL]
> 
> If the Database object has an expected version that is neither the empty 
> string nor the actual version of the database, the statement must fail."
> 
> The UA must execute the SQL statement. However, if the expected version 
> is not the same as the current version, *then* the statement must fail. 
> How does this work? Shouldn't the UA skip the execution of the SQL 
> statement if the two versions don't match, and simply fail?

The new text should make this clearer. Let me know if it's still 
ambiguous.


> 4. Based on the definition of the openDatabase() method I understand 
> that the expected version can *only* be the empty string *or* the actual 
> value of the *current* version of the database. Is that correct? If yes 
> (or not) this should also be made more clear.

No, the current version can get changed by changeVersion() in another 
thread. I'm not sure how to make that clearer.


> 5. In the list of error codes, maybe an error code should be defined for 
> one of the most common errors: duplicate ID. The error is triggered when 
> the author tries to execute an SQL statement which inserts a new row 
> with an existing ID - duplicate ID.

Added.


On Fri, 19 Oct 2007, Timothy Hatcher wrote:
> 
> I think step one needs more clarification on what an invalid statement 
> is. In the current WebKit implementation a SYNTAX_ERR is being thrown 
> for all queries that sqlite3_prepare doesn't return SQLITE_OK. This 
> means queries against tables that don't exist will throw a SYNTAX_ERR, 
> among other things that are perfectly well-formed statements.

This should be a non-issue now that all errors use the error callback 
instead of throwing exceptions.


> Take this example and assume WebKitNotes doesn't exist:
> 
> db.executeSql("CREATE TABLE WebKitNotes (note TEXT, timestamp REAL)", [],
> function(result) {});
> db.executeSql("INSERT INTO WebKitNotes VALUES (?, ?)", [note, timestamp],
> function(result) {});
> 
> The first query will be queued for execution. It is a valid statement. 
> The second query can come in before the first has been executed. Since 
> the table hasn't been created yet, executeSql will throw a SYNTAX_ERR 
> because sqlite3_prepare fails to find the table.
> 
> Now, run the code again and the create table will throw SYNTAX_ERR 
> because the table already exists. Users could use CREATE TABLE IF NOT 
> EXISTS, but SQLite has not always had this (for instance, the version 
> shipped on Mac OS 10.4.)
> 
> So a developer needs to write their code like this:
> 
> try {
> 	db.executeSql("CREATE TABLE WebKitNotes (note TEXT, timestamp REAL)",
> [], function(result) {
> 		db.executeSql("INSERT INTO WebKitNotes VALUES (?, ?)", [note,
> timestamp], function(result) {});
> 	});
> } catch(e) {
> 	db.executeSql("INSERT INTO WebKitNotes VALUES (?, ?)", [note,
> timestamp], function(result) {});
> }
> 
> I think SYNTAX_ERR should only be thrown for non well-formed queries 
> (missing quotes, clauses, etc.), and if a table doesn't exist it should 
> not throw. Once the whole queue is executed, that table might exist. If 
> the table still doesn't exist the developer will have an errorCode and a 
> meaningful error string instead of a generic SYNTAX_ERR with no context.

In the new API the above would translate (bugs and all) directly to:

   var db = openDatabase('test', '');
   db.transaction(function (tx) {
     tx.executeSql('CREATE TABLE WebKitNotes (note TEXT, timestamp REAL)', [],
       function (tx, results) {
         tx.executeSql("INSERT INTO WebKitNotes VALUES (?, ?)",
                       [note, timestamp]);
       },
       function (tx, error) {
         tx.executeSql("INSERT INTO WebKitNotes VALUES (?, ?)",
                       [note, timestamp]);
         return false;
       });
   });

...or you could do:

   var db = openDatabase('test', '');
   db.transaction(function (tx) {
     var insert = function (tx) {
       tx.executeSql("INSERT INTO WebKitNotes VALUES (?, ?)", [note, timestamp]);
       return false;
     };
     tx.executeSql('CREATE TABLE WebKitNotes (note TEXT, timestamp REAL)',
                   [], insert, insert);
   });

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Wednesday, 24 October 2007 03:56:41 UTC