[WebSimpleDB] Introduce a pause/resume pattern for coordinated access to multiple stores

Whenever we take a callback that's to be called for each item in a set (e.g. with a .forEach(callback) pattern), we need a way to tell the system whether or not it's ok to move to the next row and invoke the next callback. Otherwise, in scenarios where the callback itself performs an operation that doesn't finish immediately (such as another async database call), the system will keep queuing up top-level callbacks, which in turn may queue up more callbacks as part of their implementation, and execution will happen in "some order" that's very hard to predict at best.

This comes up in several contexts. Applications will often need to scan more than one object store in coordination. Query processors will also need this when implementing physical operators for joins and such. A different context would be a system that needs to submit an HTTP request per row, where you may want to use an XMLHttpRequest and unwind after calling open. While the HTTP request is in flight you don't want to move to the next row.

In most cases one of the key aspects is that we need separate components to work cooperatively as they pull rows from one or multiple scans, and there needs to be a way of controlling the advance of cursors through the rows.

We would like to introduce "pause" and "resume" functions for scans to support this. Since there is no obvious place to put these right now, we could introduce an "iterator" object that controls things related to the state of the iteration at the time the callback is invoked, or maybe this is the cursor itself.

The resulting code would look like this (the example uses the single-async-level pattern we're playing around with, but these two are actually independent things):

async_db.forEachObjectInStore("people", function(person, iteration) {
  iteration.pause(); // we won't be done with 'person' until later...
  var request = async_db.getFromStore("people", person.managerId);
  request.onsuccess = function() {
    var manager = request.result;
    // Do something with both 'person' and 'manager', and now we're ready to process the next person.
    iteration.resume();
  };
});

The nice thing about adding these as methods on the side is that they're completely out of sight in simple scenarios, where you may be just scanning to build some HTML for example. Only if you're doing multiple coordinated async tasks do you need to know about these functions.
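
To make the intended semantics a bit more concrete, here is a minimal sketch of how a scan driver could honor pause/resume. It is only an illustration under stated assumptions: forEachWithPause, its in-memory array of rows, and the ondone callback are hypothetical names that are not part of the proposal, and a real implementation would pull rows from a store and deliver callbacks asynchronously rather than from a plain loop.

```javascript
// Hypothetical stand-in for a store scan: iterates an in-memory array and
// hands each callback an 'iteration' object exposing pause() and resume().
function forEachWithPause(items, callback, ondone) {
  var index = 0;        // next row to deliver
  var paused = false;   // set by iteration.pause()
  var running = false;  // true while the delivery loop is on the stack

  var iteration = {
    pause: function () { paused = true; },
    resume: function () {
      if (!paused) return;      // resume without a pending pause is a no-op
      paused = false;
      if (!running) run();      // restart only if the loop has unwound
    }
  };

  function run() {
    running = true;
    // Deliver rows until the callback asks us to stop or we run out.
    while (!paused && index < items.length) {
      callback(items[index++], iteration);
    }
    running = false;
    // Report completion only once the last row has been resumed past.
    if (!paused && index >= items.length && ondone) ondone();
  }

  run();
}

// Usage: simulate an async lookup per row with setTimeout.
forEachWithPause(["Ana", "Bo"], function (person, iteration) {
  iteration.pause();            // not done with 'person' until the timer fires
  setTimeout(function () {
    console.log("processed " + person);
    iteration.resume();         // now it's ok to move to the next row
  }, 10);
}, function () {
  console.log("scan complete");
});
```

Note that in this sketch a callback that never calls pause() just runs row after row, so the simple case needs no knowledge of the iteration object at all.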

Regards,
-pablo

Received on Wednesday, 23 December 2009 01:40:27 UTC