[ServiceWorker] Cache transactions & fixing `addAll` (#823)

## What we need to solve

`cache.put`, `cache.add` and `cache.addAll` have transaction-like properties that aren't well explained by the spec. E.g., what happens in this case?

```js
const pa = cache.put(request, response);
const pb = cache.put(request, anotherResponse);
```

Should the second `put` abort the first? If so, what happens to the `pa` promise? Should `pb` wait until `pa` settles? This is not currently defined.

`cache.add` and `cache.addAll` are atomic, but you cannot ensure responses are ok, making it a bit of a footgun. Developers are surprised to hear this, despite it being consistent with `fetch()`. We should change `add` and `addAll` to reject if any of the responses are `!response.ok`, or if ok-ness cannot be determined (opaque response) - maybe include an option to revert to current behaviour.

This change is safe, as the number of opaque responses being added via `add` & `addAll` rounds to 0.00%.
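A rough sketch of what the proposed behaviour would look like to callers (the `okOnly` opt-out is a made-up name, purely to illustrate the "option to revert" idea):

```js
caches.open('static-v1').then(cache => {
  // Under the proposal this rejects if any response is !ok (e.g. a 404),
  // or if ok-ness can't be determined (opaque response)…
  return cache.addAll(['/styles.css', '/missing.js']);
}).catch(err => {
  // …and because addAll is atomic, nothing was written to the cache.
});

// Hypothetical opt-out to restore today's behaviour:
// cache.addAll(urls, { okOnly: false });
```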

## Use-cases

Adding multiple resources to the cache atomically. Similar to:

```js
caches.open('foo').then(cache => {
  return Promise.all(
    ['/hello', '/world'].map(url => {
      return fetch(url).then(response => {
        if (!response.ok) throw Error("Response not ok");
        return cache.put(`${url}-cached`, response);
      });
    })
  );
})
```

The above isn't atomic: "/hello" can succeed and be written to the cache while "/world" fails.

Say I wanted to look into the cache and decide whether to delete entries based on their response:

```js
caches.open('foo').then(cache => {
  cache.keys().then(requests => {
    requests.forEach(request => {
      cache.match(request).then(response => {
        if (isTooOld(response)) {
          cache.delete(request);
        }
      });
    })
  });
});
```

The above is racy. The response we remove from the cache may not be `response`, as a `cache.put(request, differentResponse)` may have happened in a different thread.
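For instance (a hypothetical interleaving, where `freshResponse` is some newly fetched `Response`), this could run in a window or another worker between our `match` and `delete`:

```js
// Another context updates the entry after we've read it but before we delete it:
caches.open('foo').then(cache => cache.put(request, freshResponse));
// …so our cache.delete(request) then removes freshResponse,
// even though isTooOld() was only ever checked against the stale response.
```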

## Proposal: cache transactions

Aims:

* Allow the above usecases
* Avoid excessive locking that would hurt performance
* Explain existing cache-writing methods in terms of transactions

```idl
TODO do IDL properly
interface CacheStorage {
  # …
  [NewObject] Promise<CacheTransaction> transaction(DOMString cacheName, optional CacheTransactionOptions options);
};

dictionary CacheTransactionOptions {
  sequence<RequestInfo> limitTo;
};

interface CacheTransaction {
  # `complete` resolves once the transaction commits (used as `tx.complete` in the examples below)
  readonly attribute Promise<void> complete;
  void waitUntil(Promise<any> f);

  # Same method signatures as Cache, except add & addAll
  [NewObject] Promise<Response> match(RequestInfo request, optional CacheQueryOptions options);
  [NewObject] Promise<sequence<Response>> matchAll(optional RequestInfo request, optional CacheQueryOptions options);
  [NewObject] Promise<void> put(RequestInfo request, Response response);
  [NewObject] Promise<boolean> delete(RequestInfo request, optional CacheQueryOptions options);
  [NewObject] Promise<sequence<Request>> keys(optional RequestInfo request, optional CacheQueryOptions options);
};
```

### Examples

Multiple `.put`s atomically (from the usecases):

```js
const urls = ['/hello', '/world'];

Promise.all(
  urls.map(u => fetch(u))
).then(responses => {
  if (responses.some(r => !r.ok)) throw Error("Response not ok");
  const cacheURLs = urls.map(u => `${u}-cached`);

  return caches.transaction('foo', {
    // keep a minimal lock
    limitTo: cacheURLs
  }).then(tx => {
    cacheURLs.forEach((url, i) => {
      tx.put(url, responses[i]);
    });
    return tx.complete;
  });
});
```

Look into the cache and decide whether to delete entries based on their response (from usecases):

```js
caches.transaction('foo').then(tx => {
  tx.waitUntil(
    tx.keys().then(requests => {
      return Promise.all(
        requests.map(request => {
          return tx.match(request).then(response => {
            if (isTooOld(response)) {
              return tx.delete(request);
            }
          });
        })
      );
    })
  );

  return tx.complete;
});
```

`cache.put` and `cache.delete` become shortcuts to creating transactions, e.g.:

```js
Cache.prototype.put = function(request, response) {
  return caches.transaction(this._cacheName, {
    limitTo: [request]
  }).then(tx => {
    tx.put(request, response);
    return tx.complete;
  });
};

Cache.prototype.delete = function(request) {
  return caches.transaction(this._cacheName, {
    limitTo: [request]
  }).then(tx => {
    tx.delete(request);
    return tx.complete;
  });
};
```

And our modified `addAll` can be defined (roughly):

```js
Cache.prototype.addAll = function(requests) {
  return Promise.all(
    requests.map(r => fetch(r))
  ).then(responses => {
    if (responses.some(r => !r.ok)) throw TypeError('Nooope');
    return caches.transaction(this._cacheName, {
      limitTo: requests
    }).then(tx => {
      requests.forEach((request, i) => {
        tx.put(request, responses[i]);
      });
      return tx.complete;
    });
  });
};
```

### Detail

Some of this is a little rough. Apologies for the lack of strictness; hopefully it gets the idea across.

* A **cache transaction** has an associated **locked requests**, a list of `Request`s, initially null
* A **cache transaction** has an associated **closed** flag
* A **cache transaction** has an associated **settled** flag
* A **cache transaction** has an associated **aborted** flag
* A **cache transaction** has an associated **operations** array of `CacheBatchOperation` dictionary objects
* A **cache transaction** has associated **extend lifetime promises**
* A **request to response map** has an associated **pending transactions**, a list of **cache transaction**s

`caches.transaction(cacheName, options)`:

1. If the result of running *is settings object a secure context* with the *incumbent settings object* is "Not Secure", return a new promise rejected with a `SecurityError` exception.
1. Let `p` be a new promise.
  1. TODO: type check, turn `options.limitTo` into requests
  1. TODO: Same as `caches.open`, aside from creating a `Cache`. Then once we have a **request to response map**…
  1. Let `pendingTransaction` be the result of running the **Request transaction** algorithm, passing in the **request to response map** and `options.limitTo`
  1. Let `cacheTransaction` be a new `CacheTransaction` associated with `pendingTransaction`
  1. Resolve `p` with `cacheTransaction`
  1. Queue a microtask:
    1. If the **cache transaction**'s **extend lifetime promises** is not null
      1. Let `waitFor` be waiting for all of **cache transaction**'s **extend lifetime promises**
    1. Else
      1. Set `pendingTransaction`'s **closed** flag
      1. Let `waitFor` be a promise resolved with no value
    1. Wait for `waitFor` to settle
    1. Set `pendingTransaction`'s **closed** flag
    1. If `waitFor` rejected:
      1. Set `pendingTransaction`'s **aborted** flag
    1. If `pendingTransaction`'s **aborted** flag is set
      1. Remove `pendingTransaction` from the **request to response map**'s **pending transactions**
      1. Set `pendingTransaction`'s **settled** flag
      1. Reject the **cache transaction**'s `complete` promise with a `TypeError` and abort these steps
    1. Resolve the **cache transaction**'s `complete` promise with the result of running **Batch Cache Operations**, passing `pendingTransaction`'s **operations** as the argument
    1. Remove `pendingTransaction` from the **request to response map**'s **pending transactions** and set its **settled** flag
1. Return `p`.
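One developer-facing consequence of the microtask step above: async work must be registered via `waitUntil` before that microtask runs, otherwise the transaction is already **closed**. A sketch, reusing the cache and URL names from the earlier examples:

```js
caches.transaction('foo', { limitTo: ['/hello'] }).then(tx => {
  // waitUntil must be called synchronously here; once the queued microtask
  // has run (and the extend lifetime promises have settled) the transaction
  // is closed and further put/match/delete calls throw.
  tx.waitUntil(
    fetch('/hello').then(response => tx.put('/hello', response))
  );
  return tx.complete;
});
```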

`cacheTransaction.match` & `cacheTransaction.matchAll` behave as they currently do, except:

1. If **closed** throw `TypeError`
1. If **locked requests** is not null, and request url is not equal to one of those in locked requests, throw `TypeError` (TODO: or do we allow it, but developer should be aware it's racy?)
1. TODO: …consider pending writes in **operations**?
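For example, under the locked-requests restriction above (a sketch, not settled behaviour given the TODO):

```js
caches.transaction('foo', { limitTo: ['/hello'] }).then(tx => {
  tx.match('/hello'); // fine: '/hello' is in the locked requests
  tx.match('/world'); // throws (or rejects with) a TypeError as currently worded
});
```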

`cacheTransaction.keys` behaves as it currently does, except:

1. If **closed** throw `TypeError`
1. If **locked requests** is not null, throw `TypeError` (TODO: or… same question as above)
1. TODO: …consider pending writes in **operations**?

`cacheTransaction.put(request, response)`:

1. If **closed** throw `TypeError`
1. If **locked requests** is not null, and request url is not equal to one of those in locked requests, throw `TypeError`
1. As current `.put` steps 2-11 (TODO: check numbers), but also **abort transaction** before rejecting
1. Append `o` to **operations**
1. Return promise resolved with undefined

`cacheTransaction.delete(request)`:

1. If **closed** throw `TypeError`
1. If **locked requests** is not null, and request url is not equal to one of those in locked requests, throw `TypeError`
1. As current `.delete` steps 1-9 (TODO: check numbers)
1. Append `o` to **operations**
1. Return `this.match(request).then(r => !!r)`

**Request transaction**:

Input

* `requestResponseMap`, a **request to response map**
* `lockedRequests`, a sequence of **request**s

Outputs `pendingTransaction`, a **cache transaction**

1. Let `currentPendingTransactions` be a copy of the **pending transactions** associated with `requestResponseMap`
1. Let `pendingTransaction` be a new **cache transaction**
1. Set `pendingTransaction`'s **locked requests** to `lockedRequests`
1. Add `pendingTransaction` to `requestResponseMap`'s **pending transactions**
1. For each `currentPendingTransaction` in `currentPendingTransactions`:
  1. If `lockedRequests` is undefined, or if any of `lockedRequests` have the same url as one or more of the requests in `currentPendingTransaction`'s **locked requests**
    1. Wait for `currentPendingTransaction`'s **settled** flag to become set
1. Return `pendingTransaction`
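A sketch of the locking this implies for transactions on the same cache:

```js
// Transactions whose limitTo lists don't overlap can proceed concurrently;
// a transaction that overlaps an earlier, unsettled one waits for it.
const a = caches.transaction('foo', { limitTo: ['/a'] });
const b = caches.transaction('foo', { limitTo: ['/b'] }); // independent of a
const c = caches.transaction('foo', { limitTo: ['/a'] }); // waits for a's transaction to settle
const d = caches.transaction('foo');                      // no limitTo, so waits on all of the above
```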

**Abort transaction**:

Input

* `cacheTransaction`, a **cache transaction**

Outputs none

1. Set `cacheTransaction`'s **aborted** flag
1. Reject `cacheTransaction`'s **extend lifetime promises**
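In consumer terms (a sketch, where `someResponse` is a `Response` obtained earlier), aborting means queued operations are discarded rather than applied:

```js
caches.transaction('foo', { limitTo: ['/hello'] }).then(tx => {
  tx.put('/hello', someResponse);
  // A rejected waitUntil promise aborts the transaction: the queued put
  // above never reaches the cache, and tx.complete rejects.
  tx.waitUntil(Promise.reject(new TypeError('nope')));
  return tx.complete;
}).catch(() => console.log('nothing was written'));
```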

**Batch Cache Operations**

Similar to as it is now, except:

* Performs squashing to remove redundant operations
* Resolves once the data is fully written into the cache, and reverts if writing fails (similar to what the current `.put` is doing in step 14 TODO check numbers)
* Deletes should be reverted if puts fail - I think we should be able to perform the deletes after all the puts once correctly squashed
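A sketch of what squashing would mean for redundant operations within one transaction (`responseA`/`responseB` are just placeholder `Response`s):

```js
caches.transaction('foo', { limitTo: ['/hello'] }).then(tx => {
  tx.put('/hello', responseA); // made redundant by the later put
  tx.put('/hello', responseB); // made redundant by the later delete
  tx.delete('/hello');         // after squashing, only this delete is performed
  return tx.complete;
});
```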

## Questions

Creating a transaction:

```js
caches.transaction(cacheName, options).then(tx => {
  tx.waitUntil(promise);
  return tx.complete;
}).then(() => console.log("Transaction complete!"));

// or…

caches.transaction(cacheName, tx => {
  return promise;
}, options).then(() => console.log("Transaction complete!"));
// No need for waitUntil or tx.complete
// Easier to reason about transaction lifetime? (less like IDB)
// Can detect throws in callback execution
// Using a callback feels like an antipattern

// or even…

caches.open(cacheName).then(cache => {
  cache.transaction(tx => {
    return promise;
  }, options).then(() => console.log("Transaction complete!"));
});
// One method of opening a cache
// Really easy to deadlock by returning `cache.put(…)` in the transaction callback
```

`cacheTransaction.put` could return synchronously; is it better to return a promise for consistency?

Should read operations like `cacheTransaction.match` consider pending tasks (**operations**) in their return values?
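Concretely (where `newResponse` is a placeholder `Response`), the question is what this resolves with:

```js
caches.transaction('foo', { limitTo: ['/hello'] }).then(tx => {
  tx.put('/hello', newResponse);
  return tx.match('/hello').then(response => {
    // Is `response` newResponse (the pending, not-yet-committed write),
    // or whatever '/hello' mapped to when the transaction was created?
  });
});
```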

I guess `cacheTransaction.abort` makes sense.

Apologies for both the length and hand-waving in this proposal, but I'd rather not spend any more time on it if we think it's a no-go.
