- From: Ian Hickson <ian@hixie.ch>
- Date: Fri, 21 Sep 2007 10:36:07 +0000 (UTC)
Based on discussions in response to previous proposals in this and other forums, I have written a new proposal for offline web app APIs. The idea here is to allow existing applications to be retrofitted with a script layer that enables the applications to work offline, while supporting multiple instances running different versions, being safe against buggy author-provided client and server code, allowing for both replacement and transparent updates, not having any increase in the page load time in any case, and being reasonably easy to develop for. Note: Some people in public-html at w3.org suggested that this would be an area that we should not be working on. However, two separate browser vendors as well as a browser extension developer group have independently contacted me requesting support for this _in person_, as well as by e-mail and over IRC, and in two of those three cases have actively been writing experimental and shipping code to add support for such a feature. This is amongst the highest level of demand I've ever seen for a feature in this spec (the last feature with this level of demand was probably <canvas>). I think it is imperative that we work to obtain a consensus on a single specification for this to enable interoperability on this quite important feature, otherwise we'll see quick fragmentation in this space. I believe HTML is the right place to define this due to the tight integration of any solution here with the loading and browsing context aspects of the current HTML5 spec, as should be obvious from the proposal below. Also, this proposal depends on changes to the database API to support implicit transactions and named and versioned databases per domain, as well as being a motivator for introducing a threading-like worker model (similar to Gears' WorkerPool) that can be "joined" from multiple browsing contexts that share an origin. Both of these are likely to be forthcoming changes to HTML5 based on feedback received to date, so it seems safe to assume these changes will be present while designing this offline Web apps feature. These features are assumed to be the solution for dealing with updating of database schemas in the presence of multiple running instances. Anyway, here's the proposal (v0.3): An application cache has: * A manifest URI by which the cache is identified. * A set of one or more resources with URIs and full HTTP headers and bodies (the cache). * A set of zero or more strings representing URI prefixes, each mapped to a URI found in the cache (the fallback pages and their opportunistic caching patterns). * A set of zero or more URIs that are not in the cache (the online-whitelist set). Browsing contexts are associated with application caches. A browsing context with a parent browsing context is always associated with the same cache as its parent browsing context. When a top-level browsing context is navigated, the browsing context must be associated with an application cache, as follows: * If the resource in question is being fetched via GET and is in one of the application caches: The user agent must pick an application cache from the shortlist of those caches that contain the specified URI, and associate the browsing context with it. The resource kept in that cache must then be examined. If it is an HTML resource and, when parsed with an HTML parser with scripting disabled, it has an "application" attribute, or, if it is an XML resource and, when parsed with an XML parser with scripting disabled, it has an "application" attribute, and that attribute doesn't match the URI of the cache selected, then unassociate the browsing context with that cache, and continue as if no cache could be found. * If you're offline, and any of the caches have fallback pages associated with a pattern that matches the URI of the resource being fetched, then: The user agent must pick an application cache from the shortlist of those caches that contain patterns that match the specified URI, and associate the browsing context with it. The UA must then use that page, but pretend that it came from the original URI. The page can work out which URI it is impersonating using the document.location API. * Otherwise: * If you're online, do the appropriate thing. * If you're offline, inform the user of the problem. Then, the document must be loaded. If, during load, an application="" attribute is found, then, if a cache with that manifest URI already exists, then, add the resource URI to that cache, and associate the browsing context with that cache. If no cache exists for that manifest URI, then create a cache for that manifest URI, and add the resource URI to that cache. (XXX How do we handle <?xml-stylesheet?> PIs? XXX) The process for picking an application cache is as follows: * The user agent must pick the application cache, of those that are shortlisted, that the user most likely wants to see the resource from, taking into account the following: - which application cache was most recently updated - which application cache was being used to display the resource from which the user decided to look at the new resource - which application cache the user prefers (Obviously if only one cache is shortlisted the choice is trivial.) When a subresource load is initiated by a browsing context that has an associated application cache, the user agent must proceed as follows: - If the resource is being fetched via GET, and the cache's manifest and the resources it points to has been fully downloaded, and the cache does not contain a mention of the specified resource (not even in the online whitelist), then fail the load. - Otherwise, if the resource is being fetched via GET and the cache contains that resource, then return the cached resource. - Otherwise, if you are online: Hit the network to fetch the resource. Do not store the resource in the cache. - Otherwise, fail the load. If a browsing context is associated with a cache, then run the update process asynchronously as soon as the load of the resource starts. The update process is as follows: - If the cache already existed before loading the resource, then: - set the offline state of every top-level browsing context that is using this cache to 'checking'. - fire a checkingupdate event at every top-level browsing context using this cache. - if not canceled, display UI saying that a check is in progress. - The given manifest must be fetched. - If this results in a 4xx or 5xx error, or a DNS error, or returns something of the a MIME type other than text/cache-manifest, or the manifest fails to get parsed, then: - if the cache was just created, then unassociate the browsing context with the cache, and just ignore the offline stuff. - if the cache already existed, then set the offline state of every top-level browsing context that is using this cache to 'idle' and fire an 'updateerror' event to them all. - if not canceled, mention error to user somehow. - abort these steps - If the download was successful but the manifest hasn't changed since the last time it was fetched for this cache, then: - set the offline state of every top-level browsing context that is using this cache to 'idle'. - fire a noupdate event at every top-level browsing context using this cache. - remove any ui indicating that an update is in progress. - abort these steps - Download the data from the manifest. - set the offline state of every top-level browsing context that is using this cache to 'updating'. - fire an updating event at every top-level browsing context using this cache. - if not canceled, show ui that update is in progress, and keep updating this ui when progress events would have fired (below). - if the cache was not just created: create a new cache. - for each file listed in the manifest as needing to be cached, fetch the file into the new cache, using the old cache as a cache if a new cache wasn't just created, honouring expiration, ETag, last-modified, etc. (Send progress events during this.) - if a new cache wasn't just created, also do this for each file listed in the old cache that matches the opportunistic caching pattern in the new manifest. (Send progress events during this.) - if a new cache wasn't just created, also do this for each file listed in the old cache that was added to the cache using the addition API. (Send progress events during this.) - for each file listed in the manifest as being a fallback page, download the fallback page in the same way as above, and store the prefix URI in the cache. (Send progress events during this.) - for each online-whitelisted URI in the manifest, store the URI in the cache for future reference. - if anything results in a 4xx, 5xx, or DNS error, then: - if the cache was just created, then unassociate the browsing context with the cache, and just ignore the offline stuff. - if the cache already existed, then set the offline state of every top-level browsing context that is using this cache to 'idle' and fire an 'updateerror' event to them all. - if not canceled, maybe tell user somehow. - abort these steps - Update the instances of the app. - set the offline state of every top-level browsing context that is using this cache to 'update ready'. - fire an updateready event at every top-level browsing context using this cache. - if not canceled, tell user update is ready (the page is expected to either call location.reload(), or the API to swap the new cache in, but both could be delayed, e.g. to wait for the user to finish the current task). The above algorithm mentions that progress events have to be sent -- this means that an 'updateprogress' progress event must be sent at every top-level browsing context using the cache while the download is progressing. Any file can be listed in the manifest, but the fallback patterns must match same-origin restrictions. API: Provide methods and/or properties for the following: * Add a resource to the cache. The resource persists (it's a permanent addition to the manifest.) * Remove a resource from the cache that was added using the aforementioned API. * Invoke the update() algorithm. * Check the current status of the updating: 'idle', 'checking', 'updating', 'update ready'. * Swap in the newest cache without a reload. (When I write up the feature in the HTML5 spec I will reply in more detail to the actual feedback received on the WHATWG list and posted on the HTMLWG wiki, so if I haven't covered something here that was sent to one of those locations, I will still address it later, fear not.) -- Ian Hickson U+1047E )\._.,--....,'``. fL http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Received on Friday, 21 September 2007 03:36:07 UTC