- From: Ian Hickson <ian@hixie.ch>
- Date: Fri, 7 Sep 2007 00:46:33 +0000 (UTC)
On Fri, 24 Aug 2007, Maciej Stachowiak wrote:
>
> I think it's easy to extend Ian's idea in a way that keeps it really
> simple for the simple case, but that works better for the multi-page
> case or other complex cases where pages load some resources dynamically.
>
> <html application="manifest-file">
>
> The manifest file would indicate all resources used by the web app,
> including other pages, and other resources that may be loaded by the
> current page but normally would not be at startup (another problem with
> Ian's proposal IMO). Multiple pages that refer to the same manifest are
> considered part of the same web app and share the same cache. If you
> give an empty value for the application attribute, then the implicit
> thing that Ian describes happens - the resources that the page actually
> loads are the ones cached.

Ok, new proposal:

There's a concept of an application cache. An application cache is a
group of resources, the group being identified by a URI (which typically
happens to resolve to a manifest). Resources in a cache are either
top-level or not; top-level resources are those that are HTML or XML and
when parsed with scripting disabled have <html application="..."> with
the value of the attribute pointing to the same URI as identifies the
cache.

When you visit a page you first check to see if you have that page in a
cache as a known top-level page.

If you do, skip the next two paragraphs; the 'new cache' flag is set to
false.

If you don't, you fetch the URL. If it has no application="" attribute,
then do whatever the normal thing to do is. Ignore the rest of this. The
presence of the attribute indicates that it's expecting an application
cache to apply. The attribute is detected at parse time, and must be
present on the first <html> start tag, before any other start tags.
Check that the attribute's value is same-origin safe. If it isn't,
pretend the attribute wasn't there (and ignore the rest of this).
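
As a rough illustration of the attribute check just described, here is a
minimal Python sketch. The function names are hypothetical, and it
assumes "same-origin safe" means the resolved attribute value has the
same scheme, host and port as the page; the proposal itself does not
define the term any more precisely than that.

```python
from urllib.parse import urljoin, urlsplit

# Assumed default ports, so "http://example.com" and
# "http://example.com:80" compare as the same origin.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return a (scheme, host, port) tuple for comparison purposes."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, (parts.hostname or "").lower(), port)


def cache_identifier(page_url, application_attr):
    """Return the absolute cache identifier URI for a page.

    Returns None when the attribute is absent or fails the (assumed)
    same-origin check, i.e. when the page gets a normal load.
    """
    if application_attr is None:
        return None  # no application="" attribute at all
    # Resolve the attribute value against the page's own URL.
    identifier = urljoin(page_url, application_attr)
    if origin(identifier) != origin(page_url):
        return None  # pretend the attribute wasn't there
    return identifier
```

With this sketch, an empty attribute value resolves to the page's own
URL, which corresponds to the case where the cache identifier is the
same as the URI that was just downloaded.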

Otherwise (that is, if the value is same-origin safe), check to see if
you already have a cache for the given URI. If you don't, create a new
cache identified by the given URI. In any case, save this resource to
the identified cache, as a known top-level page for that cache. Then,
act as if you had known about the cache when you started (next step),
except with the 'new cache' flag set to true.

Load the page from the cache and display it. Any resources that the page
tries to fetch using GETs that aren't XMLHttpRequest'ed must be taken
from the cache, if available. When they aren't, the resources must be
fetched then stored in the cache.

Once the UA is ready to do so the UA must go on to the next steps. UAs
may do this immediately, or may wait for the original page load to
complete, or may delay it up to a UA-defined minimum delay.

If 'new cache' is true, and the cache identifier URI is the same as the
URI that was just downloaded and put in the cache: Do nothing.

If 'new cache' is true, and the cache identifier URI is different from
the URI that was just downloaded: Fetch the resource identified by that
URI. Store it in the cache. If it's a manifest and it parses correctly,
download all the URIs given in that manifest and add them to the cache.
If any are HTML files which, when parsed with scripting disabled,
trigger the application="" handling and have a value that points to the
same URI as the one identifying this application cache, then mark them
as known top-levels for this cache.

If 'new cache' is false: Create a new cache. Fetch the resource with the
URI of the cache identifier. If it's a manifest, and it has changed from
what's in the last cache, and it parses correctly, download all the URIs
in that manifest and add them to the new cache. If the manifest has an
upgrader entry, use that as the upgrader as described below. Otherwise,
if it's not a manifest but an HTML/XML file, and it has changed from
what's in the last cache, use that as the upgrader as described below.
If it's a manifest that misparsed, or if it's another kind of file, then
act as if the URI just pointed to the top-level page being loaded (and
use that as the upgrader as described below).

If the newly updated cache doesn't contain the current top-level page,
then fetch that too.

When a file is fetched by the main page loading in a background browsing
context, the loads are conditional loads, so that files that haven't
changed since the previous update are directly copied from the old
cache.

If the newly updated cache's copy of the top-level page being shown is
no longer categorised as a "known top-level" for this cache (e.g.
because it doesn't have an <html application> attribute any more) then
inform the user, e.g. an infobar saying something like "This application
may no longer be available. (( View new page in a new window ))
(( Delete application from cache )) (( Keep application in cache and
check for updates later )) [x]". The first of these buttons would just
show the background browsing context in the foreground. The second would
delete the webapp cache and reload the page from the normal cache, and
the third would just not do anything special. Don't run the upgrader in
this case.

If any of the files being updated in the new cache are 4xx or 5xx, or
fail for some other reason (e.g. DNS errors, user went offline), then
the UA should alert the user to this fact somehow (infobar maybe) -- "An
error occurred while updating the application. (( View details )) [x]"
-- and then wait a few minutes (or longer if it can tell it'll fail
again) before trying again.

Upgrader: Create a hidden browsing context. Load the upgrader in it.
Just before onload, fire an 'upgrading' event to every instance of a
top-level page using a cache with the same identifier. The event has a
handle to the Window object of the hidden browsing context. After every
'upgrading' event has been fired, the 'load' event must be fired on the
upgrader.
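
The event ordering for the upgrader can be sketched as follows. This is
only an illustrative Python model, not a proposed API: the Page class,
the run_upgrader function and the event log are all hypothetical names,
and a real UA would dispatch DOM events rather than append tuples to a
list.

```python
class Page:
    """Hypothetical stand-in for an open top-level page instance."""

    def __init__(self, name, cache_id):
        self.name = name
        self.cache_id = cache_id  # URI identifying the page's app cache


def run_upgrader(instances, cache_id):
    """Return the order in which events fire for one upgrade.

    Every open top-level page using a cache with the given identifier
    gets an 'upgrading' event first; only after all of those have been
    fired does the upgrader itself get its 'load' event.
    """
    log = []
    for page in instances:
        if page.cache_id == cache_id:
            # In the proposal, this event carries a handle to the
            # Window of the hidden browsing context.
            log.append(("upgrading", page.name))
    log.append(("load", "upgrader"))
    return log
```

For example, with three open pages of which two share the cache being
upgraded, the two sharing pages each see 'upgrading' before the upgrader
sees 'load', and the unrelated page sees nothing.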

After the upgrader's 'load' event has fired, if any of the
aforementioned instances are still using old versions of the cache, then
the user agent may inform the user that they can reload to update.

The Upgrader can do such things as updating the database schema between
versions, and when there are multiple instances running, it allows them
to negotiate who will do that work instead of it happening several
times.

Modal alerts (window.alert, .prompt, etc) in the background page can
either raise an exception, be ignored, drop a message to a console, or
possibly display a message over the top of the foreground app's browsing
context.

The manifest format has:

 * a list of URIs.

 * optionally a place to have an opaque string which can be changed
   arbitrarily (this gives authors a way to change the manifest when
   they want things to be refetched).

 * optionally a URI for an upgrader (HTML file).

We provide an API that can add files to the cache, and that can be
queried to determine if we are in upgrader mode or not, and that can
swap in a new cache without reloading the page, during the 'upgrading'
event.

(If a particular URI is in an application cache as a known top-level,
but later is fetched and found to be a known top-level for another
application, e.g. because two other pages both fetch that page in their
manifest and the server returns pages with different application=""
links for those two apps, then if the page is visited directly, it uses
the app cache of the last cache to have found it as a top-level. This
causes problems if visiting the page directly would return yet another
cache identifier, as then you could only see that page if you'd never
seen the others. I'm not clear about what to do about that.)

Maybe we should check for updates more often than just when the
top-level page is loaded, e.g. we could do it on a timer, or on every
cache hit when online.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Thursday, 6 September 2007 17:46:33 UTC