[whatwg] AppCache-related e-mails from Ian Hickson on 2011-06-08 (public-whatwg-archive@w3.org from June 2011)

From: Ian Hickson <ian@hixie.ch>
Date: Wed, 8 Jun 2011 19:21:36 +0000 (UTC)
Message-ID: <Pine.LNX.4.64.1106062352550.19153@ps20323.dreamhostps.com>
On Mon, 31 Jan 2011, Michael Nordman wrote:
> On Mon, Jan 31, 2011 at 4:20 PM, Ian Hickson <ian at hixie.ch> wrote:
> > On Thu, 13 Jan 2011, Michael Nordman wrote:
> >>
> >> AppCache feature request: An https manifest should be able to list 
> >> resources from other https origins.
> >>
> >> I've got some app developers asking for this feature. Currently, it's 
> >> explicitly disallowed by the spec for valid security reasons, but 
> >> there are also valid reasons to have this capability, like a webapp 
> >> that uses resources hosted on gstatic.
> >>
> >> Seems like a robots.txt like scheme where a site like gstatic can 
> >> declare that it's "OK to appcache me from elsewhere" is needed.
> >>
> >> I've opened a chromium bug for this here... 
> >> http://code.google.com/p/chromium/issues/detail?id=69594
> >
> > Why do the valid security reasons not apply in this case?
> 
> The vendors of originA and originB have expressed that it's OK for one to 
> appcache resources of the other. In practical terms this is to support a 
> single application being hosted on multiple 'origins'. Google 
> gstatic.com for one example... 
> http://superuser.com/questions/64716/what-is-gstatic-com
> 
> If I understand the reason for the restrictions on HTTPS as the 
> following...
> 
> "The requirement is intended to prevent hostile.example.com from forcing 
> content from checkout.google.com to be stored onto the user's machine, 
> so that a later offline attack involving grabbing the user's laptop 
> cannot retrieve the information."
> 
> That doesn't apply in this case because gstatic.com is not hostile to 
> gmail.com.

> [...suggestion to use CORS...]

On Mon, 31 Jan 2011, Jonas Sicking wrote:
> On Mon, Jan 31, 2011 at 2:57 PM, Michael Nordman <michaeln at google.com> 
> wrote:
> > I don't fully understand your emphasis on the implied semantics of a 
> > CORS request. You say it *only* means a site can read the response. I 
> > don't see that in the draft spec. Cross-origin XHR may have been the 
> > big motivation behind CORS, but the mechanisms described in the spec 
> > appear agnostic with regard to use cases and the abstract section 
> > seems to invite additional use cases.
> 
> The spec does say what the Access-Control-Allow-Origin header means. 
> You're trying to modify that meaning.
> 
> Consider things from a web authors point of view. The author develops a 
> website, bunnies.com, which contains a HTML page which performs 
> same-site, and thus trusted, XHR requests. The HTML page additionally 
> exposes an API based on postMessage to allow parent frames to 
> communicate with it.

As specced, this isn't possible. Nothing from an appcache is ever run with 
the origin privileges of an origin other than the cache manifest's origin.


> Since the site exposes various useful HTTP APIs it further adds
> 
>   Access-Control-Allow-Origin: <origin>
>   Access-Control-Allow-Credentials: true
> 
> to a set of the URLs on the site. Including the url of the static HTML 
> page. This is per CORS safe since the HTML page is static there is no 
> information leakage that doesn't happen through a normal 
> server-to-server request anyway.
> 
> However, with the modification you are proposing, an attacker site could 
> forever pin this page in the user's app-cache. This means that if there is a 
> security bug in the page, the attacker site could exploit that security 
> problem forever since any javascript in the page will continue to run in 
> the security context of bunnies.com. So all of a sudden the CORS headers 
> that the site added has now had a severe security impact.
> 
> That's why I'm harping on the semantics.
> 
> Another issue is that if a site *is* willing to allow resources to be 
> pinned in the app-cache of another site, it might still not be willing 
> to share the contents of those resources with everyone. If we reuse the 
> existing CORS headers to express "is allowed to be app-cache pinned", 
> then we can't satisfy that use case.
> 
> For example a website could create a HTML page which embeds a 
> user-specific key and exposes a postMessage based API for third party 
> sites to encrypt/decrypt content using that users key. To allow this to 
> happen for off-line apps it wants to allow the HTML page to be pinned in 
> a third party app-cache. But it doesn't want to expose the actual key to 
> the third party sites. If CORS was used to allow cache-pinning, this 
> wouldn't be possible.

Well this problem doesn't exist for HTML pages, since they wouldn't ever 
run from the appcache, so the above wouldn't work anyway. But your concern 
is valid for, e.g., an image: if we use CORS to allow pinning HTTPS 
resources, there'd be no way to allow an HTTPS resource to be pinned 
without granting read access to that resource as well.


On Tue, 8 Feb 2011, Michael Nordman wrote:
> 
> Just had an offline discussion about this and I think the answer can be 
> much simpler than what's been proposed so far.  All we have to do for 
> cross-origin HTTPS resources is respect the cache-control no-store 
> header.
> 
> Let me explain the rationale... first let's back up to the motivation 
> for the restrictions on HTTPS. They're there to defeat attacks that 
> involve physical access to the client system, so the attacker cannot 
> look at the cross-origin HTTPS data stored in the appcache on disk. But 
> the regular disk cache stores HTTPS data provided the cache-control 
> header doesn't say no-store, so excluding this data from appcaching does 
> nothing to defeat that attack.
> 
> Maybe the spec changes to make are...
>
> 1) Examine the cache-control header for all cross-origin resources (not 
> just HTTPS), and only allow them if they don't contain the "no-store" 
> directive.
>
> 2) Remove the special-case restriction that is currently in place only 
> for HTTPS cross-origin resources.

On Wed, 30 Mar 2011, Michael Nordman wrote:
>
> Fyi: This change has been made in chrome.
> * respect "no-store" headers for cross-origin resources (only for HTTPS)
> * allow HTTPS cross-origin resources to be listed in manifest hosted on
> HTTPS

This seems reasonable. Done.
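
A minimal sketch of the rule described above for Chrome: a cross-origin HTTPS resource is refused when its Cache-Control header carries "no-store". The function names (origin, has_no_store, may_appcache) and the example hosts are hypothetical, chosen for illustration only; this is neither spec text nor Chrome's code.

```python
from urllib.parse import urlsplit


def origin(url: str) -> tuple:
    """Return the (scheme, host, port) triple used to compare origins."""
    parts = urlsplit(url)
    default_port = {"http": 80, "https": 443}.get(parts.scheme)
    return (parts.scheme, parts.hostname, parts.port or default_port)


def has_no_store(cache_control: str) -> bool:
    """True if the Cache-Control header value lists the no-store directive."""
    directives = [d.strip().lower() for d in cache_control.split(",")]
    return "no-store" in directives


def may_appcache(manifest_url: str, resource_url: str, cache_control: str = "") -> bool:
    """Hypothetical check: refuse cross-origin HTTPS resources marked no-store."""
    cross_origin = origin(manifest_url) != origin(resource_url)
    is_https = urlsplit(resource_url).scheme == "https"
    if cross_origin and is_https and has_no_store(cache_control):
        return False
    return True
```

In this sketch a same-origin resource is always accepted, because the no-store check is applied only to cross-origin HTTPS entries, matching the scope Michael describes.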


On Tue, 1 Feb 2011, Patrick Mueller wrote:
> 
> I just tested Chrome beta this morning and saw nothing interesting in 
> appcache error events, however progress events have now grown "loaded" 
> and "total" properties (think those were the names, and I think they're 
> new-ish).  That's nice, as I can provide a progress meter during cache 
> load/reload.  I wouldn't mind having the URL of the resource being 
> loaded (that was just loaded?) as well as those numbers.  And for errors 
> it would be nice to know, in the case of an error caused by a cache 
> manifest entry 404'ing (or otherwise unavailable), what URL it was. HTTP 
> error code, if appropriate, etc.

In theory, we don't want to expose this information because it can be used 
to introspect intranets.

In general, the browser should definitely be able to help the user out in 
such a situation. Maybe pop up a notification "there was a problem 
downloading the app, click here for more information" or some such.

I suppose the intranet problem isn't that bad since you can do it with 
<img onerror> already... Maybe we can expose something on the 'error' 
event for appcache after all?

What kind of information would be most useful? Should it be in the same 
format from every browser or should it be detailed and freeform?


On Wed, 2 Feb 2011, Michael Nordman wrote:
> > On Mon, 20 Dec 2010, Michael Nordman wrote:
> >>
> >> What if we had something along the lines of <html 
> >> useManifest="manifestFile">, which would do the association of the 
> >> doc with the appcache (so subresource loads hit the cache) but not 
> >> add the document to the cache?
> >
> > Why can't the pages just switch to a more AJAX-like model rather than 
> > having the main page still load over the network? The main page 
> > loading over the network is a big part of the page being slow.
> 
> The premise of the feature request is that the "main" pages aren't 
> cached at all.
> 
> | I tried to use the HTML5 Application Cache to improve the performances
> | of on-line sites (all the tutorials on the web write only about usage
> | with off-line apps)
> 
> As for "why can't the pages just switch", I can't speak for andrea, but 
> i can guess that a redesign of that nature was out of scope and/or would 
> conflict with other requirements around how the url address space of the 
> app is defined.

The whole point of appcache is to cache the main page, so that loading the 
page happens without hitting the network, thus enabling offline apps (and 
faster page loads).

If you're not loading the main page from the cache, what does this gain 
you that regular HTTP caching doesn't?

I don't understand the use case here.


On Fri, 11 Feb 2011, Jeremy Orlow wrote:
> On Fri, Feb 11, 2011 at 1:14 PM, Michael Nordman <michaeln at google.com> 
> wrote (on a chromium list, presumably):
> > UVL <andrea.doimo at gmail.com> wrote:
> > >
> > > I tried to use the HTML5 Application Cache to improve the 
> > > performances of on-line sites (all the tutorials on the web write 
> > > only about usage with off-line apps)
> > >
> > > I created the manifest listing all the js, css and images, and the 
> > > performances were really exciting, until I found that even the page 
> > > HTML was cached, despite it was not listed in the manifest. The 
> > > pages of the site are in PHP, so I don't want them to be cached.

You can already get your JS, CSS, and images cached. That's been possible 
for some 15+ years, using HTTP caching. No need for a manifest.


> > > From http://www.whatwg.org/specs/web-apps/current-work/multipage/offline.html :
> > > "Authors are encouraged to include the main page in the manifest 
> > > also, but in practice the page that referenced the manifest is 
> > > automatically cached even if it isn't explicitly mentioned."
> > >
> > > Is there a way to have this automatic caching disabled?

The feature would break if it was disabled. Consider this scenario:

   User visits page in January 2012. The manifest is downloaded, all the 
   images, CSS, and JS are cached.

   User visits page in September 2013. The page has completely changed, 
   relies on entirely different JS, CSS, and images, but the browser uses 
   those from the cache. The page ends up being a complete jumble of old 
   images, new images, broken images, scripts doing unexpected things, old 
   CSS, new CSS...


> > > Note: I know that caching can be controlled via HTTP headers, but I 
> > > just wanted to try this way as it looks quite reliable, clean and 
> > > powerful.

HTTP caching is also "reliable, clean, and powerful". It also has the 
added advantage of already working. Plus it works with shared network 
caches, so people can get the benefit even if they haven't been to your 
site before, in many cases.


> > On Mon, Dec 20, 2010 at 12:56 PM, Michael Nordman <michaeln at chromium.org> wrote:
> > > This type of request (see forwarded message [above]) to utilize the 
> > > application cache for subresource loads into documents that are not 
> > > stored in the cache has come up several times now. The current 
> > > feature set is very focused on the "offline" use case.

Well yeah. It's called "offline application cache". :-)


> > > Is it worth making additions such that a document that loads from a 
> > > server can utilize the resources in an appcache? Today we have <html 
> > > manifest="manifestFile">, which adds the document containing this 
> > > tag to the appcache and associates that doc with that appcache such 
> > > that subresource loads hit the appcache. Not a complete proposal, 
> > > but... What if we had something along the lines of <html 
> > > useManifest="manifestFile">, which would do the association of the 
> > > doc with the appcache (so subresource loads hit the cache) but not 
> > > add the document to the cache?

How would this be better than just using the regular HTTP cache? 
Everything in the appcaches is also going to be in the regular HTTP cache, 
unless it is explicitly marked as not being cachable. If it's marked as 
not cachable, the solution is to mark it as cachable, not to work around 
HTTP by finding another caching mechanism.


> > Waking this feature request up again as it's been requested multiple 
> > times, I think the ability to utilize an appcache w/o having to have 
> > the page added to it is the #1 appcache feature request that I've 
> > heard.
> >
> > [...]
> >
> > * More recently this has been requested in the context of an 
> > application that uses pushState to alter the url of the main page.

If the page is using pushState() with different paths (not just changing 
the fragment identifier), but the pages are still cachable, then it'll 
work just fine if you just put the file in the FALLBACK section. (Well, 
it's slightly slower than normal, since it tries to hit the network 
first. But it'll still work.)
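
A rough model of the FALLBACK behaviour described here: a URL that is not itself cached but matches a fallback namespace is tried on the network first, and the fallback entry is served only if that fails. The names resolve and fetch_network and the sample paths are assumptions made for illustration; the spec's actual algorithm has more steps than this.

```python
from typing import Callable, Dict, Optional


def resolve(path: str,
            fallbacks: Dict[str, str],
            cache: Dict[str, str],
            fetch_network: Callable[[str], Optional[str]]) -> Optional[str]:
    """Return a body for path: cache, then network, then the longest matching fallback."""
    if path in cache:                   # cached entries never touch the network
        return cache[path]
    body = fetch_network(path)          # returns None when the network fetch fails
    if body is not None:
        return body
    # Fallback namespaces are prefix matches; prefer the most specific one.
    matches = [ns for ns in fallbacks if path.startswith(ns)]
    if not matches:
        return None
    namespace = max(matches, key=len)
    return cache.get(fallbacks[namespace])
```

With fallbacks = {"/": "/app.html"}, every pushState() path under "/" ends up at the cached /app.html when offline, at the cost of one failed network attempt.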

If the pages aren't cachable, then the use of pushState() isn't really 
relevant. It's just the same as having many uncachable main pages, at 
which point, what's the point of using appcache? Just use a regular cache. 
It's not going to work offline anyway.


On Mon, 14 Feb 2011, Felix Halim wrote:
>
> I have a use case where it is preferable that the main page is not 
> cached:
> 
> Suppose you have a main page that changes based on its ID:
> 
> http://example.com/page.php?id=10
> 
> The appCache will store each main page with different id in separate 
> cache, which is undesirable! And we DON'T want to cache the main pages, 
> since the content differs significantly (think of it as a forum 
> website).

The idea of the appcache feature is to enable offline usage. If you don't 
want it cached, how is it going to work offline?


> The main goal here is NOT to make the page offline, but to cache the 
> resources that the page uses (i.e, .js, .css, images, etc...) that are 
> very likely to be IMMUTABLE (particularly the jQuery.js and jQueryUI 
> css+images that almost every site uses!).

Appcache only adds one feature: The ability to work offline.

Everything else that appcache does is already possible with regular HTTP 
caching.

So if you don't want to work offline, just use regular HTTP caching.


On Fri, 18 Feb 2011, Biju wrote:
>
> IE 9 is introducing meta-tags for custom jumplist actions (+ Pinned 
> Sites). Mozilla is also planning to do same 
> https://bugzilla.mozilla.org/show_bug.cgi?id=605222 Some website 
> (http://www.pcworld.com/ ) already started using it !!!
> 
> At present IE9 makes jumplists using msapplication-*. If we don't make any 
> standard now it will be ugly to see pages with lots of meta tags named 
> appleapplication-*, ubuntuapplication-*, redhatapplication-*, 
> fedoraapplication-* and also tags for Web App support in Fennec 
> https://bugzilla.mozilla.org/show_bug.cgi?id=583750

Whether we make a standard or not, why would anyone use anything other 
than "msapplication-*" for this feature?


> Also I did not like current implementation with a meta tag for each 
> item. http://msdn.microsoft.com/en-us/library/gg131029 I wish whole list 
> was kept in an external file (like the proposed <link 
> rel="application-description" href="myapp.json">) to avoid this same 
> meta tag clutter on every page.
> 
> Did anybody discuss with the IE team the already proposed <link 
> rel="application-description" href="myapp.json"> ? Looks like IE team 
> only just used Ian's comment for
>
> >> - a name
> >  <meta name=application-name content="Flickr">

I haven't seen anything from Microsoft regarding this, for what it's 
worth.

Anyway. I don't really understand the use case here. What problem are we 
trying to solve here? Do sites actually use this feature? Do users use the 
feature in the sites that offer it?


On Fri, 18 Feb 2011, Charles McCathieNevile wrote:
> 
> Perhaps it makes sense to think about using a single metadata file (as 
> widgets do), and allow Web apps based on a server to associate a 
> config.xml (or JSON equivalent? Why not reinvent the syntax a few 
> times...). You may also want a file manifest like appCache, but if you 
> want the thing to run offline you could just require that it use 
> appCache etc anyway.
> 
> Widgets don't do that because they have a packaging concept where 
> everything is already in the package, which is to allow for simple 
> offline installation and signature, whereas the webapp model requires 
> continually trusting the server. On the other hand, it seems that there 
> are plenty of people happy to trust facebook, google, farmville and so 
> on anyway, so we wouldn't be opening them to new security problems.

Without knowing exactly what problem we're trying to solve here, it's hard 
to evaluate these proposals.


On Wed, 2 Mar 2011, Edward Gerhold wrote:
> 
> I've found out, that i can not Cache my Joomla! Content Management 
> System. Of course i've read and heard about, that the application cache 
> is for static pages. But with a little change to the spec and the 
> implementations, it would be possible to cache more than static pages.
> 
> I would like to cache my Joomla! system. To put the scripts, css and 
> images into the cache. I would like to add the appcache manifest to the 
> index.php file of the Joomla Template. What happens is, that the 
> index.php is cached once and not updated again. I can not view new 
> articles. The problem is, that i can neither update the Master File, nor 
> whitelist it.

What problem are you trying to solve here?


> And this is, what my request or suggestion is about. I would like to 
> whitelist the Master file, where the appcache manifest is installed in. 

If the main index page isn't cached, then really there's no benefit to 
appcache that you can't already get just by using regular HTTP caching, 
as far as I can tell.


> Or i would like to update this file, or any file else, i would like to 
> update, on demand.

Not sure what this means.


> For the script i would like to add *applicationCache.updateMaster()*, 
> which forces the browser to fetch the file again.

You can do this with applicationCache.update() -- it updates everything, 
including the current file.

You always want to update everything at once, since otherwise you'll have 
mismatched versions.


> I think, this is impossible today, to update exactly this file. For the 
> function, i could add a button to my page, to let the user choose to 
> update the file. The second function would be 
> *applicationCache.updateFile(url)*, which could be triggered by a button 
> and script, too. I could let the user update certain articles.

If you just want to update data, then I would recommend having the data 
stored in a database locally (IndexedDB, localStorage, FileSystem, or some 
such) and then update that over the network, rather than in appcache.


> With that i would like to suggest* applicationCache.addToCache(url)* to 
> add files manually or programmatic, which can not be determined by the 
> manifest. Urls like new articles (*), i would like to read offline. I 
> would like to add them to the cache, if the link appears, maybe on the 
> frontpage. I would have to add the manifest to the CMS anyways, so i 
> could add a few more functions to the page, of course. * 
> applicationCache.removeFromCache(url)* should be obvious and helpful 
> with the other functions. Good would be, to be able to iterate through 
> the list of cached objects and even the manifest, with the update, add, 
> remove functions, it would be very useful to work with the filenames and 
> parameters.
> 
> [(*) I could let the user decide whether he wants to download my mp3 
> files to the appcache or not, and fulfill the wish with the javascript 
> functions. Maybe he's got no bytes left or wants only the lyrics.]

We'll probably add this feature eventually. It was in an early draft IIRC 
but we got rid of it, IIRC to simplify the API for the first version.


> The application cache is very powerful. But it is very disappointing, 
> that it is only useful for static pages. With a little improvement to 
> the Offline Web applications chapter, and of course to the browsers, it 
> would be possible to cache any Content Manager or dynamic page. And that 
> would let the appcache become one of the most powerful things in the 
> world.

HTTP caches already do most of this.


> I could read my Joomla! offline, could update the cached files, if i 
> want to, on a click or if the cache expires. I could let the half of the 
> CMS load from the cache. But for that, the index.php, where the manifest 
> is, has to be updateable. Correct me, if i am wrong. But this is not 
> possible today, the master file can not be influenced. And there is no 
> expiration or a possibility to update or manipulate the cache and even 
> no way to find out which files are cached, what would let me/us have 
> control over the Offline Web application.

I'm not sure I really follow here.

I don't really understand how offline access would work if we're not 
caching the main file...


> Oh, i forgot one thing: Wildcards in the manifest. And I think, 
> directories belong into the CACHE section, i got an error on any 
> directory there, i had to state the whole filename. You should 
> abbreviate that. But that is not so important against that what i wrote 
> down in this message above. Anyways, this completes my wishlist.

I don't see how wildcards would work. How would the browser know what to 
fetch?


On Thu, 3 Mar 2011, Michael Nordman wrote:
>
> 2) The ability to add(), remove(), and enumerate() individual urls in 
> the appcache. Long ago, there were interfaces on the appcache drawing 
> board to allow that. They got removed for a variety of reasons including 
> "to start simpler". A couple of years later, it may make sense to 
> revisit these kind of features, although there is another repository 
> also capable of storing ad-hoc collection of resources now (FileSystem), 
> so i'm not sure this feature really needs to be in the appcache.

Yeah, it may be better just to use the FileSystem for that... It's hard to 
know without really seeing how people envisage using this.

For offline use, FileSystem isn't going to work; for that we'd have to 
support adding files to the appcache. (Note that you can do that today by 
just opening the files you want to add in an iframe. It's a hack, but it 
should generally work.)


> @Hixie... any idea when the appcache feature set will be up for a growth 
> spurt? I think there's an appetite for another round of features in the 
> offline app developers that i communicate with. There's been some recent 
> interest here in pursuing a means of programmatically producing a 
> response instead of just returning static content.

Who implements it currently? Is there a test suite? Those are the main 
things that would gate a dramatic addition of new features.


On Fri, 18 Mar 2011, Nikolas Coukouma wrote:
>
> Section 6.6.2 "Application caches" says
>   Zero or more URLs that form the online whitelist namespaces.
> 
>   These are used as prefix match patterns, and declare URLs that the
>   user agent will never load from the cache but will instead always
>   attempt to obtain from the network.
> 
> The above doesn't seem accurate since section 6.6.6 "Changes to the 
> networking model" was altered (in 2008) so that, in the case of an entry 
> matching the online whitelist, the entry is loaded from the cache.

The entries in the online whitelist aren't in the appcache at all, so they 
can't be fetched from it.

Or do you mean that "from the network" should mention that it doesn't mean 
to imply that local HTTP caches are ignored?


On Sun, 10 Apr 2011, Edward Gerhold wrote:
> 
> A programmer has to add some code to the CMS template or somewhere else, 
> that you can invoke the add function. The page would be added to the 
> cache. If i cache the start page and open an article and cache it too, 
> go offline, and read the cached frontpage again, and click on the link 
> of the other page i cached, the url should be read from the cache.
> 
> To give me the possibility to grow or to shrink the cache, i should be 
> able to remove pages, too. Modifying the page has an obvious sense.

Before evaluating the above or many other suggestions you have made, I 
must first understand what problem it is you are trying to solve. Could 
you elaborate on what experience it is you are trying to provide the user 
which you cannot currently provide?


On Fri, 1 Apr 2011, Edward Gerhold wrote:
> 
> The appCache is not ready for storing dynamic data. This could be done 
> by the user by simply pressing a "cache this" button or a link or some 
> other function in a script.

What do you mean by "dynamic data"?


On Wed, 13 Apr 2011, Edward Gerhold wrote:
> 
> The problem is:
> 
> Cached once, the files are NEVER being updated again, except the manifest
> changes a byte.

Right. The manifest is used as a proxy for the state of the server.
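
In other words, the update check hinges on the manifest bytes alone. A minimal sketch of that decision, using a hypothetical helper manifest_changed; the version comment in the sample manifests is an illustrative convention and is not something stated in this thread.

```python
def manifest_changed(cached: bytes, fetched: bytes) -> bool:
    """An update is triggered only when the manifest differs byte for byte."""
    return cached != fetched


# Sample manifests (hypothetical content): only the comment line differs.
old = b"CACHE MANIFEST\n# v1\n/app.js\n/app.css\n"
same = b"CACHE MANIFEST\n# v1\n/app.js\n/app.css\n"
bumped = b"CACHE MANIFEST\n# v2\n/app.js\n/app.css\n"
```

Changing a single byte in a comment is enough to signal that the server-side files have changed, even though the list of URLs is identical.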


> Second thing, the rename of the URLs, or better a local and a network 
> identifier is necessary. It would be good, if the user agent could care 
> for local identifiers, too. The programmer would sometimes have to give 
> the page a new name to prevent overwriting another file (with access to 
> the cache, maybe even dupe names but other array positions work?).

Not sure what this means.


On Sat, 16 Apr 2011, Edward Gerhold wrote:
> 
> I came to the conclusion, the cache is linked too tightly to the 
> manifest. The cache is good how it is as long as it isn't developed for 
> dynamic websites and caches. The manifest should only be the basis of 
> the cached site, and the programmatic addition and removal of the pages 
> should expand or shrink the array of files. This is how the story goes 
> so far. I am working on my draft or am not working on and try to imagine 
> and formulate the steps of each element i suggest.

To evaluate this conclusion it would be helpful to more clearly 
understand the use case.


> *Another thing which i would like to introduce together with the better 
> api is the "Expiration of the Cache".* Like i already knew, there have 
> some passages of the original to be changed. I don't think, that i want 
> to setup a v2.0 or a module for the cache, no, i want to change the 
> original to use all functions. I think an expiration of cached files 
> should be introduced together with the dynamic application cache.
> 
> The local identifier, i suggested, i think, i'll add it, but leave the 
> handling of it blank. Means, the programmer has the possibility to give 
> local identifiers, but has to handle it alone. If i come to that chapter 
> and find out, how the user agent should treat it, i'll write it down. 
> But so far, i'd like to leave the string empty, for the obvious cases, 
> that a website uses index.php for all files. The application programmer 
> adding the cache functions should be able to make use of the local ids 
> to retrieve the files. And here is the case, where the user agent should 
> call the right page in the cache, if the local identifier is referred. 
> Ok.
> 
> Well, this was about, you could overread it, the introduction of the 
> expiration date of cached files should be made together with the 
> programmatic access to the cache. It will be useful to let certain files 
> expired then. And, i can't cite the original, i would like to end with 
> my first words, i think, the cache is bound too tightly to the manifest 
> at the moment.

Not sure what you are proposing here, so it's hard to evaluate it.


On Sun, 17 Apr 2011, Jukka K. Korpela wrote:
> 
> I'd like to see a simple summary of the benefits of an application 
> cache, as compared with normal caching - not with some hypothetical 
> situation without any caching.

The benefit of the application cache is simple: it allows a user to visit 
a Web page while offline, if they have visited it while online and if that 
page declares a manifest.

Regular caching (i.e. without a manifest) doesn't work with this because 
it doesn't provide a way to declare the set of resources that are needed 
for a page to be usable offline.


> As a drawback, when _any_ change is made to any of the files, the 
> manifest needs to be modified and all user agents will have to download 
> the entire application (all files) when online next time.

They only need to check all the files -- only the changed ones get 
downloaded.
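
A sketch of the distinction between checking and downloading, assuming the server supports some form of validator (an ETag-like token) so that unchanged files can be confirmed without transferring their bodies. The Server type, the refresh function, and the validator strings are hypothetical stand-ins, not part of any API.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for a server: url -> (validator, body).
Server = Dict[str, Tuple[str, bytes]]


def refresh(urls: List[str],
            cached: Dict[str, Tuple[str, bytes]],
            server: Server) -> List[str]:
    """Re-check every URL; return only those whose body had to be downloaded."""
    downloaded = []
    for url in urls:
        validator, body = server[url]
        if url in cached and cached[url][0] == validator:
            continue                      # like a 304 Not Modified: keep cached copy
        cached[url] = (validator, body)   # like a 200: store the new body
        downloaded.append(url)
    return downloaded
```

Every URL is visited on each update, but only entries whose validator changed contribute to the download list.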


On Tue, 19 Apr 2011, Ilkka Huotari wrote:
> [Anne wrote:]
> >
> > If you use a fallback namespace it will always try to do a network 
> > fetch before using the fallback entry so why is there a need for a 
> > NETWORK entry in the cache manifest?
>
> Now, I haven't probably thought this enough, but could the FALLBACK and 
> NETWORK be combined into one NETWORK? Both are doing pretty much the 
> same thing after all?
> 
> Here's how it would work:
> 
> NETWORK:
> <item> [optional-fallback-item]
> 
> More specific entries would take precedence, i.e. /file.html would be
> more important than /file or / or * ... Example:
> 
> NETWORK:
> *
> / /offline
> /file.html /offline-for-file.html
> 
> This way
> - "*" would map like it does in the current spec
> - "/ /offline" would map like it does in the FALLBACK section/current
> spec and would take precedence over "*" because it's more specific
> than "*"
> - "/file.html /offline-for-file.html" would take precedence over all of these.
> 
> Benefits: Making things simpler, easier for the programmer to 
> understand. Faster to learn, less bugs, better code?

I'm not convinced that's much simpler. It's the same, except that you've 
replaced two sections with different simple syntax with one section with 
more complicated syntax.


On Wed, 20 Apr 2011, Ilkka Huotari wrote:
>
> I'm trying to figure out if there is any other difference between 
> NETWORK and FALLBACK sections than FALLBACK section having the fallback 
> resource. (I hope I'm not bothering people with my questions, but I also 
> hope that these questions could help somebody else.)
> 
> So... is there any other difference between them? I did some testing in 
> Chrome and Firefox and while the behavior was not actually identical 
> between those browsers, it seemed that there isn't any other crucial 
> difference between the sections.

It's a pretty big difference and it has several side-effects, but yes, 
that's the only difference.


On Mon, 23 May 2011, Nicholas Zakas wrote:
>
> The spec currently states this about the obsolete and error events on 
> window.applicationCache (5.6.1.1):
> 
>  * Obsolete - The manifest was found to have become a 404 or 410 page, 
> so the application cache is being deleted.
>
>  * Error - The manifest was a 404 or 410 page, so the attempt to cache 
> the application has been aborted.
> 
> Later on (5.6.4), the spec states about 404 or 410 manifest files:
> 
>  * For each cache host associated with an application cache in cache 
> group, create a task to fire a simple event named obsolete that is 
> cancelable at the ApplicationCache singleton of the cache host, and 
> append it to task list. The default action of these events must be, if 
> the user agent shows caching progress, the display of some sort of user 
> interface indicating to the user that the application is no longer 
> available for offline use.
>
>  * For each entry in cache group's list of pending master entries, 
> create a task to fire a simple event that is cancelable named error (not 
> obsolete!) at the ApplicationCache singleton of the cache host the 
> Document for this entry, if there still is one, and append it to task 
> list. The default action of this event must be, if the user agent shows 
> caching progress, the display of some sort of user interface indicating 
> to the user that the user agent failed to save the application for 
> offline use.
> 
> This seems to indicate that the obsolete event is always fired and the 
> error event may optionally fire afterward.

No, the sets of objects that the two events are fired at are mutually 
exclusive. (One is those that already have the cache associated, and the 
other is those that do not yet have the cache associated.)
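
A minimal sketch of that split, assuming the two groups are passed in as 
plain lists (the function and argument names are hypothetical, and this 
is not the spec's algorithm text):

```python
def manifest_gone_tasks(associated_hosts, pending_master_hosts):
    """Build the event task list for a manifest that returned 404 or 410.

    associated_hosts: cache hosts already associated with a cache in the group.
    pending_master_hosts: cache hosts of documents still waiting to be cached.
    """
    # Hosts that already have the cache are told it is going away.
    tasks = [("obsolete", host) for host in associated_hosts]
    # Hosts that never got the cache are told the caching attempt failed.
    tasks += [("error", host) for host in pending_master_hosts]
    return tasks
```

Since a cache host is in one group or the other, it sees either obsolete 
or error, never both.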


On Sat, 28 May 2011, Felix Halim wrote:
>
> AFAIK, currently there is no storage limit for the App Cache.

There is a limit on App Cache:

   http://www.whatwg.org/specs/web-apps/current-work/complete/offline.html#disk-space

Exactly how it works is up to the UA. Personally I think the same quota 
should be used for all client-side storage mechanisms per origin group, 
where origins are grouped so that all related subdomains, ports, etc, 
count as the same origin.
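
To make the grouping idea concrete, here is a hedged sketch of how a UA 
might total usage per origin group. The grouping rule (last two host 
labels) is a deliberately naive assumption; a real UA would need 
something like a public suffix list. All names here are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def origin_group(origin):
    """Naive group key: the last two labels of the host, ignoring scheme and port."""
    host = urlsplit(origin).hostname or ""
    return ".".join(host.split(".")[-2:])


def usage_by_group(records):
    """Sum bytes across all storage mechanisms for each origin group.

    records: iterable of (origin, mechanism, byte_count) tuples.
    """
    totals = defaultdict(int)
    for origin, _mechanism, byte_count in records:
        totals[origin_group(origin)] += byte_count
    return dict(totals)
```

A single quota would then be checked against each group's total rather 
than against each mechanism separately.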


On Sun, 29 May 2011, Bjartur Thorlacius wrote:
>
> User agents may store expired pages for offline use. Internet Explorer 
> and Firefox have 'Work offline' modes automatically enabled on complete 
> disconnection from the network. Currently, only cached pages and sites 
> explicitly selected by the user are available offline, but given enough 
> disk space, user agents might keep all files of MIME type "text" (e.g. 
> text/html and text/plain) - or even all files. The variation on 
> constraints between systems is such that even looking only at my desk 
> there's a system with over 1.7GiB of free read-write memory (0.5MiB 
> magnetic, 1.3GiB volatile RAM) and another one with under 300MiB 
> (volatile RAM). I don't want authors to be able to use up my memory by 
> storing most or all content for offline use, nor to unnecessarily lose 
> access to content when storage space is plentiful.

This is the kind of thing user agents should offer their users.

Note that the constraints are mainly aimed at preventing abuse. It's easy 
to imagine a hostile site sending the user (without the user's knowledge) 
on a journey across dozens of origins each storing terabytes of data in 
local storage, databases, cookies, appcache, etc.


On Mon, 30 May 2011, Felix Halim wrote:
> 
> Suppose I have a web page and want to store it in an App Cache. This web 
> page requires a few resources (.css, .js, images, etc.). But all of 
> them are "static" resources. I want to store "dynamic" resources as well 
> for that particular page only (not shared). Think of the dynamic 
> resources as "data" that changes from time to time for that particular 
> page only. localStorage can be used to store the "dynamic" resources, 
> but localStorage has very limited quota and it is shared to the entire 
> domain. Different unrelated pages in the same domain will use the shared 
> quota!
> 
> Currently I can "hack" the App Cache to simulate the pageStorage like 
> this:
> 
> We can turn one of the .js files "dynamic" by updating the .js file, 
> then edit the MANIFEST file a bit, so that the browser re-downloads *ALL* 
> the resources again.  This way, the .js file quota gets in the App Cache 
> quota which is currently *UNLIMITED*.

It's no less limited than any other storage mechanism, per spec.


On Tue, 31 May 2011, Felix Halim wrote:
> 
> The current App Cache design updates the cache to the latest version in 
> the background when the user visits the page for the second time and then 
> it needs to refresh the page to actually update the display. This is 
> annoying since the user will first see stale data, then a few seconds 
> later, it's updated with a giant refresh (including all the static 
> resources).

You shouldn't store data in the appcache, only logic; otherwise, yes, the 
user will always be one version behind.

Note that there is no giant refresh unless the page makes it so.


> This is because the App Cache is too COARSE grained. It doesn't know 
> what actually changes (which data are static, which data are dynamic). 

Right. It uses regular HTTP semantics to update the cache.


> That is another reason why we need pageStorage: to separate the dynamic 
> and the static resources.

Don't we already have enough ways to store data?


On Wed, 1 Jun 2011, Bjartur Thorlacius wrote:
>
> Caches should still be allowed to refetch resources just before they're 
> expected to be used. I might want my home computer to fetch the latest 
> news in the morning and evening, so I can start reading when I wake up 
> and when I get home from school.

The spec allows the UA to run the update algorithm at any time.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Wednesday, 8 June 2011 12:21:36 UTC