- From: Aryeh Gregor <Simetrical+w3c@gmail.com>
- Date: Sun, 18 Oct 2009 11:53:53 -0400
- To: Mike Kelly <mike@mykanjo.co.uk>
- Cc: Smylers <Smylers@stripey.com>, public-html@w3.org
On Fri, Oct 16, 2009 at 10:11 PM, Mike Kelly <mike@mykanjo.co.uk> wrote:
> The benefits are realized in terms of automated cache invalidation.
>
> Modifying a resource should automatically invalidate all of its
> representations.
> (http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.10)
>
> In a server side reverse proxy cache scenario (a common use case for large
> scale web applications); being able to rely on this automatic mechanism as a
> sole method of cache invalidation ensures that the cache is refreshed as
> infrequently and simply as possible, and that destination server usage is
> kept to a minimum. This kind of efficiency gain can dramatically reduce
> operating costs; particularly true in new 'pay-as-you-process' elastic
> computing infrastructures.

This is an interesting point.  For instance, Wikipedia relies on a
complicated mechanism where the software computes all the needed cache
invalidations on the server side and sends them to its reverse proxies,
and the same goes for any complex application that supports HTTP
caching.  A simpler, standard way of doing that would certainly be
valuable.  In fact, the large majority of web applications don't
support HTTP caching for most of their content, partly because of the
difficulty of purging caches correctly (although also because it
imposes serious limitations on locality of data).

However, in addition to the usability problems that have been pointed
out (bookmarking/copy-paste failure), I don't think your proposal is a
flexible enough solution to be very useful in practice for cache
invalidation.  You suggest the case of an HTML version of a page plus a
feed.  But in practice, blogs and so on will often have many pages that
need to be invalidated.  If the front page of a blog changes, then both
the HTML and RSS/Atom versions will have to be purged from cache, it's
true -- but so will a variety of other resources.  If the blog has a
"latest posts" menu on every page, for instance, every page will have
to be purged from cache.

I don't know much about blogs, so to be more concrete, I'll talk about
wikis and forums.  MediaWiki often needs to purge a lot of pages
whenever one page changes -- one page can include another as a
template, or link to it (and links can be given CSS classes based on
properties of the linked-to page), and so on.  This logic is
complicated and certainly couldn't reasonably be moved out of the
application.

As for forums, the usual type of change is a new post.  When a new post
is made, a number of pages typically must be purged: the page for the
thread itself, but also the page for the subforum (so that the thread
is bumped to the top), and possibly the page for every forum containing
the subforum, to display a correct "last thread" entry next to the link
to the subforum the thread is in.  (I'll sketch this below.)  Again,
this is not a case that would benefit much from your suggestion.

Do you have an example of a specific application that currently uses
server-side cache purging but could rely on your automated mechanism
instead?  It seems to be of very narrow utility.
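To make that concrete, here is a rough sketch in Python of the kind of
application-side purge logic I have in mind for the forum case.  The
URL scheme, the post/forum objects, and the proxy address are all
invented for illustration; PURGE is the nonstandard method that caches
like Squid accept for explicit invalidation when configured to allow
it.

    import http.client

    PROXY = "cache.example.org"  # hypothetical reverse proxy host

    def urls_to_purge(post):
        # The thread page itself, so the new post shows up.
        urls = ["/thread/%d" % post.thread_id]
        # The subforum page, so the thread is bumped to the top, and
        # every ancestor forum's page, so its "last thread" entry
        # stays correct.
        forum = post.forum
        while forum is not None:
            urls.append("/forum/%d" % forum.id)
            forum = forum.parent
        return urls

    def purge(urls):
        conn = http.client.HTTPConnection(PROXY)
        for url in urls:
            conn.request("PURGE", url)
            conn.getresponse().read()  # drain before reusing the connection
        conn.close()

Everything in urls_to_purge() is application knowledge -- the forum
hierarchy and the URL scheme -- which is why I don't see how this logic
could move out of the application.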
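For contrast, the automatic invalidation your proposal builds on (RFC
2616 section 13.10) can be written down in a few lines.  This is a
sketch too -- the request/response objects and the dict-as-cache are
stand-ins -- but it shows how little the automatic mechanism can reach:

    UNSAFE_METHODS = {"POST", "PUT", "DELETE"}

    def invalidate(cache, request, response):
        # RFC 2616 13.10: an unsafe method invalidates the cached
        # entry for the Request-URI, plus any URI named by the
        # response's Location or Content-Location headers.  Nothing
        # else.
        if request.method not in UNSAFE_METHODS:
            return
        uris = {request.uri}
        for name in ("Location", "Content-Location"):
            if name in response.headers:
                uris.add(response.headers[name])
        for uri in uris:
            cache.pop(uri, None)  # cache maps URI -> stored response

A POST to /thread/42/reply can invalidate /thread/42/reply, and perhaps
/thread/42 if the response points there, but it can never reach
/forum/7 or its ancestors.  The pages that matter in the forum example
simply aren't addressable by this mechanism.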
Received on Sunday, 18 October 2009 15:54:28 UTC