Re: I-D Action: draft-nottingham-linked-cache-inv-00.txt from Mark Nottingham on 2011-05-31 (ietf-http-wg@w3.org from April to June 2011)

From: Mark Nottingham <mnot@mnot.net>
Date: Tue, 31 May 2011 14:35:27 +1000
To: Brian Pane <brianp@brianp.net>
Cc: httpbis Group <ietf-http-wg@w3.org>, Balachander Krishnamurthy <bala@research.att.com>, cew@cs.wpi.edu
Message-Id: <3207DB80-6967-4A9D-A990-6241992B541D@mnot.net>

Hi Brian,

Yep, this sounds very much like Bala and Craig Wills' work on Piggyback Cache Invalidation (PbCI) more than a decade ago -- see <http://www2.research.att.com/~bala/papers/> and scroll down to "Web Caching."

Using the Link header makes sense here, and I think it'd be desirable to have a separate max-age (like that done in LCI) to allow people to specify longer freshness lifetimes when the cache understands the extension and is willing to invalidate based upon it.

The question in my mind is whether it makes sense to combine PbCI and LCI into a single mechanism, or to keep them separate. The downside to separating them is that it might be necessary to have *two* new max-ages, which wouldn't be great (especially given the need to work out how they'd interact).

I would observe that LCI is tilted towards deployment in "reverse" proxy caches, whereas PbCI seems a bit more suitable for forward proxies and browser caches.

The other question, of course, is whether any cache vendors (intermediary or browser) would implement it. For experimentation purposes, it might be possible to shoehorn it into Squid in a similar manner to how LCI was (i.e., with a helper process).

Bala/Craig, any thoughts? The draft under discussion is <http://tools.ietf.org/html/draft-nottingham-linked-cache-inv-00.html>.

Cheers,

On 28/05/2011, at 8:40 PM, Brian Pane wrote:

> What do you think about allowing the origin server (or an intermediate
> proxy, for that matter) to indicate a new Last-Modified timestamp for
> a related resource, rather than just indicating that any cached copy
> of the related resource should be considered invalid?
> 
> Here's the use case where I think a Last-Modified update would be useful:
> 
> Consider a website on which most HTML pages reference a stylesheet,
>    <link rel="stylesheet" href="/common.css">
> In common practice, common.css is served with a large max-age value to
> encourage downstream caching.  Eventually a web developer will modify
> the document at the origin server, and it will become important that
> any client that subsequently requests any HTML resource from the site
> must also fetch the new version of the stylesheet.  A common solution
> for this is to add versioning information to the URI -- e.g.,
> /common.css?v=2 -- but that's a cumbersome technique for many reasons.
> 
> The linked cache invalidation isn't quite a match for this scenario.
> The server could send an inv-by the next time each client requests any
> HTML page,
>    Link: </common.css>; rel="inv-by"
> but that should only be sent once per client, so as to avoid
> invalidating the cache on every subsequent HTML resource request.  And
> in the general case it's impossible to keep track of what clients have
> already received the inv-by.
> 
> But what about the following variant?
>    Link: </common.css>; rel="last-mod"; value="[some valid HTTP-date]"
> Upon seeing that header in the response, the recipient (whether the
> origin client or an intermediate cache) could decide whether to
> invalidate its cached copy of /common.css based on the supplied
> last-mod timestamp.  It would be safe for the server to send that
> header to the same client (or intermediate cache) arbitrarily many
> times.
> 
> -Brian
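
To make the proposed variant concrete, here is a minimal Python sketch of how a recipient might act on it. It is an illustration under stated assumptions, not part of the LCI draft: rel="last-mod" and its "value" parameter exist only in Brian's proposal above, the cache is modelled as a plain dict mapping a URL to the Last-Modified time of the stored copy, and the names parse_link and apply_last_mod are hypothetical. The parser handles a single link-value only, since an HTTP-date contains a comma and naive comma-splitting of a multi-valued Link header would break it.

```python
import re
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_link(value):
    """Parse ONE Link header value into (target, params).

    Simplified on purpose: quoted parameters only, single link-value.
    """
    m = re.match(r'\s*<([^>]*)>(.*)$', value)
    if not m:
        raise ValueError("not a Link value: %r" % value)
    params = dict(re.findall(r';\s*([\w-]+)\s*=\s*"([^"]*)"', m.group(2)))
    return m.group(1), params


def apply_last_mod(cache, link_value):
    """Drop the cached copy of the link target if it is older than the
    advertised timestamp. Returns True if an entry was invalidated.

    `cache` is a hypothetical store: {url: Last-Modified of cached copy}.
    """
    target, params = parse_link(link_value)
    if params.get("rel") != "last-mod" or "value" not in params:
        return False  # not the proposed relation; ignore it
    advertised = parsedate_to_datetime(params["value"])
    cached = cache.get(target)
    if cached is not None and cached < advertised:
        del cache[target]  # stale copy: force a refetch on next use
        return True
    return False  # nothing cached, or cached copy is already current


if __name__ == "__main__":
    header = ('</common.css>; rel="last-mod"; '
              'value="Tue, 31 May 2011 04:35:27 GMT"')
    cache = {"/common.css": datetime(2011, 5, 1, tzinfo=timezone.utc)}

    print(apply_last_mod(cache, header))  # stale copy is dropped
    # Simulate the refetch: the new copy carries the new timestamp.
    cache["/common.css"] = datetime(2011, 5, 31, 4, 35, 27,
                                    tzinfo=timezone.utc)
    print(apply_last_mod(cache, header))  # same header again: no effect
```

The point of the sketch is the comparison in apply_last_mod: because the decision depends on the cached copy's own timestamp rather than on the mere presence of the header, the server can repeat the header on every response without causing repeated invalidations, which is the property the plain inv-by relation lacks in this scenario.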

--
Mark Nottingham   http://www.mnot.net/

Received on Tuesday, 31 May 2011 04:35:58 UTC