Re: I-D Action: draft-nottingham-linked-cache-inv-00.txt from Mark Nottingham on 2011-06-02 (ietf-http-wg@w3.org from April to June 2011)

From: Mark Nottingham <mnot@mnot.net>
Date: Fri, 3 Jun 2011 09:20:26 +1000
To: Brian Pane <brianp@brianp.net>
Cc: Poul-Henning Kamp <phk@phk.freebsd.dk>, httpbis Group <ietf-http-wg@w3.org>
Message-Id: <624B3043-FA2D-413E-9113-2563D56B6759@mnot.net>

[ taking Bala and Craig off the thread, as this is getting away from the reason I added them ]

On 03/06/2011, at 2:22 AM, Brian Pane wrote:
> 
> Depending on one's cache implementation, though, an expired resource
> may live in the cache for a very long time - e.g., until a client
> requests it or it's evicted to make room for something else.  Most
> proxy implementations that I've seen don't proactively sweep their
> caches to find stale entries, for good performance reasons.  In the
> general case, the association data can have O(n*m) size, where n is
> the number of resources in a cache and m is the number of related
> links specified per resource, and my concern is that that's too high a
> per-resource tax to impose on cache implementors.
> 
> But are there scenarios where maintaining the association data within
> the cache would work better than treating the invalidation headers as
> a one-time operation?  I'm focusing mainly on reverse-proxy use cases,
> so the stored-association model may have some forward-proxy or client
> benefits I'm overlooking.

I was hoping to avoid digging through the implementation to remind myself of how it was done; oh well ;)

Basically, it remembers the association for the freshness lifetime plus a small window. Once the object becomes stale, normal operation of Squid (like any conformant HTTP cache) is to attempt to revalidate, only using the stale response if there's a network error, etc. This is considered desirable behaviour; if the publisher doesn't want a stale response to be used, they have ways to control that (e.g., must-revalidate, stale-if-error).

It is effectively one-shot: we don't just mark the linked objects stale, we HTCP CLR them, which IIRC in Squid removes the object (rather than just marking it stale), and the manager removes the association at the same time. When a new response gets cached, any associations in it get re-added to the manager.
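
As a rough illustration of the bookkeeping described above, here is a minimal Python sketch. It is not the actual Squid helper code: the class name, method names, the "trigger"/"dependent" terminology and the 5-second grace window are all assumptions made for the example. It only shows the shape of the idea: associations are kept for the freshness lifetime plus a small window, and an invalidation consumes them in one shot.

```python
import time
from collections import defaultdict

GRACE_WINDOW = 5.0  # seconds; hypothetical value for the "small window"


class AssociationManager:
    """Hypothetical sketch: which cached URLs to purge when a trigger URL changes."""

    def __init__(self, grace_window=GRACE_WINDOW):
        self.grace_window = grace_window
        # trigger URL -> {dependent URL: time after which the association is ignored}
        self._assoc = defaultdict(dict)

    def add(self, dependent, triggers, freshness_lifetime, now=None):
        """Record associations found in a newly cached response."""
        now = time.time() if now is None else now
        expires = now + freshness_lifetime + self.grace_window
        for trigger in triggers:
            self._assoc[trigger][dependent] = expires

    def invalidate(self, trigger, now=None):
        """Return the dependents to purge (e.g. via HTCP CLR) and forget them.

        One-shot: the associations for this trigger are removed here and only
        come back when a fresh response is cached and add() is called again.
        """
        now = time.time() if now is None else now
        dependents = self._assoc.pop(trigger, {})
        # Associations older than lifetime + window are dropped without a purge.
        return sorted(url for url, expires in dependents.items() if expires > now)
```

With this sketch, calling add("/list", ["/item/1"], 60) and then invalidate("/item/1") within 65 seconds yields ["/list"] once; a second invalidate returns nothing until "/list" is cached again. A fuller version would also drop a purged dependent from any other trigger's entry, which is omitted here for brevity.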

I was very concerned about the memory overhead of this scheme, and took a number of steps to make it more compact, but in the end it wasn't as bad as I feared; I don't have specific numbers on hand, but it hasn't been a big problem in our deployments (which *are* on a large scale, as reverse proxies).

Cheers,

--
Mark Nottingham   http://www.mnot.net/

Received on Thursday, 2 June 2011 23:20:55 UTC