Re: I-D Action: draft-nottingham-linked-cache-inv-00.txt from Brian Pane on 2011-06-02 (ietf-http-wg@w3.org from April to June 2011)

From: Brian Pane <brianp@brianp.net>
Date: Thu, 2 Jun 2011 09:22:08 -0700
To: Mark Nottingham <mnot@mnot.net>, Poul-Henning Kamp <phk@phk.freebsd.dk>, httpbis Group <ietf-http-wg@w3.org>, Balachander Krishnamurthy <bala@research.att.com>, cew@cs.wpi.edu
Message-ID: <BANLkTinz2t=THUrv8fdVUnSOCvTPUS+Eug@mail.gmail.com>

On Tue, May 31, 2011 at 9:47 PM, Mark Nottingham <mnot@mnot.net> wrote:
>
> On 01/06/2011, at 2:46 PM, Brian Pane wrote:
>
>> On Tue, May 31, 2011 at 3:05 PM, Mark Nottingham <mnot@mnot.net> wrote:
>> [...]
>>> I keep the associations in memory (hashed in some cases to preserve space), and that seems to work well.
>>
>> Does that mean that your implementation, upon seeing a response for
>> resource A that contains a Link header that invalidates resource B,
>> will persistently retain the knowledge that changes to A should
>> invalidate B?
>
> Yes, until another response is received with differing information.
>
>> I'd been assuming that the invalidation of B would be a one-time
>> event: the receiving client or intermediary would invalidate B in its
>> cache and forget about the message thereafter.  That's a scalable
>> model (in practice, implementations limit the max total header size
>> they'll allow per message, and that puts an upper bound on the number
>> of invalidations that a single response message can trigger).
>> Retaining the associations persistently is a much harder model to
>> scale.
>
>
> Remember that the association is bounded by the freshness lifetime of the response it was carried in; that keeps things reasonable.

Depending on one's cache implementation, though, an expired resource
may live in the cache for a very long time - e.g., until a client
requests it or it's evicted to make room for something else.  Most
proxy implementations that I've seen don't proactively sweep their
caches to find stale entries, for good performance reasons.  In the
general case, the association data can have O(n*m) size, where n is
the number of resources in a cache and m is the number of related
links specified per resource, and my concern is that that's too high a
per-resource tax to impose on cache implementors.
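
To make the size concern concrete, here is a rough sketch of the two models side by side. It is only an illustration under assumptions: the class names, the `store`/`invalidate` methods, and the `/r/<i>` URLs are hypothetical and are not taken from the draft or from any real cache implementation. Freshness and eviction are left out on purpose, since the point is only how much association state each model retains.

```python
from collections import defaultdict


class OneTimeCache:
    """Hypothetical model: apply link invalidations on arrival, then forget them."""

    def __init__(self):
        self.entries = {}  # url -> body

    def store(self, url, body, invalidates=()):
        self.entries[url] = body
        for target in invalidates:
            # Drop B immediately; no record of the A -> B link is kept.
            self.entries.pop(target, None)

    def retained_associations(self):
        return 0  # nothing outlives the response that carried the links


class StoredAssociationCache:
    """Hypothetical model: remember that changes to A should invalidate B."""

    def __init__(self):
        self.entries = {}               # url -> body
        self.links = defaultdict(set)   # url A -> {urls B that A invalidates}

    def store(self, url, body, invalidates=()):
        self.entries[url] = body
        # A later response for the same URL replaces the earlier associations.
        self.links[url] = set(invalidates)

    def invalidate(self, url):
        # Invalidating A also drops every B that A was associated with.
        self.entries.pop(url, None)
        for target in self.links.pop(url, ()):
            self.entries.pop(target, None)

    def retained_associations(self):
        return sum(len(targets) for targets in self.links.values())


n, m = 1000, 5  # n cached resources, m related links per resource
one_time = OneTimeCache()
stored = StoredAssociationCache()
for i in range(n):
    related = [f"/r/{(i + k) % n}" for k in range(1, m + 1)]
    one_time.store(f"/r/{i}", b"body", invalidates=related)
    stored.store(f"/r/{i}", b"body", invalidates=related)

print(one_time.retained_associations())  # 0
print(stored.retained_associations())    # 5000, i.e. n * m
```

In this sketch the stored model sheds its associations only when `invalidate` is called or a newer response overwrites them, so without a proactive sweep the n*m entries stay resident even after the responses that carried them have gone stale.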

But are there scenarios where maintaining the association data within
the cache would work better than treating the invalidation headers as
a one-time operation?  I'm focusing mainly on reverse-proxy use cases,
so the stored-association model may have some forward-proxy or client
benefits I'm overlooking.

Thanks,
-Brian

Received on Thursday, 2 June 2011 16:22:59 UTC