Re: Call for Adoption: Cache Digests for HTTP/2 from Kazuho Oku on 2016-06-27 (ietf-http-wg@w3.org from April to June 2016)

From: Kazuho Oku <kazuhooku@gmail.com>
Date: Mon, 27 Jun 2016 14:29:05 +0900
To: Alcides Viamontes E <alcidesv@zunzun.se>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <CANatvzxsXAueKCACBpYts4X9Dd_Q-cm71L1g4p2Ze=r4M9Q_YA@mail.gmail.com>
2016-06-24 0:33 GMT+09:00 Alcides Viamontes E <alcidesv@zunzun.se>:
>
> Hello Kazuho,
>
> Thanks! Comments below.
>
> On Wed, Jun 22, 2016 at 11:29 PM, Kazuho Oku <kazuhooku@gmail.com> wrote:
>>
>> Hello,
>>
>> Thank you for your insights. I am eager to see your experiment results
>> once they are ready.
>>
>> 2016-06-22 20:39 GMT+09:00 Alcides Viamontes E <alcidesv@zunzun.se>:
>> > Hello Cory,
>> >
>> > It's good to have your feedback. Below are answers to your comments, but I
>> > do expect to use this conversation to fill my gaps.
>> >
>> > On Wed, Jun 22, 2016 at 12:43 PM, Cory Benfield <cory@lukasa.co.uk>
>> > wrote:
>> >>
>> >>
>> >> > On 22 Jun 2016, at 09:28, Alcides Viamontes E <alcidesv@zunzun.se>
>> >> > wrote:
>> >> >
>> >> > This is bad for several reasons. AFAIK, sites also have no way to ask
>> >> > the browser to prematurely evict an expired representation that the
>> >> > browser would otherwise consider fresh. These two things together could
>> >> > allow a cache digest to grow indefinitely. Wouldn't that have a
>> >> > degrading effect on performance?
>> >>
>> >> This is presumably true of caching to begin with, correct? If the browser
>> >> doesn’t consider the cached representation stale it is welcome to not
>> >> emit a request for it at all, and simply to serve it from its cache. This
>> >> means that the cache digest can only grow as large as the client cache
>> >> allows it to grow, which I should certainly hope is not indefinitely
>> >> large!
>> >
>> >
>> > Yes, this is related to caching in general. And it is the reason people
>> > have to add query strings for cache busting. This problem is a separate
>> > issue, but it interacts with cache digests in that old versions of assets
>> > are kept in the cache, and therefore in the cache digest, and the origin
>> > has no way of removing them. The origin can only create a new URL (say,
>> > via a new query string) that gets added to the cache and the cache digest.
>>
>> I share your concern (though I might disagree on how critical the issue
>> is).
>>
>> Actually, the draft provides a way to update fresh responses _if_ the
>> client and server agree.
>>
>> In the current version of the draft, we have intentionally defined
>> VALIDATOR and STALE as separate flags, to allow a client to send a
>> digest of fresh resources with their validators taken into account. A
>> server can use such a digest (i.e. a digest with the VALIDATOR flag set
>> and the STALE flag not set) for pushing an updated version of the resource.
>>
>> So if the client is to accept a push of a fresh response to replace an
>> already-existing fresh resource (the HTTP/2 spec does not specify whether
>> it should), the mechanism can be used, and in such a case, cache
>> busting would no longer be necessary.
>
>
> That's really good news! What can be done to make this mechanism less a
> matter of browser goodwill?

Considering the fact that H2 push is optional (and Cache Digests for
HTTP/2 is an option to H2 push), I am afraid it might not become the
standard way to update a fresh response.

If we are to solve cache busting, it might be better to consider
adding a validator to the link-rel-preload header. For example, we
could add an etag attribute that presents the etag value that is
_expected_ to be used together with the response that contained the
link header. Upon observing such a header, clients can check their
cache, and in case the validators do not match, remove the response
from cache and issue a new request.

In other words,

  link: </style.css>; rel=preload; etag=deadbeef

would instruct the web browser to fetch /style.css if it is not
fresh-cached, or if the etag value of the cached response does not
match the value supplied by the header.

The pros of this approach would be that it could be used over both
HTTP/1 and HTTP/2, and that it does not require the browser to
implement H2 push. The cons would be that it would only be possible to
invalidate responses as part of the preload process.
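
To make the client-side check concrete, here is a rough sketch in Python
of what a client might do on seeing such a header. This is only an
illustration: the etag link attribute is the hypothetical one proposed
above, and CachedResponse, parse_preload_link and should_fetch are names
invented for this sketch, not part of any browser or draft. The parser
is deliberately naive (it does not handle quoted semicolons or multiple
rel values).

```python
from dataclasses import dataclass


@dataclass
class CachedResponse:
    etag: str    # validator stored with the cached response
    fresh: bool  # True if the client still considers the response fresh


def parse_preload_link(value):
    """Split a single link-value into (target URL, attribute dict)."""
    target, *params = [part.strip() for part in value.split(";")]
    if not (target.startswith("<") and target.endswith(">")):
        raise ValueError("malformed link target")
    attrs = {}
    for param in params:
        name, _, val = param.partition("=")
        attrs[name.strip().lower()] = val.strip().strip('"')
    return target[1:-1], attrs


def should_fetch(cache, link_value):
    """Decide whether the preload link should trigger a new request.

    Fetch if the resource is not fresh-cached, or if the hypothetical
    etag attribute does not match the cached validator; in the latter
    case the cached response is evicted first.
    """
    url, attrs = parse_preload_link(link_value)
    if attrs.get("rel") != "preload":
        return False
    entry = cache.get(url)
    if entry is None or not entry.fresh:
        return True
    expected = attrs.get("etag")
    if expected is not None and entry.etag != expected:
        del cache[url]  # remove the outdated (but fresh) response
        return True
    return False
```

With a cache of {"/style.css": CachedResponse("deadbeef", True)}, the
header in the example above would not trigger a fetch, while the same
header carrying a different etag value would evict the entry and
request /style.css again.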

>>
>>
>> >>
>> >>
>> >> However, it may be sensible to consider providing a SETTINGS field that
>> >> allows servers to flag a maximum size on a cache digest that they are
>> >> willing to accept.
>> >
>> >
>> > But this leaves the server without any control over which things are
>> > made part of the cache digest. That's why we think scoping and an explicit
>> > eviction mechanism are better long-term solutions.
>>
>> We could extend the current draft if scoping turns out to be an issue, and
>> it would be possible to do so while keeping backward compatibility. For
>> example, we could add a new flag that indicates that the scope of the
>> digest is limited to a certain MIME type and that the MIME type is
>> stored in the frame in addition to the digest-value.
>>
>> That said, does your concern about the size of the digest dissolve
>> without adding a method for a server to notify the client of the maximum
>> permitted size of the digest (e.g. a SETTINGS field as suggested by
>> Cory)?
>
>
> After some reflection, I must agree that limiting the size of the digest is
> probably a good compromise until we know if scopes are really needed. This
> setting could even be used to make the digests opt-in, which is a plus.

Thank you for the clarification.

> ./Alcides.
>



-- 
Kazuho Oku
Received on Monday, 27 June 2016 05:29:36 UTC