Re: Submitted new I-D: Cache Digests for HTTP/2

Hello. The general thrust of this I-D seems like a useful optimisation 
of HTTP/2 server push. It is wasteful to push a representation to a 
client when the client already has a fresh copy cached. But the reverse 
is equally true, I think...

I therefore support the idea of including version information in the 
client-generated cache digest. This would allow a server to push a more 
up-to-date version of a representation in the case where that 
representation has been updated before the originally stated expiry. 
This allows a server to supply the freshest possible version, overriding 
the client's (in this case mistaken) belief that its cached copy is 
still fresh.

You suggest below that a client would ignore such a push because it 
still believes its copy to be fresh, thereby defeating the server's 
attempt to push a fresher version. Naively, my first thought was that 
versioning metadata (last-modified and/or etag) in the pushed HEADERS 
frame should be enough to convince a sensible client otherwise. But then 
I realised that the client may already have rejected the PUSH_PROMISE 
before it even gets to see the pushed HEADERS. One slightly wacky idea 
would be for the server to include the up-to-date modification datestamp 
and/or entity tag values in its PUSH_PROMISE frame (i.e. the simulated 
client request) as (respectively) if-modified-since and if-none-match 
request headers. I'm not sure even this would be enough to invalidate 
the client's cache entry though. Hmm... Any better ideas?
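
To make the idea concrete, here is a rough Python sketch of the header list such a PUSH_PROMISE might carry. The function name, its parameters and the example values are purely illustrative and come from neither the draft nor any HTTP/2 library. Nothing in HTTP/2 says a client would treat these conditional headers as a reason to invalidate its cache entry, which is exactly the open question.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import List, Optional, Tuple


def promised_request_headers(
    authority: str,
    path: str,
    etag: Optional[str] = None,
    last_modified: Optional[datetime] = None,
) -> List[Tuple[str, str]]:
    """Build the simulated request headers for a PUSH_PROMISE (hypothetical helper).

    The server's current validators are attached as conditional request
    headers so the client can see them before deciding to reject the push.
    """
    headers = [
        (":method", "GET"),
        (":scheme", "https"),
        (":authority", authority),
        (":path", path),
    ]
    if etag is not None:
        headers.append(("if-none-match", etag))
    if last_modified is not None:
        # last_modified must be a timezone-aware datetime in UTC.
        headers.append(
            ("if-modified-since", format_datetime(last_modified, usegmt=True))
        )
    return headers


example = promised_request_headers(
    "example.com",
    "/style.css",
    etag='"v2"',
    last_modified=datetime(2016, 1, 12, 1, 4, 0, tzinfo=timezone.utc),
)
```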

The next question concerns syntax. Elsewhere in the thread I think it 
has been suggested that there could be two types of digest transmitted 
from client to server: one ("fresh") generated from URLs the client 
believes to be fresh, the other ("if-modified-since") based on URLs that 
it believes to be stale. Reading between the lines, am I right in 
thinking that the latter is intended to be a sort of conditional client 
request, with a "304 Not Modified" response being pushed for those 
representations that turn out not to have changed?

This all seems quite complicated, and I find the combination of 
parameter name and semantics a bit confusing. For simplicity's sake I 
might be inclined to just include the versioning information as standard 
in all cache digests, in spite of the resulting overhead. Then the 
server can decide what needs to be pushed in response after comparing 
the versioning metadata in the received digest with the (potentially 
more up-to-date) information available to the server. And, as an added 
bonus, you then don't need to worry about defining what a pushed 304 
response means :-)
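
As a minimal sketch of the server-side comparison I have in mind, assume each digest entry is a hash over the URL plus its entity tag. A plain Python set stands in for the real Golomb-coded set here, so this ignores false positives. The function names, the newline separator and the use of untruncated SHA-256 are my own assumptions, not anything the draft specifies.

```python
import hashlib
from typing import Dict, List, Set


def digest_key(url: str, validator: str) -> str:
    """Hash a URL together with its version information (assumed key format)."""
    return hashlib.sha256((url + "\n" + validator).encode("utf-8")).hexdigest()


def urls_to_push(client_digest: Set[str], server_versions: Dict[str, str]) -> List[str]:
    """Return the URLs whose current server version is absent from the digest.

    A URL is selected either because the client has no copy at all or
    because the copy it holds carries a different entity tag.
    """
    return [
        url
        for url, etag in server_versions.items()
        if digest_key(url, etag) not in client_digest
    ]


# The client holds version "1" of a.css and version "7" of b.js.
client_digest = {
    digest_key("https://example.com/a.css", '"1"'),
    digest_key("https://example.com/b.js", '"7"'),
}
# The server has since updated a.css and also knows about c.png.
server_versions = {
    "https://example.com/a.css": '"2"',
    "https://example.com/b.js": '"7"',
    "https://example.com/c.png": '"3"',
}
to_push = urls_to_push(client_digest, server_versions)
```

With this shape the server never needs to push a 304: anything whose version matches is simply left alone.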


On Tue, 12 Jan 2016 10:04:00 +0900 Kazuho Oku <kazuhooku@gmail.com> wrote:
> 2016-01-11 2:11 GMT+09:00 Alcides Viamontes E <alcidesv@zunzun.se>:
> > ...
> > Here are the issues that I see:
> >
> > 1.- In its current wording, no information about which version of a
> > representation the browser already has is present in the cache digest.
> > That information can be included in the URL itself (cache busting),
> > but then it becomes a concern for web-developers, adds complexity to
> > their work, and bypasses the mechanisms that HTTP has in place for
> > maintaining cache state.  It also increases space pressure in the
> > browser's cache as the server is left with no means to expire old
> > cached contents in the browser.
>
> That is a very good point.
>
> Let me first discuss the restrictions of the cache model used by HTTP,
> and then go on to discuss what we should do if we are to fix the point
> you raised.
>
> First about the restriction; the resources in the cache can be divided
> into two groups: fresh and non-fresh.  A server should never push a
> resource that is considered fresh in the client's cache.  Clients
> will not notice the push, and HTTP/2 allows the client to discard
> such a push.  Therefore, a CACHE_DIGEST frame
> must include a filter that identifies the resources that are marked as
> being fresh.  That is what the current draft specifies.

I think this is a reference to the sentence at the start of Section 2.1 
stating: "The set of URLs that is used to compute Digest-Value MUST only 
include URLs that share origins [RFC6454] with the stream that 
CACHE_DIGEST is sent on, and they MUST be fresh [RFC7234]." In other 
words, according to draft 00, the client-generated digest only includes 
the URLs of cached representations it considers to be fresh.
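
Read that way, the client-side selection could be sketched roughly as follows. This is only my interpretation of the two MUSTs: the CacheEntry type and the helper names are hypothetical, the origin comparison is a simplified scheme/host/port triple rather than the full RFC 6454 algorithm, and freshness is reduced to the RFC 7234 test of current age against freshness lifetime.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> Tuple[str, str, Optional[int]]:
    """Simplified origin triple: (scheme, host, port), default ports filled in."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, (parts.hostname or "").lower(), port)


@dataclass
class CacheEntry:
    """Hypothetical view of one cached representation."""
    url: str
    age: int                 # current age in seconds
    freshness_lifetime: int  # freshness lifetime in seconds


def digest_urls(entries: Iterable[CacheEntry], stream_url: str) -> List[str]:
    """Select the URLs eligible for the digest: same origin and still fresh."""
    target = origin(stream_url)
    return [
        e.url
        for e in entries
        if origin(e.url) == target and e.age < e.freshness_lifetime
    ]


cache = [
    CacheEntry("https://example.com/a.css", age=10, freshness_lifetime=60),
    CacheEntry("https://example.com/b.js", age=120, freshness_lifetime=60),
    CacheEntry("https://cdn.example.net/c.png", age=10, freshness_lifetime=60),
]
eligible = digest_urls(cache, "https://example.com/")
```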

> Next about the point of including version information (e.g.
> Last-Modified, ETag) in the cache digest.  I believe we can add a
> second Golomb-coded set to the frame that uses hash(URI + version
> information) as the key.  A server can refer to the information to
> determine whether it should push a 304 response or a 200 response.
>
> The downside is that the CACHE_DIGEST frame may become larger (if the
> server sends many responses that would become non-fresh), so it might
> be sensible to allow the client to decide if it should send the second
> Golomb-coded set.
>
> In addition, we should agree on how to push a 304 response.  My
> understanding is that the HTTP/2 spec is vague on this, and that there
> has not yet been an agreement between the client developers on how it
> should be done.
>
> Once that is solved, I think we should update the I-D to cover the
> version information as well.

Kind regards,

-- 
Richard.

Received on Friday, 22 January 2016 17:47:38 UTC