Re: Pushing 304's from Greg Wilkins on 2014-11-30 (ietf-http-wg@w3.org from October to December 2014)

From: Greg Wilkins <gregw@intalio.com>
Date: Mon, 1 Dec 2014 08:40:22 +1100
To: HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <CAH_y2NGjcvLgDOHp2yuDoH_5UuAm-1BKgbXSRQVaALGO-7Bh2A@mail.gmail.com>

Martin,

Ack that modifying existing headers should be avoided. I think that
Julian's suggestion of pushing a HEAD response captures the semantics that
I was striving for, so such a modification is not needed... so long as
clients are prepared to accept such pushed responses.

With regard to heuristics, we are currently using the presence of a
conditional header to indicate that the client's cache is probably hot and
thus we do not push any associated content.
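
As a rough illustration of that heuristic, here is a minimal sketch in
Python (not Jetty's actual Java code). The function name should_push and the
plain-dict representation of request headers are assumptions made purely for
illustration:

```python
# Hypothetical sketch of the "hot cache" heuristic; not Jetty's real API.
CONDITIONAL_HEADERS = {"if-none-match", "if-modified-since"}


def should_push(request_headers):
    """Return True when associated resources should be pushed.

    A conditional header on the primary request is taken as a hint that
    the client's cache is probably hot, so nothing is pushed.
    """
    names = {name.lower() for name in request_headers}
    return not (names & CONDITIONAL_HEADERS)


fresh_visitor = {"User-Agent": "example"}
returning_visitor = {"User-Agent": "example", "If-None-Match": '"abc123"'}
print(should_push(fresh_visitor))      # True
print(should_push(returning_visitor))  # False
```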

The consequence of this is that a fresh visitor to the site gets pushed
resources and a minimal render time. But a returning visitor to the site
will not get pushed resources and often has to expend an additional round
trip sending a batch of conditional requests for the associated
resources. So it is a bit poor that a regular visitor can get worse
render times than a fresh visitor.

But it is a good suggestion to build a cache of known modification times and
etags to use when generating conditional headers in the push promise.
I'll give that a go in the next few days. Any browsers out there likely
to grok such pushed 304's?
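
To make the idea concrete, here is a minimal sketch of such a cache, again
in Python rather than Jetty's Java. The class ValidatorCache and the helper
build_push_promise are hypothetical names, and headers are modelled as plain
dicts; none of this is an existing Jetty API:

```python
# Hypothetical sketch: remember validators from responses the server has seen,
# then reuse them as conditional headers in the push promise pseudo request.
class ValidatorCache:
    def __init__(self):
        self._validators = {}  # path -> {"etag": ..., "last-modified": ...}

    def record(self, path, response_headers):
        """Store the ETag and/or Last-Modified seen in a response for path."""
        lowered = {k.lower(): v for k, v in response_headers.items()}
        seen = {k: lowered[k] for k in ("etag", "last-modified") if k in lowered}
        if seen:
            self._validators[path] = seen

    def conditional_headers(self, path):
        """Return conditional headers for a pseudo request, or {} if unknown."""
        seen = self._validators.get(path, {})
        headers = {}
        if "etag" in seen:
            headers["if-none-match"] = seen["etag"]
        if "last-modified" in seen:
            headers["if-modified-since"] = seen["last-modified"]
        return headers


def build_push_promise(path, authority, cache):
    """Build the pseudo request headers for a push promise of path."""
    promise = {":method": "GET", ":path": path, ":authority": authority}
    promise.update(cache.conditional_headers(path))
    return promise


cache = ValidatorCache()
cache.record("/logo.png", {"Last-Modified": "Thu, 27 Nov 2014 20:59:51 GMT"})
print(build_push_promise("/logo.png", "host", cache))
```

The validators recorded this way are the newest ones the server has seen,
not necessarily the ones a given client holds, so the client would still
need to compare them against its own cached copy before accepting a pushed
304.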

cheers



On 1 December 2014 at 03:15, Martin Thomson <martin.thomson@gmail.com>
wrote:

> This is a good discussion to have.
>
> The model that we have in place builds on cache updates, which means
> that the text you recommend isn't necessary.  We instead rely on a
> client identifying whether a response in a cache can be used (as in
> https://tools.ietf.org/html/rfc7234#section-4).  The client
> implementations of push that I am aware of create what is effectively
> a separate cache for pushes, but the same basic logic applies.
>
> The central concern of your piece is something that has concerned me
> too: how does an intermediary - in this case, the server framework,
> which is very much in an intermediary role - generate a push once it
> has identified a desire to do so.
>
> In your example, it might be more work, but Jetty could store etags or
> dates for resources that it has seen, and it could construct the
> If-None-Match or If-Modified-Since header fields itself for those
> requests.  It would probably need some sort of heuristic to determine
> whether it wanted to go for the 304 based on the client's initial
> request.  Also, Jetty can make a full request and identify if the
> resource is identical to what it *thinks* the client might have and
> turn that response into a 304 on its own.  That's trickier for
> entities of indeterminate length though...
>
> That is how I always imagined doing this, but I can see there being
> value in surfacing some new way to construct the request. Making this
> sort of heuristic unnecessary would be good, but I'd caution against
> modifying an existing header field.
>
> --Martin
>
> p.s., yes, not as many sites use Vary: Cookie as should :)
>
> On 27 November 2014 at 14:13, Greg Wilkins <gregw@intalio.com> wrote:
> >
> > I am wondering if we may need to change the semantics of 304 responses a
> > little to better support push.
> >
> >
> > The situation that I'm thinking of is when a browser has a cache full of
> > associated resources, but they all need to be validated. Without push,
> > what we typically see is:
> >
> > request for page.html with I-M-S or I-N-M header
> > 304 response sent for page.html
> > requests for logo.png, page.css, page.js each with I-M-S or I-N-M headers
> > 304 responses sent
> >
> > What we would like to see with push is:
> >
> > request for page.html with I-M-S or I-N-M header
> > 304 response sent
> > 304 responses pushed for logo.png, page.css, page.js
> >
> > However, the problem is what etag or date should the pushed 304's be
> > relative to? Typically the pseudo request used for a push promise will
> > be based on the original request for page.html, so it will get the same
> > cookies, user-agent and other headers. But the etag/date from the
> > If-None-Match or If-Modified-Since headers for page.html is not
> > applicable to the associated resources.
> >
> > So in some cases, the server can look up the associated resources to
> > determine their last modified time or etag and put that into the
> > pseudo request of the push promise. The client will then have to know
> > that when it receives a pushed 304 response it needs to check the
> > etag/date in the push promise request to see if it matches the version
> > it has in its cache (and if not, re-request). I think that is something
> > that perhaps we should call out as a requirement for clients? Maybe 6.6
> > could be updated to have text like:
> >
> > A receiver MUST check that the headers within a received PUSH_PROMISE
> > match the headers that would have been sent in a normal request for that
> > resource. Specifically, cookies and conditional headers should be
> > verified and the stream rejected if they do not match the receiver's
> > expectations.
> >
> > I think this is a MUST rather than a SHOULD because I can see situations
> > of requests coming in for expired sessions. For example, if the
> > page.html request contains a session cookie then pushed resources will be
> > generated using the same cookie value, possibly before page.html is
> > generated. If subsequent processing of page.html invalidates the
> > existing session and creates a new one with a new cookie value, then the
> > content sent for page.html and the associated resources may be for
> > different sessions! Of course servers should probably check the session
> > before doing the push, but as that can involve the application, that may
> > not always be possible. Thus clients MUST check that the pushed
> > resources are indeed the ones they would have asked for, including
> > cookies and conditional headers.
> >
> >
> > Another issue is that in many cases it is not possible for the server to
> > actually determine the etag/date to use in a pseudo request. For example,
> > our server http://webtide.com is running http2 push from its Jetty
> > server, but the pages themselves are generated from a co-located
> > WordPress instance (PHP). When Jetty sends a push promise with a pseudo
> > request, it simultaneously proxies that same request to the PHP instance
> > to get the response. Thus the http2 server cannot easily retrieve the
> > etag or last modified date. I think having http2 in a proxy like this
> > will be a frequent deployment mode.
> >
> > What I think is required is a semantic that asks for a 304 to be sent
> > that includes the etag and/or last-modified date. For example, perhaps
> > we could define '*' as a value for If-Modified-Since and If-None-Match
> > to mean always send a 304, but include the reference tag/date. So in the
> > above case, the pseudo requests generated for logo.png, page.css, page.js
> > would look like
> >
> >   :method: GET
> >   :path: /logo.png
> >   :authority: host
> >   if-modified-since: *
> >
> > and the responses would look like:
> >
> >   :status: 304
> >   last-modified: Thu, 27 Nov 2014 20:59:51 GMT
> >
> > If there is concern that a date of * might cause errors on some origin
> > servers, then perhaps a new header when-modified could be used?
> >
> > thoughts?
> >
> >
> >
> >
> > --
> > Greg Wilkins <gregw@intalio.com>  @  Webtide - an Intalio subsidiary
> > http://eclipse.org/jetty HTTP, SPDY, Websocket server and client that scales
> > http://www.webtide.com  advice and support for jetty and cometd.
>



-- 
Greg Wilkins <gregw@intalio.com>  @  Webtide - *an Intalio subsidiary*
http://eclipse.org/jetty HTTP, SPDY, Websocket server and client that scales
http://www.webtide.com  advice and support for jetty and cometd.
Received on Sunday, 30 November 2014 21:40:51 UTC