Re: draft-asilvas-http-push-assets-00 comments from Aaron L. Silvas on 2016-07-13 (ietf-http-wg@w3.org from July to September 2016)

From: Aaron L. Silvas <asilvas@godaddy.com>
Date: Wed, 13 Jul 2016 20:01:59 +0000
To: Alcides Viamontes E <alcidesv@zunzun.se>
CC: Martin Thomson <martin.thomson@gmail.com>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <CY1PR0201MB159412923C9DB7F815290AC7B2310@CY1PR0201MB1594.namprd02.prod.outlook.>
This draft aims to make HTTP/2 Server Push ready for prime time.


Sure, I'll take a stab at a basic example. Excuse the formatting...


* Request - 1st visit

GET /
  Push-Assets: *
* PUSH_PROMISE - Promises sent by server
GET /shared.js
GET /shared.css
* Server Push
GET /shared.js
  Status: 200 OK
  Push-Asset-Key: /shared.js
  Push-Asset-Match: *
  ETag: abc123
GET /shared.css
  Status: 200 OK
  Push-Asset-Key: /shared.css
  Push-Asset-Match: *
  ETag: 123abc
  Cache-Control: public, max-age=123456
* Response
GET /
  Status: 200 OK

(navigate to another page)


* Request - page transition

GET /page2
  Push-Assets: md5(/shared.js)=etag(abc123);md5(/shared.css)=no-push
* Server Push
GET /shared.js
  Status: 304 Not Modified
  Push-Asset-Key: /shared.js
  Push-Asset-Match: *
  ETag: abc123
* Response
GET /page2
  Status: 200 OK
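
As a rough sketch of the server-side decision behind the two exchanges above: the function names, the hex MD5 encoding, and the exact header syntax (";"-separated "hash=token" pairs) are assumptions made for illustration and are not taken from the draft. The logic follows this example only: "*" pushes everything, "no-push" skips the asset, a matching ETag yields a 304 push, and anything else is pushed in full.

```python
import hashlib


def key_hash(key):
    """MD5 of an asset key; hex encoding is an assumption of this sketch."""
    return hashlib.md5(key.encode("utf-8")).hexdigest()


def parse_push_assets(value):
    """Parse 'hash=etag(x);hash=no-push' into {hash: token}.

    Returns None for '*', i.e. the client reports no cache state.
    """
    value = value.strip()
    if value == "*":
        return None
    state = {}
    for item in value.split(";"):
        item = item.strip()
        if item:
            digest, _, token = item.partition("=")
            state[digest] = token
    return state


def plan_pushes(push_assets, server_assets):
    """Decide what to push.

    server_assets maps asset key -> current ETag (hypothetical structure).
    Returns a list of (key, status) pairs in server_assets order.
    """
    state = parse_push_assets(push_assets)
    plan = []
    for key, etag in server_assets.items():
        token = None if state is None else state.get(key_hash(key))
        if token == "no-push":
            continue                   # client asked not to receive this asset
        if token == "etag(%s)" % etag:
            plan.append((key, 304))    # unchanged, as in the example above
        else:
            plan.append((key, 200))    # missing or out of date
    return plan


assets = {"/shared.js": "abc123", "/shared.css": "123abc"}
print(plan_pushes("*", assets))
# [('/shared.js', 200), ('/shared.css', 200)]

header = "%s=etag(abc123);%s=no-push" % (key_hash("/shared.js"), key_hash("/shared.css"))
print(plan_pushes(header, assets))
# [('/shared.js', 304)]
```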

As you can see, the benefits of providing state to the server can be quite substantial. In the case of "blind" server push, where a server doesn't care about client state and just sends everything, every time, you'll get a lot of unpredictable waste, as the client will end up sending many resets (RST_STREAM) to cancel in-flight server pushes. This forces those who want to use Server Push to manage state by other means, namely cookies, which is hacky at best.


Regarding (potential) tradeoffs with the other major alternative being proposed (Cache Digest), I'll copy/paste a response I gave to this question in another email:

-------------

Until we can test the two approaches side-by-side in real-world scenarios with tons of samples, it's hard (for me) to predict which is "better". But here are some tradeoffs:

* Cache Digest uses considerably fewer bytes per cached resource - significance of delta unknown
* Push-Assets is presumably simpler for client & server integrators, as it leverages HTTP Headers - potentially broader adoption
* Push-Assets can be URI "bucketed" to prevent wasted bytes over the wire for large and complex websites - potentially big benefits for multi-app-per-domain websites, CDNs, etc.
* Push-Assets can be tuned by the app's developers to enable server push only for the assets where it makes the most sense (for example, an app may decide that js & css should be pushed but images should not) - Cache Digest sends all state regardless of whether it is "push-enabled"
* Push-Assets can work for non-HTML resources - potentially optimizing all HTTP traffic, such as APIs, or more real-world scenarios like CSS with imports

Ultimately I believe both would be *very* positive solutions, as most visitors visit websites with few previously cached resources, and may greatly benefit from Server Push. A recent study was posted in this group which illustrates the point better than I can: https://if-report.shimmercat.com/dirhtml/

-------------




-aaron

________________________________
From: Alcides Viamontes E <alcidesv@zunzun.se>
Sent: Wednesday, July 13, 2016 10:16:24 AM
To: Aaron L. Silvas
Cc: Martin Thomson; HTTP Working Group
Subject: Re: draft-asilvas-http-push-assets-00 comments

Hello Aaron,

I'm also very interested in understanding your proposal better. I have read your draft, followed the questions asked on this list, and now read your clarifications. However, neither the examples in the draft nor your latest email have helped me. Could you please repeat your examples with full request/response headers and with annotations about which requests are part of a PUSH_PROMISE?

You mention that "browser and server adhere to a strict dependency state contract"... Can you describe this contract? Compared with the situation today, what are the major differences?

Thanks in advance,

./Alcides.



On Wed, Jul 13, 2016 at 6:24 PM, Aaron L. Silvas <asilvas@godaddy.com> wrote:

Thanks for the feedback, Martin. I agree the document needs work to better clarify things.


I'll attempt to address your comments here, in hopes of sparking further conversation on the topic.


"Push-Assets" is the only request header; it is required only if the client wishes to enable the full HTTP/2 Push-Assets flow as outlined in the draft. If the server does not support or understand the header, it is benign. The header allows the client to inform the server of its cache state, for push-enabled assets only (unlike the Cache Digest HTTP/2 proposal, which sends everything). It includes the exact state of each of these resources, as if they were individually requested, and thus supports the existing ETag and Last-Modified headers. Not only will the server know what resources the client does and does not have, but it will also know which resources are simply out of date and must still be pushed. The server won't even need to send a 304 (Server Push) response for unmodified resources, as the server knows the state of the client's push-enabled assets, and the client can assume "no change" if Server Push is performed on the given resources. This effectively means that the server will only ever send what is missing or changed, no more, no less.


Example (requests only to keep length of email to a minimum):


  GET /page1
  Push-Assets: *


  GET /page2
  Push-Assets: md5(shared-resource1.js)=etag(123456)
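
For illustration, here is a minimal sketch of how a client might build the Push-Assets value for the two requests above. The cache structure, the function name, the hex MD5 encoding, the ";" separator, and the rule that a still-fresh entry is reported as "no-push" are all assumptions of this sketch rather than anything the draft specifies.

```python
import hashlib


def build_push_assets(cache):
    """Build a Push-Assets header value from client cache state.

    cache is a list of dicts with 'key', optional 'etag', and optional
    'fresh' (True if the cached copy needs no revalidation). This
    structure is hypothetical.
    """
    parts = []
    for entry in cache:
        digest = hashlib.md5(entry["key"].encode("utf-8")).hexdigest()
        if entry.get("fresh"):
            parts.append(digest + "=no-push")
        elif entry.get("etag"):
            parts.append("%s=etag(%s)" % (digest, entry["etag"]))
    # With nothing to report, fall back to '*' as on a first visit.
    return ";".join(parts) or "*"


# First request: nothing cached yet.
print(build_push_assets([]))

# Second request: one push-enabled asset cached with a validator.
print(build_push_assets([{"key": "shared-resource1.js", "etag": "123456"}]))
```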


"Push-Asset-Key" is an optional response header. It allows the server to "name" a resource, so that the resource can be renamed at a later time without having to be refetched unnecessarily. By default, the "key" of every resource is the URI Path, minus any querystring parameters.


"Push-Asset-Key" is also a required PUSH_PROMISE header, which is likely part of the confusion. Since a PUSH_PROMISE is essentially the server issuing a request on the client's behalf, this header field informs the client that the resource should be tracked as a "Push-Asset" (aka push-enabled). The key itself is what uniquely identifies the resource, and will typically be the URI Path of the resource, minus querystring parameters, but in MD5 form. The client will only ever provide cache state for resources that have responded with this header field, as they are "push-enabled". This gives the server control of what state it should or should not track for the purpose of Server Push resources.
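
A minimal sketch of the key derivation described above, assuming the MD5 is taken over the UTF-8 bytes of the key and rendered as hex (the encoding is not stated here). The function names are hypothetical.

```python
import hashlib
from urllib.parse import urlsplit


def default_asset_key(uri):
    """Default key: the URI path with any querystring (and fragment) dropped."""
    return urlsplit(uri).path


def asset_key_digest(uri, explicit_key=None):
    """MD5 form of the key.

    explicit_key stands in for a server-supplied Push-Asset-Key value;
    when it is absent, the default path-based key is used.
    """
    key = explicit_key if explicit_key is not None else default_asset_key(uri)
    return hashlib.md5(key.encode("utf-8")).hexdigest()


# Querystring changes do not change the default key.
print(asset_key_digest("/shared.js?v=1") == asset_key_digest("/shared.js?v=2"))

# A named resource keeps its identity across a rename.
print(asset_key_digest("/shared.v1.js", "shared-js") == asset_key_digest("/shared.v2.js", "shared-js"))
```

Both comparisons hold, which is the property that lets a server rename or cache-bust a resource without the client treating it as a new asset.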


"Push-Asset-Match" is an optional response header. It allows the server to inform the client that a given resource is only used within specific "buckets" of matching URIs. This is especially useful for large or complex domains, such as CDNs, or other multi-app-per-domain scenarios.
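
To illustrate the bucketing idea, here is a small sketch in which the client reports state only for assets whose bucket matches the request path. The pattern syntax is not described in this thread; shell-style globs (via Python's fnmatch) are purely an assumption, chosen because "*" then naturally means "all URIs". The function name and the tracked-assets structure are hypothetical.

```python
from fnmatch import fnmatchcase


def assets_for_request(request_path, tracked):
    """Return the asset keys whose match pattern covers request_path.

    tracked maps asset key -> Push-Asset-Match pattern (glob syntax assumed).
    """
    return [key for key, pattern in tracked.items()
            if fnmatchcase(request_path, pattern)]


tracked = {
    "/shared.js": "*",             # used everywhere on the domain
    "/shop/cart.js": "/shop/*",    # only relevant to the shop app
}

print(assets_for_request("/blog/post", tracked))
# ['/shared.js']
print(assets_for_request("/shop/cart", tracked))
# ['/shared.js', '/shop/cart.js']
```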



I'll continue to collect feedback, and especially suggestions, and update the next draft accordingly. Thanks again for the interest.




-aaron

________________________________
From: Martin Thomson <martin.thomson@gmail.com>
Sent: Tuesday, July 12, 2016 5:52:49 PM
To: HTTP Working Group; Aaron L. Silvas
Subject: draft-asilvas-http-push-assets-00 comments

First, I think that there is an interesting idea hidden in here.  It
could be that it's complementary to the more generic digests idea.

However, I found it impossible to determine how this document is
claiming to achieve its stated goals.  None of the examples include
header fields, which would have gone a long way to explaining this.
The new header fields don't really say what each is used for.  That
leaves me guessing about how this fits together.

Here's my best guess, though I have to confess that I can't connect
this to what Section 4 says:

On request N.  A server provides a new header field with responses
that create a secondary identifier for resources.  I'm really guessing
here, but I assume that unlike etag, this header field includes a
value that is the same for a group of resources.

On request >N. Clients include a new header field with requests that
controls what is pushed.  If it includes '*', then everything is
pushed.  If it includes 'no-push', then nothing is pushed.  If it
includes a list of these new push-asset-keys, then anything matching
those keys is not pushed.

Based on this, I'm fairly certain that I don't understand the
proposal, because this design doesn't require both Push-Asset-Key and
Push-Asset-Match header fields.  I'm clearly missing something.

I did start to look at the code, but without a better overview of what
it aims to achieve, I'm afraid that I'm not going to get much from it.



--
Alcides Viamontes
www.shimmercat.com
Received on Wednesday, 13 July 2016 20:02:33 UTC