Re: draft-asilvas-http-push-assets-00 comments from Martin Thomson on 2016-07-14 (ietf-http-wg@w3.org from July to September 2016)

From: Martin Thomson <martin.thomson@gmail.com>
Date: Thu, 14 Jul 2016 10:01:47 +1000
To: "Aaron L. Silvas" <asilvas@godaddy.com>
Cc: Alcides Viamontes E <alcidesv@zunzun.se>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <CABkgnnVXc04ih+StFhKEv7PYe5ygjRVDonCbSOgws2cnZYtNaw@mail.gmail.com>

Thanks for putting this (and the previous email) together.  That helps
clarify a great deal.

Since I don't have a lot of time before I travel, I'll be brief.

I'm very concerned about request size and how this scales.

There could be something good in this, but it might be a subset of
what you have defined.  If for no other reason than the one that you
identified: this might be easier to understand and implement.  I don't
see it as an advantage over cache digests, which are designed to be
hands-off from an operational perspective.  My concern is that this
could require some hands-on work from a server operator to get the most
out of it.  In particular, the bucketing you refer to would seem to
require tweaking unless you had a whole lot of very smart heuristics
in your server code.
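
To put rough numbers on the request size concern, here is a minimal
sketch. It assumes a wire format that the draft discussion does not
confirm: each entry is the hex MD5 of the asset key, "=", and the raw
ETag, with entries joined by ";". The function names and the asset
paths are hypothetical, and the estimate ignores HPACK compression.

```python
import hashlib


def push_assets_header(cache_state):
    """Build a hypothetical Push-Assets value from a {key: etag} mapping.

    Assumed format (not confirmed by the draft): <md5-hex>=<etag>;...
    """
    entries = []
    for key, etag in cache_state.items():
        # An MD5 hex digest is always 32 characters, whatever the key length.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        entries.append(f"{digest}={etag}")
    return ";".join(entries)


def header_size(n_assets, etag_len=6):
    """Size in characters of the header value for n_assets cached assets."""
    # Hypothetical asset names; only their count matters for the size.
    state = {f"/asset{i}.js": "x" * etag_len for i in range(n_assets)}
    return len(push_assets_header(state))


if __name__ == "__main__":
    for n in (2, 20, 200):
        print(n, header_size(n))
```

Under these assumptions the value grows linearly, at about 40
characters per cached asset with a 6-character ETag, so a client
holding 200 push-enabled assets would send roughly 8 KB on each
request.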

On 14 July 2016 at 06:01, Aaron L. Silvas <asilvas@godaddy.com> wrote:
> This draft aims to make HTTP/2 Server Push ready for prime time.
>
>
> Sure, I'll take a stab at a basic example. Excuse the formatting...
>
>
> * Request - 1st visit
>
> GET /
>   Push-Assets: *
> * PUSH_PROMISE - Promises sent by server
> GET /shared.js
> GET /shared.css
> * Server Push
> GET /shared.js
>   Status: 200 OK
>   Push-Asset-Key: /shared.js
>   Push-Asset-Match: *
>   ETag: abc123
> GET /shared.css
>   Status: 200 OK
>   Push-Asset-Key: /shared.css
>   Push-Asset-Match: *
>   ETag: 123abc
>   Cache-Control: public, max-age=123456
> * Response
> GET /
>   Status: 200 OK
>
> (navigate to another page)
>
> * Request - page transition
>
> GET /page2
>   Push-Assets: md5(/shared.js)=etag(abc123);md5(/shared.css)=no-push
> * Server Push
> GET /shared.js
>   Status: 304 Not Modified
>   Push-Asset-Key: /shared.js
>   Push-Asset-Match: *
>   ETag: abc123
> * Response
> GET /page2
>   Status: 200 OK
>
> As you can see, the benefits of providing server-state can be quite
> substantial. In the case of "blind" server push, where a server doesn't care
> about client state and just sends everything, every time, you'll get a lot of
> extra unpredictable waste, as the client will end up sending many RST_STREAM
> frames to cancel in-flight server pushes. This forces those who want to use
> Server Push to manage state by other means, namely cookies, which is
> hacky at best.
>
>
> Regarding (potential) tradeoffs with the other major alternative being
> proposed (Cache Digest), I'll copy/paste a response I gave to this question
> in another email:
>
> -------------
>
> Until we can test the two approaches side-by-side in real-world scenarios
> with tons of samples, it's hard (for me) to predict which is "better". But
> here are some tradeoffs:
>
> * Cache Digest uses considerably fewer bytes per cached resource -
> significance of delta unknown
> * Push-Assets is presumably simpler for client & server integrators, as it
> leverages HTTP Headers - potentially broader adoption
> * Push-Assets can be URI "bucketed" to prevent wasted bytes over the wire
> for large and complex websites - potential big benefits for
> multi-app-per-domain websites, CDNs, etc.
> * Push-Assets can be tuned by the app's developers to only enable server push
> for assets that make the most sense (for example, an app may decide that js
> & css should be pushed but images should not) - Cache Digest sends all state
> regardless of whether it is "push-enabled"
> * Push-Assets can work for non-HTML resources - potentially optimizing all
> HTTP traffic, such as APIs, or more real-world scenarios like CSS with
> imports
>
> Ultimately I believe both would be *very* positive solutions, as most
> visitors visit websites with few previously cached resources, and may
> greatly benefit from Server Push. A recent study was posted in this group
> which illustrates the point better than I can:
> https://if-report.shimmercat.com/dirhtml/
>
> -------------
>
>
>
>
> -aaron
>
> ________________________________
> From: Alcides Viamontes E <alcidesv@zunzun.se>
> Sent: Wednesday, July 13, 2016 10:16:24 AM
> To: Aaron L. Silvas
> Cc: Martin Thomson; HTTP Working Group
> Subject: Re: draft-asilvas-http-push-assets-00 comments
>
> Hello Aaron,
>
> I'm also very interested in understanding your proposal better. I have read
> your draft and then have followed the questions asked on this list and now
> your clarifications. However, neither your examples in the draft nor your
> latest email have helped me. Could you please repeat your examples with full
> request/response headers and with annotations about which requests are part
> of a PUSH_PROMISE?
>
> You mention that "browser and server adhere to a strict dependency state
> contract".... can you describe this contract?  Compared with the situation
> today, what are the major differences?
>
> Thanks in advance,
>
> ./Alcides.
>
>
>
> On Wed, Jul 13, 2016 at 6:24 PM, Aaron L. Silvas <asilvas@godaddy.com>
> wrote:
>>
>> Thanks for the feedback, Martin. I agree the document needs work to better
>> clarify things.
>>
>>
>> I'll attempt to address your comments here, in hopes of sparking further
>> conversation on the topic.
>>
>>
>> "Push-Assets" is the only request header; required only if the client
>> wishes to enable the full HTTP/2 Push-Assets flow as outlined in the draft.
>> If the server does not support/understand the header, it is benign. This
>> allows the client to inform the server of its cache state, for push-enabled
>> assets only (unlike the Cache Digest HTTP/2 proposal, which sends everything).
>> This header includes the exact state of each of these resources, as if they
>> were individually requested, and thus supports existing etag and
>> last-modified headers. Not only will the server know what resources the
>> client does and does not have, but it will also know which resources are
>> simply out of date and must still be pushed. The server won't even need to
>> send a 304 (Server Push) response for unmodified resources, as the server
>> knows the state of the client's push-enabled assets, and the client can
>> assume "no change" if Server Push is performed on the given resources. This
>> effectively means that the server will only ever send what is missing or
>> changed, no more, no less.
>>
>>
>> Example (requests only to keep length of email to a minimum):
>>
>>
>>   GET /page1
>>
>>   Push-Assets: *
>>
>>
>>   GET /page2
>>   Push-Assets: md5(shared-resource1.js)=etag(123456)
>>
>>
>> "Push-Asset-Key" is an optional response header. It allows the server to
>> "name" a resource, allowing it to be renamed at a later time without worry of
>> having to refetch unnecessarily. By default, the "key" of every resource is
>> the URI Path, minus any querystring parameters.
>>
>>
>> "Push-Asset-Key" is also a required PUSH_PROMISE header, which is likely
>> part of the confusion. Since a PUSH_PROMISE is essentially the server
>> delivering a request on its behalf, this header field informs the client
>> that this resource should be tracked as a "Push-Asset" (aka push-enabled).
>> The key itself is what uniquely identifies the resource, and will typically
>> be the URI Path of the resource, minus querystring parameters, but in MD5
>> form. The client will only ever provide client cache state of resources that
>> have responded with this header field, as they are "push-enabled". This
>> gives the server control of what state it should or should not track for the
>> purpose of Server Push resources.
>>
>>
>> "Push-Asset-Match" is an optional response header. This effectively allows
>> the server to inform the client that a given resource is only used within
>> specific "buckets" of matching URIs. This is especially useful for large or
>> complex domains, such as CDNs, or other multi-app-per-domain scenarios.
>>
>>
>>
>> I'll continue to collect feedback, and especially suggestions, and update
>> the next draft accordingly. Thanks again for the interest.
>>
>>
>>
>>
>> -aaron
>>
>> ________________________________
>> From: Martin Thomson <martin.thomson@gmail.com>
>> Sent: Tuesday, July 12, 2016 5:52:49 PM
>> To: HTTP Working Group; Aaron L. Silvas
>> Subject: draft-asilvas-http-push-assets-00 comments
>>
>> First, I think that there is an interesting idea hidden in here.  It
>> could be that it's complementary to the more generic digests idea.
>>
>> However, I found it impossible to determine how this document is
>> claiming to achieve its stated goals.  None of the examples include
>> header fields, which would have gone a long way to explaining this.
>> The new header fields don't really say what each is used for.  That
>> leaves me guessing about how this fits together.
>>
>> Here's my best guess, though I have to confess that I can't connect
>> this to what Section 4 says:
>>
>> On request N.  A server provides a new header field with responses
>> that create a secondary identifier for resources.  I'm really guessing
>> here, but I assume that unlike etag, this header field includes a
>> value that is the same for a group of resources.
>>
>> On request >N. Clients include a new header field with requests that
>> controls what is pushed.  If it includes '*', then everything is
>> pushed.  If it includes 'no-push', then nothing is pushed.  If it
>> includes a list of these new push-asset-keys, then anything matching
>> those keys is not pushed.
>>
>> Based on this, I'm fairly certain that I don't understand the
>> proposal, because this design doesn't require both Push-Asset-Key and
>> Push-Asset-Match header fields.  I'm clearly missing something.
>>
>> I did start to look at the code, but without a better overview of what
>> it aims to achieve, I'm afraid that I'm not going to get much from it.
>
>
>
>
> --
> Alcides Viamontes
> www.shimmercat.com
Received on Thursday, 14 July 2016 00:02:16 UTC