Re: Submitted new I-D: Cache Digests for HTTP/2 from Eliezer Croitoru on 2016-02-02 (ietf-http-wg@w3.org from January to March 2016)

From: Eliezer Croitoru <eliezer@ngtech.co.il>
Date: Tue, 02 Feb 2016 08:37:30 +0200
To: Kazuho Oku <kazuhooku@gmail.com>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Message-id: <56B04EAA.5010407@ngtech.co.il>
Thanks for the response!!
Your words make more sense to me now than before.

Eliezer

On 02/02/2016 03:45, Kazuho Oku wrote:
> Thank you for your feedback.
>
> 2016-01-27 11:45 GMT+09:00 Eliezer Croitoru <eliezer@ngtech.co.il>:
>> I would like to join the discussion at this point and ask the list, since I
>> couldn't follow exactly what was proposed and implemented and I want to
>> make sure I understand it correctly.
>>
>> On 22/01/2016 19:47, Richard Bradbury wrote:
>>>
>>> Hello. The general thrust of this I-D seems like a useful optimisation
>>> of HTTP/2 server push. It is wasteful to push a representation to a
>>> client when the client already has a fresh copy cached. But the reverse
>>> is equally true, I think...
>>
>>
>> In relation to the above quote, I would like to ask:
>> Which is more important, the client's resources or the server's?
>>
>> From what I understood, the basic proposal was to add the cache digest to
>> every request. Am I right? Is it still that way?
>
> The original draft adds a cache digest to every H2 connection.  Recent
> discussion has been about conveying the digest within every HTTP
> request as an HTTP header.
>
>> Apart from some privacy issues with sending the client cache-digest, even
>> with TLS being considered secure, there are other issues with it, for
>> example for mobile clients or metered WAN and LAN connections.
>> If the client sends some KB (which can be more than a couple of cookies) on
>> each request, then at 10KB per request 20 requests will use 10KB * 20 = 200KB,
>> which can become an issue for some, but not all, clients.
>
> In the case of HTTP/2, the overhead will be much less thanks to HPACK.
> With HPACK, a cache-digest that is sent repeatedly can typically be
> compressed to one or two octets.  And unless HTTP/2 is being used,
> there is practically no reason to send the cache digest over a public
> network; only HTTP/2 supports push.
>
>> Maybe for YouTube, which sends files/objects ranging from 3MB to 500MB and
>> more, it's not always an issue, but a site that sends/pushes X*3MB of images
>> for the homepage to a mobile app is kind of an issue. If I'm not wrong, this
>> is one of the reasons mod_pagespeed was designed: to somehow solve wrongly
>> consumed bandwidth.
>>
>> From my point of view and understanding, a cache-digest will probably require
>> some per-client "cache-digest dictionary", which can cause issues for
>> systems/servers with lots of clients/connections. The other side would be
>> the ongoing re-validation and maintenance of these dictionaries.
>
> That is a fair argument.  However, servers are already required to
> maintain such a dictionary for HTTP/2 (i.e. HPACK).
>
>> It opens both the clients and the servers to some vulnerabilities. Also what
>> would be the scope of the cache-digest, per connection? per request? per
>> some client session id?
>>
>> And to clarify some aspects: what would happen if the server (which in
>> many cases doesn't care about a couple of KB on the wire) sends a push offer
>> for 20 objects and the client declines each and every one of them with some
>> kind of 304?
>> - It will not require opening a new connection to the client and will use
>> the same open connection.
>> - It will not create a situation in which the client's resources (asymmetric
>> DSL clients) are exhausted (imagine an office with 100+ PCs and two
>> 15Mbit/1Mbit DSL connections).
>> - It will simplify the server software implementation and will remove the
>> need to store and look up the client "cache-digest dictionary" each and
>> every time.
>>
>> Also, if the HTML page contains the list of URLs for objects that the
>> client/browser can validate by itself in some way, why does the client need
>> to be pushed some objects/content? (I do not fully understand this yet.)
>> I am looking for a couple of scenarios that would justify and clarify the
>> need for such an implementation. Where is it needed other than for
>> advertisements?
>> My basic understanding is that a cache-digest doesn't help interactive
>> applications, chats, or real-time applications, since the content there is
>> always new or updated compared to what the client has. By comparison, a
>> static-files site may require the client to send the cache-digest once, but
>> not on each and every request.
>>
>> I am almost convinced that:
>> - Implementing a special "update" request for a specific set/batch of
>> files/objects will be much more efficient for both the client and the
>> server than sending the cache-digest even once in a header.
>> - Having the server push/offer 20 objects and be declined by the client
>> would be much better than having the client publish its list of existing
>> objects.
>
> Such an approach is already defined as part of HTTP/2.
>
> By using server-push defined in HTTP/2, it is possible for a server to
> start sending resources that are expected to be used by the client.
> However, the issue is that aggressively doing so wastes downstream
> bandwidth, since without knowing what is already cached by the client
> a server will repeatedly try to push the same objects (that are
> rejected every time by the client after it receives the pushed
> resource).
>
> This draft is an attempt to fix the problem, by eliminating the
> bandwidth you would waste if you push blindly, at the cost of some
> upstream bandwidth.
>
>> - For a client that doesn't mind sending the header for 20 objects, it would
>> be pointless not to send if-modified-X requests for each and every one of
>> these objects individually.
>> - There are some security risks in the client sending a cache-digest in a
>> specific scope which I would like to read about.
>>
>> Thanks,
>> Eliezer
>>
>>
>
>
>
Received on Tuesday, 2 February 2016 06:38:06 UTC
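
The upstream-bandwidth exchange in the thread (10KB on each of 20 requests
versus "one or two octets" per repeat with HPACK) can be put into numbers with
a rough sketch. Everything here is an illustrative assumption rather than
something taken from the draft: the function name upstream_bytes, the header
name "cache-digest", and the two-octet cost of an indexed repeat (the upper
bound mentioned in the reply). The sketch ignores Huffman coding and
literal-encoding overhead. It does model one HPACK rule from RFC 7541: an
entry's size is name length + value length + 32 octets, and an entry larger
than the dynamic table (4096 octets by default) is not indexed, so it would be
resent in full every time.

```python
def upstream_bytes(digest_bytes, requests, indexed_cost=2, table_size=4096):
    """Estimate octets sent upstream for a repeated cache-digest header.

    Returns (naive, with_hpack):
      naive      -- digest repeated in full on every request
      with_hpack -- digest sent once as a literal, then referenced by index
    """
    name = b"cache-digest"  # assumed header name, for illustration only
    # HPACK dynamic-table entry size: name + value + 32 octets of overhead.
    entry_size = len(name) + digest_bytes + 32

    naive = digest_bytes * requests
    if entry_size > table_size:
        # Too large for the dynamic table: it cannot be indexed, so each
        # request carries the full literal again.
        return naive, naive

    # First request carries the literal; later ones cost ~indexed_cost each.
    with_hpack = digest_bytes + indexed_cost * (requests - 1)
    return naive, with_hpack


if __name__ == "__main__":
    # The 10KB * 20 example from the thread, with the default table size.
    print(upstream_bytes(10_000, 20))
    # Same digest, assuming the peer allows a 16384-octet table.
    print(upstream_bytes(10_000, 20, table_size=16_384))
    # A 1KB digest fits in the default table.
    print(upstream_bytes(1_000, 20))
```

Under these assumptions the saving depends on the digest fitting in the
dynamic table: a 10KB digest only benefits if the peer advertises a larger
table, while a digest of about 1KB is indexed with the default settings.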