Re: Making Implicit C-E work. from Roberto Peon on 2014-04-30 (ietf-http-wg@w3.org from April to June 2014)

From: Roberto Peon <grmocg@gmail.com>
Date: Wed, 30 Apr 2014 03:26:54 -0700
To: Roland Zink <roland@zinks.de>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <CAP+FsNe++2F0WMttisRbA1_JVFU6WJuGL8GX+Jk0H62jSi6ZbA@mail.gmail.com>
On Wed, Apr 30, 2014 at 2:35 AM, Roland Zink <roland@zinks.de> wrote:

>  On 30.04.2014 08:44, Roberto Peon wrote:
>
> >>> As I described it, its use by the originator of an entity is not
> mandated, instead behaviors are mandated of recipients when it IS used.
>
>>  >>>
>> >>
>> >> Yeah, mandating it. Which I'm not happy about.
>> >
>> > Mandates support, not use.
>> >
>>
>> Kind of the same thing, from the client's POV. Server's choice.
>>
>
>    For C-E this means for example the server decides if the client can do
> a seek. Some interactive clients would prefer to do seeks over getting the
> content compressed. Whereas downloaders would prefer the content to be
> compressed. T-E would allow to have both seek and compression.
>

The server *always* decides what to send, whether c-e or t-e is used. The
fact that the server *may* use gzip does not *require* it to use gzip, and
with my proposal, the server knows if the client requested it explicitly or
not, and it certainly can see if there is a range request and make the
appropriate response.
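
To make that concrete, here is a minimal sketch of the decision a server could make. The names (Request, choose_content_coding, gzip_when_implicit) are hypothetical and not from any spec or implementation, and the policy shown is only one a server might pick:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Request:
    accept_encoding: Optional[str] = None  # explicit Accept-Encoding value, if sent
    range_header: Optional[str] = None     # Range value, if sent


def client_asked_for_gzip(req: Request) -> bool:
    """True if the client explicitly listed gzip (q-values ignored for brevity)."""
    if req.accept_encoding is None:
        return False
    tokens = [t.split(";")[0].strip().lower() for t in req.accept_encoding.split(",")]
    return "gzip" in tokens


def choose_content_coding(req: Request, gzip_copy_available: bool,
                          gzip_when_implicit: bool = True) -> str:
    """Return "gzip" or "identity". The server MAY compress; it is never required to."""
    if req.range_header is not None:
        return "identity"          # assumed policy: serve ranges uncompressed
    if not gzip_copy_available:
        return "identity"          # nothing precompressed to hand out
    if client_asked_for_gzip(req):
        return "gzip"              # explicit request
    # Implicit support only: the server can tell, and decides for itself.
    return "gzip" if gzip_when_implicit else "identity"
```

The point is only that the explicit/implicit distinction and the presence of a range request are both visible to the server before it picks a response.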

T-E is theoretically wonderful if one ignores real deployments in today's
world where the majority of HTTP/1.X servers don't actually do
transfer-encoding: gzip, and thus HTTP2 gateways would have to do c-e to
t-e translation (which might be rather error prone in its own way) or have
to bear the expense of doing the compression themselves-- something which
is untenable. This ignores the security issue of knowing when t-e is
safe, which I'll address again below.



>
>   And today it is often neither the server nor the client's choice, which
> is what is causing the pain. The client expresses that it wants gzip. The
> intermediary doesn't do it because it makes numbers better, increases
> throughput, or because they're too lazy to implement it, all at the cost
> of the decreased user experience.
>
>
>> <snip>
>> >
>> > The combination of intermediaries stripping a-e plus the competitive
>> driver to deliver good experience/latency is causing interop failures today
>> where servers will send gzip'd data whether or not the client declares
>> support in a-e.
>> >
>>
>> Wait, you're saying the whole motivator here is that servers don't comply
>> with the protocol? So you're changing the protocol to accommodate them?
>> That does not feel right to me, at all; it's not just blessing a potential
>> misuse of C-E, it's wallpapering over a flat out abuse.
>>
> Partially.
> I'm saying that intermediaries are doing things which are incenting
> implementors to break compatibility with the spec, and that implementors
> are doing so because it makes the users happy.
> In the end, making the users happy is what matters, both commercially and
> privately. The users really don't care about purity, and will migrate to
> implementations that give them good/better user experience.
>
>  But even so, why do you have to fix it in HTTP/2? And why does it hurt
>> h2 to *not* fix it?
>>
>
>  Compression is an important part of making latency decrease/performance
> increase, and, frankly, there is little practical motivation to deploy
> HTTP/2 if it doesn't succeed in reducing latency/increasing performance.
> Success isn't (or shouldn't be) defined as completing a protocol spec, but
> rather, getting an interoperable protocol deployed. If it doesn't get
> deployed, the effort is wasted. If it doesn't solve real problems, the
> effort is wasted.
>
>  In any case, I cannot reliably deploy a T-e based compression solution.
> T-e based compression costs too much CPU, especially as compared with c-e
> where one simply compresses any static entity once and decompresses (which
> is cheap) as necessary at the gateway.
>
> If it is really T-E you can do the same compression of static entities
> when the whole file is delivered, it would be different for range requests
> or the frame based approach.
>

Not how we've thus far spec'd it.


>   T-e based compression isn't as performant in terms of
> compression/deflation ratios.
>
> Don't think this is true, the same bytes can be sent as either T-E or C-E.
> For the frame based approach some numbers were given.
>

The same bytes can't be sent in both, unless we're willing to suffer
vastly increased DoS surface area and memory usage OR we do the frame-based
approach, which will have marginally worse compression.
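
A rough way to see the frame-based cost, using Python's zlib. The payload and the 16 KiB FRAME_SIZE are made up for illustration and the numbers will vary with real content; this is a sketch, not a measurement of any proposal:

```python
import zlib

FRAME_SIZE = 16 * 1024  # arbitrary frame size, chosen only for this sketch

# Made-up, fairly repetitive payload standing in for a text resource.
payload = b"".join(
    f"<li><a href='/static/item/{i % 97}/page{i % 13}.html'>item {i}</a></li>\n".encode()
    for i in range(4000)
)

# One compression context over the whole entity.
whole = zlib.compress(payload)

# Each frame compressed on its own: no history carries across frame boundaries.
frames = [payload[i:i + FRAME_SIZE] for i in range(0, len(payload), FRAME_SIZE)]
framed = [zlib.compress(f) for f in frames]

whole_total = len(whole)
framed_total = sum(len(f) for f in framed)

# Both forms decode back to the same bytes.
roundtrip = b"".join(zlib.decompress(f) for f in framed)

print(len(payload), whole_total, framed_total)
```

Per-frame compression restarts the dictionary at every frame, so it gives up some ratio in exchange for not holding a long-lived context per stream.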

>   Many deployed clients/servers wouldn't correctly support it.
>
> There are no deployed HTTP2 clients or servers, or are there some?
>

There are, but I'm not talking about those. My problem is dealing with the
rest of the world, which is mostly HTTP/1.X and is unlikely to rapidly
change.
In other words, I'm concerned mainly with HTTP/1.X clients and especially
servers.

>   T-e would require that any gateway acting as a loadbalancer/reverse
> proxy would either need to know which resources it could compress,  or
> forces us to not use compression.
>
> The gateway can forward the compressed content unmodified. The gateway is
> only forced to do something if either the server or the client doesn't
> support compression.
>

The gateway cannot know which resources it is safe to compress without
something outside the protocol. Compression via t-e without knowing whether
it is safe or not allows attackers to discern ostensibly secret
information. This is NOT acceptable.
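
For anyone who hasn't seen the leak, here is a toy illustration with Python's zlib. The secret, the page layout, and the function name are all invented for the example; it only shows that compressed length depends on whether attacker-supplied text matches the secret:

```python
import zlib

# Made-up secret that the attacker cannot read directly.
SECRET = b"session_token=7f3a9c2e1b5d8a4f"


def compressed_len(attacker_text: bytes) -> int:
    """Length of a response that mixes attacker-controlled text with the secret."""
    body = b"<p>" + attacker_text + b"</p><input value='" + SECRET + b"'>"
    return len(zlib.compress(body))


# A guess that matches the secret compresses better than one that does not,
# because the second copy is encoded as a back-reference to the first.
right = compressed_len(b"session_token=7f3a9c2e1b5d8a4f")
wrong = compressed_len(b"session_token=0123456789abcdef")

print(right, wrong)
```

An observer who sees only lengths on the wire can tell the two guesses apart, which is why something has to know whether a given resource is safe to compress.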

>   Knowing what resources to compress either requires an oracle, or
> requires content authors to change how they author content (*really* not
> likely to happen),
>
>    Not sure that authors want to know about compression. If it is
> automatic then this would be fine. Currently there is server configuration,
> for example zlib.output_compression in php.ini, and the possibility to do
> this in the content, for example in PHP something like
> ob_start('ob_gzhandler'). I guess there is a lot more that authors are not
> aware of.
>

We definitely don't want to cause content authors to lose what little
control (and understanding) they have today, especially over matters
touching security like compression.
In general, if a resource wasn't compressed on output from an endpoint, it
shouldn't be when received by any other endpoint.
Given the necessity of interfacing with HTTP/1 servers, which rarely
support T-E: gzip, this ends up being a problem for HTTP/2 and T-e: gzip.


>    <snip>
>>
>> > The proxy, when forwarding the server's response to the HTTP/1 client,
>> must ensure that the data is uncompressed when forwarding to the HTTP/1
>> client since the client didn't ask for c-e gzip.
>> >
>>
>> Cache-Control:no-transform explicitly forbids the proxy from altering the
>> representation. It's not allowed to decompress it.
>>
> In fact what we're doing is offering two representations simultaneously.
>
>
> Don't think this will really work. Only one can be delivered and translating
> between them (for example range) seems to be difficult.
>

Why is range a problem? If a server wishes to service a range request, it
need not compress the output.
With the proposal, the server will be able to know if the client requested
gzip explicitly or not and make the correct decision w.r.t. what to output.
This is the same as it is today with HTTP/1.

-=R
Received on Wednesday, 30 April 2014 10:27:22 UTC