Re: Making Implicit C-E work. from Roberto Peon on 2014-05-02 (ietf-http-wg@w3.org from April to June 2014)

From: Roberto Peon <grmocg@gmail.com>
Date: Fri, 2 May 2014 15:31:23 -0700
To: K.Morgan@iaea.org
Cc: Matthew Kerwin <matthew@kerwin.net.au>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <CAP+FsNcO46XKuLgpXQK=GhNFSJvrJpvt4Q_DkjzAMkGsMcTarQ@mail.gmail.com>
On Thu, May 1, 2014 at 2:57 PM, <K.Morgan@iaea.org> wrote:

> In many different e-mails grmocg@gmail.com wrote:
>
> > I have a real honest-to-goodness pragmatic deployment problem (myriad
> > pre-existing servers/clients whose deployment I do not and cannot
> > control) here that I cannot wish away ...
> >
> > Many of my customers will not be writing custom servers, and as such to
> > be deployable, we need solutions that will work with what is out
> > there. ...
> >
> > My problem is dealing with the rest of the world, which is mostly
> > HTTP/1.X and is unlikely to rapidly change.
> > In other words, I'm concerned mainly with HTTP/1.X clients and
> > especially servers. ...
>
>
>
> I honestly want to understand your problem, but I don't.  It might be
> really helpful if you could write down the hypothetical network connections
> between client and server (including as many potential intermediaries as
> you know about) and denote either HTTP/1.x or HTTP/2 for every hop.
>



  Browser <-h2-> Virus Scanning Software <-h2-> Corporate Proxy <-h2->
  Reverse Proxy/Loadbalancer <-h1-> Reverse Proxy/Loadbalancer(2) <-h1->
  Servers
  The fat+short pipe starts after the first 'Reverse Proxy' here.

Other things that definitely happen:

  Browser <-h1-> Virus Scanning Software <-h1-> Corporate Proxy <-h1->
  Reverse Proxy/Loadbalancer <-h2-> Reverse Proxy/Loadbalancer(2) <-h1->
  Servers
  The fat+short pipe starts after the *second* 'Reverse Proxy' here.


I'm sure there are myriad other things that happen here, that I'm not
listing.


>
>
> I've been sitting here trying to understand how implicit C-E gzip helps
> based on what you've already said and I still don't get it.
>
>
>
> You said you are concerned especially about HTTP/1.X servers.  But your
> proposal can't help because the 1.X servers won't know about the implicit
> gzip and certainly won't know to add the "uncompressed-*" headers.
>


This is assuming that those parts aren't delegated to the loadbalancer,
which I see often.
The server ultimately always controls if the content could have ever been
compressed (the loadbalancer never compresses).
In other words, there are architectural simplifications that can be made
when the server delegates certain responsibilities to a loadbalancer/proxy.
The choice of whether or not compression is allowed/used is always up to
the server.

Let's ignore that, though.
To support t-e: gzip, one must change the server framework, not simply
business logic.


>
>
>
> In the opposite direction, if the server is HTTP/2 and the client is
> HTTP/1.X, the body will have to be decompressed anyway so you only benefit
> for the hops that are HTTP/2.  Is that really that beneficial?
>
>
huh?


>
>
> Between an HTTP/2 client and an HTTP/2 server, we can already do something
> about it without breaking things i.e. some sort of transfer coding.
>
>
>
Agreed, but that isn't the use-case that I'm worried about. I have myriad
servers which I don't control which I won't be able to make changes to.


>
>
>
> > The same bytes can't be sent in both [C-E and T-E], unless we're
> > willing to suffer vastly increased DoS surface area and memory usage
> > [for T-E] ...
>
> I don't buy the "vastly increased DoS surface area" argument. The
> receivers already have this "DoS surface area" for decompressing C-E gzip
> entities.  If the receiver of a stream has a gzip decompression context for
> decompressing a C-E gzip entity OR a T-E gzip message, they still just have
> one context.  If a receiver has N outstanding streams it has N outstanding
> gzip contexts, even if they're all C-E gzip.  Sure stupid servers could do
> both C-E and T-E, but it's near impossible to design a protocol that
> prevents implementors from doing every stupid thing imaginable.
>
> So, yes the same bytes could be sent in both cases (C-E or T-E) and it's
> actually probably the right way to do T-E (as opposed to the
> frame-by-frame).
>
>
This part of the discussion is in the other thread. I'll leave it there :)


>
>
>
> > I cannot reliably deploy a T-e based compression solution.
> > T-e based compression costs too much CPU, especially as compared with
> > c-e where one simply compresses any static entity once and decompresses
> > (which is cheap) as necessary at the gateway.
>
> You're comparing apples and oranges. Dynamic gzip has the same cpu cost
> whether it's called C-E or T-E. For static entities, as I pointed out
> above, you can pre-compress once and send the same bytes as either C-E or
> T-E.
>

The issue isn't just the cost, but *where* the cost is borne.
The cost should not be borne by the http2 gateway in the HTTP1 server ->
client direction.
Additionally, with c-e, remember that for static resources, the compression
cost is incurred only once, not once for every request.
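
To illustrate the compress-once point, here is a minimal sketch. The
resource body, the serve() helper and its simplified Accept-Encoding
handling (q-values are ignored) are assumptions made up for this
example, not anything from a real server or gateway.

```python
import gzip
from typing import Dict, Tuple

# Hypothetical static resource; in practice this would come from disk.
STATIC_BODY = b"<html>" + b"hello world " * 1000 + b"</html>"

# The compression cost is paid once, when the resource is published.
PRECOMPRESSED = gzip.compress(STATIC_BODY)


def serve(accept_encoding: str) -> Tuple[Dict[str, str], bytes]:
    """Return (headers, body) for one request without recompressing.

    Simplification: q-values are ignored, so "gzip;q=0" would be treated
    as acceptable here. A real implementation must honour them.
    """
    tokens = {t.split(";")[0].strip().lower() for t in accept_encoding.split(",")}
    if "gzip" in tokens:
        # Same stored bytes for every request that accepts gzip.
        return {"content-encoding": "gzip"}, PRECOMPRESSED
    # Only the (cheaper) decompression is done per request.
    return {}, gzip.decompress(PRECOMPRESSED)
```

Every request that accepts gzip gets the same stored bytes; only a
client that did not ask for gzip triggers per-request work, and that
work is decompression rather than compression.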


>
> > T-e based compression isn't as performant in terms of
> > compression/deflation ratios.
>
> LOL. I assume you could only possibly be talking about frame-by-frame T-E.
> Ironically, in another thread you said the performance difference for
> frame-by-frame is negligible (and we sent out data to back that up).
>

Correct. And it isn't a big difference, but it is there. The difference
gets bigger the smaller the frame size. Note that I prefer frame-based t-e
to segment-based t-e, and we're discussing that in another thread.
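
A rough way to see the frame size effect is sketched below. It assumes
"frame-based" means every frame is compressed independently, with no
shared compression context. The sample data and the helper names are
invented for the example, and the exact numbers will depend on the
payload.

```python
import zlib


def whole_size(data: bytes) -> int:
    """Compressed size when the entire body is one zlib stream."""
    return len(zlib.compress(data))


def framed_size(data: bytes, frame_size: int) -> int:
    """Total compressed size when each frame is compressed on its own."""
    total = 0
    for start in range(0, len(data), frame_size):
        # Each frame starts with an empty dictionary and pays its own
        # stream header/trailer overhead.
        total += len(zlib.compress(data[start:start + frame_size]))
    return total


# Hypothetical, highly repetitive payload (roughly 100 KB).
SAMPLE = b"".join(
    b"GET /resource/%d HTTP/1.1\r\nHost: example.com\r\n\r\n" % i
    for i in range(2000)
)

if __name__ == "__main__":
    for frame_size in (256, 1024, 4096, 16384):
        print(frame_size, framed_size(SAMPLE, frame_size))
    print("whole", whole_size(SAMPLE))
```

On this kind of input the per-frame total grows as the frame size
shrinks, because each frame loses the history built up by the previous
ones.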


>
> > Many deployed clients/servers wouldn't correctly support it.
>
> I know of no clients/servers that *incorrectly* support T-E.  I assume you
> just mean wouldn't support it at all?
>
>
Correct. Most clients/servers that I know of don't support anything other
than t-e: chunked.


> > T-e would require that any gateway acting as a loadbalancer/reverse
> > proxy would either need to know which resources it could compress, or
> > forces us to not use compression. Knowing what resources to compress
> > either requires an oracle, or requires content authors to change how
> > they author content (*really* not likely to happen),
>
>
>
> How do they solve the same problem with C-E?  The server decides, right?
>  Same thing for T-E.
>

Correct, except that in current deployments servers know about c-e and not
about t-e. Thus, an http2 gateway can be deployed which is able to serve
compressed resources with c-e, but not with t-e thanks to the lack of
server support for t-e.


>
>
>
>
>
> > [With implicit C-E gzip] [a]n HTTP/1 client still receives an
> > uncompressed entity if it didn't request it compressed.
>
> What are _HTTP/2_ clients that really want an uncompressed (identity)
> entity supposed to do?
>
> Accept-Encoding: identity
>
> Implicit-Accept-Encoding: pretty please don't use gzip
>

They're supposed to support being able to un-gzip.


>
>
>
>
>
> In addition to everything above, I noticed that you still haven't
> responded to concerns that Matthew & Julian brought up. It would be good
> for you to update your proposal to address these concerns...
>
>
>
> - Matthew wrote: "[What about] last-modified? And, possibly, other fields
> from Vary? There's more entity-specific metadata than just ETag."
>
>
>
The rule is simple: ":uncompressed-foo" replaces "foo" if decompression is
done.
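
As a sketch of how a gateway might apply that rule after it has
decompressed a body: the function name, the dict-of-lowercase-names
representation and the choice to drop content-encoding are assumptions
for illustration only, not part of any written-up proposal.

```python
from typing import Dict

PREFIX = ":uncompressed-"


def headers_after_decompression(headers: Dict[str, str]) -> Dict[str, str]:
    """Apply the ':uncompressed-foo' replaces 'foo' rule to a header map.

    Assumptions: header names are already lowercase, each name appears
    once, and content-encoding is dropped because the body is no longer
    gzipped.
    """
    out: Dict[str, str] = {}
    # Copy the ordinary fields, leaving out the ones that described the
    # compressed form only.
    for name, value in headers.items():
        if name.startswith(PREFIX) or name == "content-encoding":
            continue
        out[name] = value
    # Each ':uncompressed-foo' overrides (or introduces) 'foo'.
    for name, value in headers.items():
        if name.startswith(PREFIX):
            out[name[len(PREFIX):]] = value
    return out
```

For example, a response carrying both etag and :uncompressed-etag would
be forwarded with only etag, set to the uncompressed variant's value.
Fields with no ":uncompressed-" counterpart pass through unchanged.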


>
> - Matthew also wrote: "Cache-Control:no-transform explicitly forbids the
> proxy from altering the representation. It's not allowed to decompress it."
>

I've addressed this. It isn't altering the representation if it is
effectively transmitting both.


>
>
>
> - And elsewhere Julian brought up the issue of entity-specific metadata in
> the body itself. He used the example of WebDav. Do you expect
> intermediaries to parse through bodies and patch up that metadata too?
>
>
No. I'd expect the origin to not produce the ":uncompressed-foo" fields if
this caused problems.




-=R
Received on Friday, 2 May 2014 22:31:51 UTC