Re: Stuck in a train -- reading HTTP/2 draft. from Greg Wilkins on 2014-06-18 (ietf-http-wg@w3.org from April to June 2014)

From: Greg Wilkins <gregw@intalio.com>
Date: Wed, 18 Jun 2014 09:39:35 +0200
To: Poul-Henning Kamp <phk@phk.freebsd.dk>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <CAH_y2NGJXBDk4WAnp6BqKEq45qqUHcZ8OXGAjK-yi8Z2_8MR3Q@mail.gmail.com>
On 17 June 2014 23:04, Poul-Henning Kamp <phk@phk.freebsd.dk> wrote:

> In message <
> CAH_y2NHRz6khc61+c-7jDQJK56PayO_3Q4ExVOM_Yjmhr_n3Lw@mail.gmail.com>
> , Greg Wilkins writes:
>
>

The way I have understood it, CONTINUATION makes sense if you are going to
make END_STREAM a function of frame type.   Making END_STREAM a function of
frame type is a consequence of supporting arbitrarily large headers that
are compressed with a shared state table.   A compression state table gives
very good compression and is needed if you are going to fit 80+ requests
into a single slow start TCP window.   As for why the compression state
table has to be shared: well, I think that was a simplification that has
resulted in a very complex emergent design.



> >> > For obvious DoS reasons, it's a mistake to not apply flowcontrol
> >> > to HEADERS.
> >
> >Indeed.  However I do not think that strict DoS is the main concern
> >here.    Instead I believe there will eventually be collusions between
> >applications, clients and servers to use the lack of flow control to
> >attempt to gain benefits vs other traffic on shared connections.
>
> That's sort of comparing grand bank robbery to petty candy theft...
>

I don't think so.   Sure, full-on DoS attacks get all the media
attention, just as grand bank robberies do, but it is the taking of a
little bit more candy from every server connection that will really cost
deployers serious real money.   For example, the amount of real value taken
from my users (jetty server) by the decision to break the two connection
limit far exceeds any losses made by the occasional real DoS.

Creating an incentive to send large headers is just going to put more and
more pressure on servers to support large headers.  We mostly get away with
8KB limits now, but even a modest increase to 16KB represents a massive
investment in memory when you multiply by all the outstanding HTTP
requests.  How many billions of requests are open this instant?  Multiply
that by 8KB or more and that is a significant amount of extra memory that
we are asking the world to purchase!
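
A back-of-envelope version of that multiplication, as a small Python
sketch.  The request count is a made-up illustrative figure, not a
measurement, and the function name is hypothetical; it assumes the worst
case where every open request holds a full header buffer.

```python
# Worst-case memory reserved for request headers.
# open_requests below is a hypothetical figure chosen for illustration.
KIB = 1024
GIB = 2 ** 30


def header_memory_bytes(open_requests: int, header_limit_kib: int) -> int:
    """Bytes needed if every open request holds a full header buffer."""
    return open_requests * header_limit_kib * KIB


open_requests = 1_000_000  # hypothetical: one million requests in flight
at_8k = header_memory_bytes(open_requests, 8)
at_16k = header_memory_bytes(open_requests, 16)

print(f"8KB limit:  {at_8k / GIB:.2f} GiB")
print(f"16KB limit: {at_16k / GIB:.2f} GiB")
print(f"extra:      {(at_16k - at_8k) / GIB:.2f} GiB")
```

Under these assumptions, doubling the limit doubles the worst-case
reservation: roughly 7.6 GiB more per million open requests.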

Sure, servers can reject big headers, but the incentive will now exist to
push that limit, and we have seen with the 2 connection limit that such
incentives can create pressure that eventually breaks limits that were
applied for rational reasons.

Currently a single uncompressed frame of headers already far exceeds the
resource commitment most servers are prepared to make to an HTTP request, as
you have to hold those headers for the full duration of request handling
(even when waiting for a slow DB, we can put IO buffers back in the pool,
but not headers).     The numbers were published here recently, and 16KB
headers would cover around 99.9% of traffic.  However, there is apparently
0.1% of traffic that does use larger headers, so according to our
charter we must support them.       The moment you allow headers to be in
more than 1 frame, the single shared state table becomes the limiting
factor for applying flow control etc.


>>However, it is nigh impossible to flow control headers when we have decided
>>to have a single state table for [...]

>I really don't see why you can't just count the frames containing
>the headers towards the windows, compressed or not compressed ?

Because of the single shared hpack state table.  You don't know how large
your header frame(s) are until you start compressing them.  Once you have
started compressing them, you have started mutating the shared hpack state,
so no other headers can be compressed.  So you have to be able to finish
compressing them and transmit them so other streams can access the shared
hpack state.   If you flow control headers, then you might not be able to
send them, so you will have effectively locked the entire connection and
not just one stream.
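
To make the lock-up concrete, here is a toy sketch in Python.  It is not
hpack: the ToyTable class and its encoding are invented for illustration,
and real hpack differs in every detail.  It only shows that a block encoded
second can reference state created by the block encoded first, so the
second cannot be delivered while the first is held back.

```python
# Toy model of a shared header compression table.  NOT hpack: the class
# and its wire format are hypothetical, kept just detailed enough to show
# why encoded header blocks must be delivered in encoding order.

class ToyTable:
    def __init__(self):
        self.entries = []

    def encode(self, headers):
        block = []
        for header in headers:
            if header in self.entries:
                # Already known: send only a reference to the shared state.
                block.append(("idx", self.entries.index(header)))
            else:
                # New header: send it literally and mutate the shared state.
                block.append(("lit", header))
                self.entries.append(header)
        return block

    def decode(self, block):
        headers = []
        for kind, value in block:
            if kind == "idx":
                headers.append(self.entries[value])
            else:
                headers.append(value)
                self.entries.append(value)
        return headers


def decodes(table, block):
    """True if the block can be decoded against the table's current state."""
    try:
        table.decode(block)
        return True
    except IndexError:
        return False


sender = ToyTable()
cookie = ("cookie", "session=abc")
block_1 = sender.encode([cookie])                     # headers for stream 1
block_3 = sender.encode([cookie, ("accept", "*/*")])  # headers for stream 3

# Delivered in encoding order, both blocks decode.
in_order = ToyTable()
ok_in_order = decodes(in_order, block_1) and decodes(in_order, block_3)

# If stream 1 were blocked by flow control and stream 3 sent first,
# the receiver would be missing the entry that block_3 refers to.
out_of_order = ToyTable()
ok_out_of_order = decodes(out_of_order, block_3)

print(ok_in_order, ok_out_of_order)
```

In this sketch the only safe options for the sender are to finish sending
stream 1's headers or to stall every other stream behind them.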

It all comes back to hpack.    If we want simple framing that is not type
dependent and universal flow control, then we need to come up with a
way to avoid every stream mutating a shared state table.



> >It is indeed non obvious.  But I have tried many times to come up with a
> >way of removing them and I have concluded that they are a necessary
> >consequence of other design decisions - specifically the single shared
> >state table of header encoding.
>
> But what difference does it make that you send:
>
>         HEADERS
>         CONTINUATION
>         CONTINUATION + END_HEADERS
>
> vs.
>
>         HEADERS
>         HEADERS
>         HEADERS + END_HEADERS
>
> I don't see any value added from having two different frame types ?
>


I basically agree with you, and made pretty much the same comments when I
tuned in here a month ago.  But let me try to explain some of the reasoning
for this design decision as I have understood it.

CONTINUATION is used for both HEADERS and PUSH_PROMISE.  Both HEADERS and
PUSH_PROMISE have extra fields that do not make sense in repeated instances
of themselves, e.g. PUSH_PROMISE has a promised stream ID and HEADERS can
have priority fields.   If we went for HEADERS+ and PUSH_PROMISE+, then
we'd have to make using these fields in continuations illegal, which is
just as fiddly as having a dedicated continuation frame.
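
Whichever frame types are used, a receiver still has to check that a header
block is contiguous on the connection.  A rough Python sketch of that
check, assuming a made-up Frame tuple (its field names are illustrative,
not taken from the draft):

```python
from typing import Iterable, NamedTuple


class Frame(NamedTuple):
    # Hypothetical, simplified view of a frame; not the draft's layout.
    kind: str                 # "HEADERS", "PUSH_PROMISE", "CONTINUATION", "DATA", ...
    stream_id: int
    end_headers: bool = False


HEADER_BLOCK_KINDS = ("HEADERS", "PUSH_PROMISE", "CONTINUATION")


def header_blocks_contiguous(frames: Iterable[Frame]) -> bool:
    """Check that every header block is sent without interleaving.

    Returns False if a CONTINUATION appears with no open header block,
    if any other frame interrupts an open block, or if the sequence
    ends with a block still open.
    """
    open_stream = None  # stream whose header block is still unfinished
    for frame in frames:
        if open_stream is not None:
            if frame.kind != "CONTINUATION" or frame.stream_id != open_stream:
                return False
        elif frame.kind == "CONTINUATION":
            return False
        if frame.kind in HEADER_BLOCK_KINDS:
            open_stream = None if frame.end_headers else frame.stream_id
    return open_stream is None


good = [Frame("HEADERS", 1), Frame("CONTINUATION", 1),
        Frame("CONTINUATION", 1, end_headers=True)]
interleaved = [Frame("HEADERS", 1), Frame("DATA", 3),
               Frame("CONTINUATION", 1, end_headers=True)]

print(header_blocks_contiguous(good), header_blocks_contiguous(interleaved))
```

Renaming CONTINUATION to HEADERS would change the kind comparison but not
the shape of this check.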

Besides, the thing I don't like about CONTINUATION is not that it exists,
but that it does not carry the END_STREAM bit, and I think that all frames
should.   I believe a stream should exist from the first time its ID is
mentioned until it sees a frame with the end stream bit set - and that
logic should not have to consider frame types.   The reason it does not
carry the END_STREAM bit is that a PUSH_PROMISE and its continuation
cannot legally end a stream, so they do not have the END_STREAM bit.  The
solution to that is of course to have END_STREAM bit on all frames and if
somebody sends an illegal sequence of frame types, then that is a protocol
error (and removing the END_STREAM bit from some frames does not avoid
illegal sequences of frame types).
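
A minimal Python sketch of that frame-type-independent lifetime rule.  It
illustrates the idea argued for above, not what the draft specifies: it
assumes every frame (including CONTINUATION) can carry an end-stream flag,
tracks only one direction of a stream, and uses a hypothetical tuple layout
for frames.

```python
def track_streams(frames):
    """Return (live, ended) sets of stream IDs after processing frames.

    frames is an iterable of (frame_type, stream_id, end_stream) tuples.
    frame_type is carried along but deliberately never inspected.
    """
    live, ended = set(), set()
    for _frame_type, stream_id, end_stream in frames:
        if stream_id in ended:
            raise ValueError(f"frame on ended stream {stream_id}")
        live.add(stream_id)          # a stream exists from its first mention
        if end_stream:
            live.remove(stream_id)   # and ends at the first end-stream bit
            ended.add(stream_id)
    return live, ended


live, ended = track_streams([
    ("HEADERS", 1, False),
    ("DATA", 1, False),
    ("HEADERS", 3, True),
    ("CONTINUATION", 1, True),  # hypothetical: not allowed by the draft
])
print(sorted(live), sorted(ended))
```

Checking that the sequence of frame types is legal would be a separate
step that reports a protocol error.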

But because we allow HEADERS in the middle of a data stream (and the frame
should be called META_DATA, as it is crazy to have trailers carried in a
header), and this mid stream meta data does not half close the stream, we
need to know if a HEADERS frame is mid stream or is a trailer that is half
closing the stream.  Thus we would end up with an
I_AM_GOING_TO_SEND_END_STREAM_SOON
bit.    The reason for this is because we cannot interleave other frames
between HEADERS and CONTINUATIONS because of the single shared state table
of hpack.  Currently the state machine is vaguely like the simple one in
the draft because of the simplification that we can transition to half
closed once a HEADERS with END_STREAM bit set is received/sent.     Without
that simplifying assumption the real state machine is closer to the one I
proposed here  https://github.com/http2/http2-spec/issues/484
(as it is, I think the 5 paragraphs of text explaining the Closed state
indicate the draft is an oversimplification, as if the TCP state machine
left TIME_WAIT and CLOSE_WAIT as exercises for the reader)

I tried several times to come up with a proposal to remove CONTINUATIONs
and/or to have EOS on all frames.  I could not find a reasonable
simplification that worked without needing to throw out the concept of a
shared compression state table.

cheers


-- 
Greg Wilkins <gregw@intalio.com>
http://eclipse.org/jetty HTTP, SPDY, Websocket server and client that scales
http://www.webtide.com  advice and support for jetty and cometd.
Received on Wednesday, 18 June 2014 07:40:04 UTC