Re: Straw Poll: Restore Header Table and Static Table Indices

I suppose I'm pretty biased in-so-far as most work I've done in the
last several years has been on http apis, not regular browser stuff.

For example, a cloud portability project with about 3 dozen api
deviants, almost all of which had several custom headers and often a
prefixed encoding scheme for user metadata... maybe a Link header for
pagination, etc.

It might be possible to build some sort of scenario collector for RPC
apis, but it would be much more specific than browser traffic.  E.g.
PUT a blob with user metadata, read it back with HEAD, etc., or 1000x
POST /servers with a shell script encoded in a header, or turn on
debug logging in your Android app and run a scenario that now has 7
extra headers, etc.

Internal RPC use of headers is even less likely to be publicly
inventory-able. E.g. turn on request tracing and watch two
intermediaries add side-channel headers.

I don't think many would doubt that this stuff exists and is
commonplace in apis.  That doesn't change the fact that these
scenarios aren't as easy to inventory and turn into a test database
that we could use for performance simulations.

In other words, if we needed a test database to justify flipping the
indexing back, then we might as well call it game over and move on.

If we agree that api traffic does in fact use a lot of non-standard
headers, and we believe flipping back doesn't maim normal web
browsing, could that be a strong enough case to revert?

As many have pointed out, this part of the implementation is easy, but
I can see how folks would be apprehensive about even a sensible
change, for the reasons Mark clearly summarized.
-A
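To make the indexing difference concrete, here is a minimal Python sketch of the two index layouts discussed further down the thread. The function names are mine, not from any draft; the 61-entry static table size is the one in the current spec.

```python
STATIC_TABLE_LEN = 61  # entries in the spec's static table

def resolve_current(index, dynamic_table):
    # Spec as it stands now: static entries occupy indices 1..61,
    # dynamic entries follow at 62, 63, ...
    if index <= STATIC_TABLE_LEN:
        return ("static", index)
    return ("dynamic", index - STATIC_TABLE_LEN)

def resolve_previous(index, dynamic_table):
    # Earlier drafts: dynamic entries come first, so a static
    # reference shifts as the dynamic table grows and shrinks.
    if index <= len(dynamic_table):
        return ("dynamic", index)
    return ("static", index - len(dynamic_table))
```

With a 7-bit index prefix, values up to 126 fit in one byte, so the current layout gives every static entry a one-byte index but only the first 65 dynamic entries; the previous layout handed those one-byte indices to dynamic entries first. That is the one-byte trade-off Roberto describes below.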



On Tue, Oct 14, 2014 at 10:17 AM, Roberto Peon <grmocg@gmail.com> wrote:
> Sorry for the slow response; been a little sick over here. At least my voice
> sounds cool, not that most of you will have heard it.
>
> My concern comes from an assumption that we'll use HTTP/2 for more than just
> browsing, i.e. that it will see major use for non-web RPC protocols both
> internal to networks, and between networks (i.e. RPCs to/from "the cloud"
> and whatever it evolves to in the future).
> The current compression scheme can end up having twice the amount of bloat
> in terms of headers per request, even when these headers are used with much
> greater frequency than the current set of static headers.
> One byte more overhead is significant when the overhead was typically one
> byte to begin with, and the change didn't come with a huge reduction in
> complexity.
>
>
> On complexity:
> The removal of the reference set removed the lion's share of complexity in
> the previous set of changes.
>
> The changing of the indexing, however, did not remove significant
> complexity.
> The total change in complexity resulting from the change of indexing can be
> summarized as:
>
> Previously:
>   References to the static set are:
>     dynamic_set_len + static_set_offset
>   References to the dynamic set are:
>     dynamic_set_offset
>
> And with the spec as it stands now:
>   References to the static set are:
>       static_set_offset
>   References to the dynamic set are:
>       static_set_len + dynamic_set_offset
>
> So, at worst, we're talking about adding an int instead of adding a const.
> The worst possible impact of this means that one cannot bit-blit a large
> number of static headers-- one must add them one at a time or fix them up.
> I'll bet that I can show that this makes almost zero difference in CPU when
> implemented properly (there is little magical about a bit-blit to begin
> with)-- I'd be shocked if we couldn't do 100s of millions of header-sets per
> second on a single core.
>
>
> With the spec as it stands today w.r.t. indexing (which, again, was not a
> change that we'd agreed on doing), we lose out in flexibility in the long
> term.
> "New" headers become significantly disadvantaged relative to the ones
> currently in there, and such headers are very likely to occur in a number
> of non-web deployments.
> Adding new headers to the static set in the future with the current scheme
> decreases overall dynamic efficiency. It is no longer a low-cost,
> high-reward decision.
>
> -=R
>
>
>
> On Sun, Oct 12, 2014 at 6:23 PM, Willy Tarreau <w@1wt.eu> wrote:
>>
>> Hi Mark,
>>
>> On Mon, Oct 13, 2014 at 11:52:43AM +1100, Mark Nottingham wrote:
>> > Roberto,
>> >
>> > So far, we've had a large number of people -1 any change here.
>> > Summarising:
>> >
>> > * It's been asserted that the current approach is "much simpler" to
>> > implement
>> > * The difference is "marginal"
>> > * There's concern about "churn" in the spec and implementations
>> > * There's concern that the proposals are untested
>> >
>> > You say it's "highly suboptimal", but you don't back this up. Indeed,
>> > we've
>> > long established that the WG is not terribly interested in getting the
>> > *most*
>> > efficient compression available -- especially if it's bought with
>> > complexity.
>>
>> I think the complexity that was lost with the change was the copy from
>> static to dynamic. In fact, swapping static and dynamic was done only to
>> get smaller indexes for the static table, which can now be compressed in
>> one byte, a possibility that was previously offered to dynamic headers
>> only. What we really need to satisfy all users is to be able to encode
>> *most common* static headers with 1 byte, and a number of dynamic headers
>> with 1 byte as well.
>>
>> > As Mike said once long ago, the important part is that we get *some*
>> > compression, and my perception is that there's wide agreement in the WG
>> > on
>> > that point.
>>
>> I also agree with this.
>>
>> > In the face of that, a one-byte overhead for dynamic headers is hard to
>> > characterise as "highly suboptimal."
>>
>> I wouldn't be as categorical as Roberto here, but I understand his point.
>> The previous design allowed headers not belonging to the static table to
>> be efficiently compressed (whatever X-* headers appear inside companies).
>> The new one makes new headers less efficient than the static ones, and I
>> think Roberto sees fewer possibilities for future improvements with this
>> design.
>>
>> > Other folks have discussed / proposed more elaborate changes than Jeff,
>> > but I
>> > detect very little stomach in the WG for doing so.
>> >
>> > At most, I think we're at a point where the most reasonable thing to do,
>> > *if*
>> > we do anything here, would be to revisit the static table and "make
>> > room" for
>> > more dynamic entries by pruning it some, as per
>> > <https://github.com/http2/http2-spec/issues/587>.
>>
>> It's more or less similar to what I suggested, as well as to what I
>> proposed, except that we don't need to *prune* entries if the single-byte
>> encoding covers part of both tables.
>>
>> > However, as discussed before, we'd need to see broad support for such a
>> > change; so far, we've held that #587 will only happen if we're making
>> > other
>> > breaking changes.
>>
>> The problem when there are implementations available is that developers
>> know what it means to have to modify them: modify both encoders and
>> decoders in multiple products, plan certain outages, etc. The problem is
>> that we need to be careful about future users, not early adopters, and
>> all of us have to think about how efficient the protocol will be in 1-10
>> years, not just right now, in implementations that will inevitably evolve
>> over the following months.
>>
>> I personally think that single-byte encoding of *some* dynamic headers
>> has real value, and even if people are not pleased with revisiting their
>> code, we should do something about it.
>>
>> Best regards,
>> Willy
>>
>
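For reference, the one-byte limits discussed throughout the thread fall out of HPACK's prefix-integer encoding. A minimal sketch of just the integer coder (not a full HPACK implementation):

```python
def encode_integer(value, prefix_bits):
    # HPACK prefix-integer encoding: values smaller than the prefix
    # maximum fit in the prefix itself; larger values fill the prefix
    # and spill into continuation bytes carrying 7 bits each.
    max_prefix = (1 << prefix_bits) - 1
    if value < max_prefix:
        return bytes([value])
    out = [max_prefix]
    value -= max_prefix
    while value >= 128:
        out.append((value % 128) | 128)
        value //= 128
    out.append(value)
    return bytes(out)
```

An indexed header field uses a 7-bit prefix, so indices 1 through 126 cost one byte and anything higher costs two -- which is why putting the 61-entry static table first leaves only about 65 one-byte slots for dynamic entries.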

Received on Wednesday, 15 October 2014 04:22:48 UTC