Re: Structured Headers: length limits

On Tue, May 15, 2018 at 08:19:49AM +1000, Martin Thomson wrote:
> This seems like an improvement.
> 
> I think that the underlying concern is that the amount of memory a
> recipient has to allocate is essentially unbounded by the underlying
> protocol.  In practice however, limits are much tighter.

Definitely!

> Looking at the draft, you have changed in the following ways:
> 
> 1. to note that there is (probably) a limit on the encoded size of header
> fields, but not to say what it is.  Here it would be good to establish some
> expectations about a size that is "guaranteed" to work.  That might be
> unrealistic, but it would be consistent with your other
> statements/recommendations regarding minimums.

In fact, to guarantee that a given size works, we need to specify more
values. For example, haproxy by default can receive 15 kB of headers
(16 kB buffer size minus 1 kB reserved for adding extra fields). In
practice, I normally set it to 7+1 kB and have never faced any issue
except in places where bogus cookies are added in loops. So these 7 kB
act both as a limit on the total header size and on each individual
header field, including the request/status line. I remember Apache 1.3
(I didn't check newer versions) used to support 100 header lines plus
one request/status line, each line up to 8 kB. So here we already see
a very different model: by default Apache supports shorter header
fields, but the sum can be larger. If we take the intersection between
the two, we end up at ~8 kB total and ~8 kB per header field (which is
why I set haproxy to 8 kB, since Apache is everywhere and more or less
defines what must be supported).
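
To illustrate, here is a minimal sketch of a recipient enforcing both
limits at once, using the ~8 kB intersection above (purely
illustrative, not haproxy's or Apache's actual code):

    /* Hypothetical sketch: enforce a per-line and a total limit while
     * parsing header lines. The 8 kB values are the intersection
     * discussed above, not anyone's real defaults. */
    #include <stdbool.h>
    #include <stddef.h>

    #define MAX_LINE_LEN   8192  /* per line, request/status line included */
    #define MAX_TOTAL_LEN  8192  /* sum of all header lines */

    struct hdr_budget {
        size_t total;            /* octets consumed so far */
    };

    /* Refuse a line of <len> octets if it would exceed either limit. */
    static bool hdr_accept_line(struct hdr_budget *b, size_t len)
    {
        if (len > MAX_LINE_LEN || b->total + len > MAX_TOTAL_LEN)
            return false;
        b->total += len;
        return true;
    }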

I think that in order to set realistic limits we need to take into
consideration the resources allocated on the recipient, because
ultimately this is what drives implementations to set hard limits. I
really like what was done for HPACK, which consists in counting each
entry's size with some overhead added for internal use, and checking
the total size. The simple fact that the exact limit is not easy to
determine on the sender's side is a good thing, because it encourages
ignorant senders to stay away from the limit. For example, if we say
that no single header line may be longer than 8 kB and the total
header size cannot be longer than 16 kB, with a few bytes counted per
line and a few extra bytes counted in total, it becomes clear that
it's pointless to try to serialize exactly 8 kB on a single line.
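
For reference, HPACK (RFC 7541, section 4.1) charges each table entry
its name length plus its value length plus a fixed 32-octet overhead.
A sketch of that accounting style against an illustrative 16 kB
budget:

    /* HPACK-style accounting: name length + value length + 32 octets
     * of per-entry overhead, summed and checked against a total
     * budget. The 16 kB figure is the illustrative one from above. */
    #include <stdbool.h>
    #include <stddef.h>

    #define PER_ENTRY_OVERHEAD  32     /* fixed per-entry cost, as in HPACK */
    #define TOTAL_BUDGET        16384  /* illustrative total budget */

    static bool account_field(size_t *total, size_t name_len,
                              size_t value_len)
    {
        size_t entry = name_len + value_len + PER_ENTRY_OVERHEAD;

        if (*total + entry > TOTAL_BUDGET)
            return false;              /* would exceed the budget */
        *total += entry;
        return true;
    }

With such a scheme a sender cannot compute exactly how many raw octets
fit, which is precisely what keeps it away from the limit.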

Also, haproxy sets a limit on the header field name's length. I don't
remember what it is, probably something around 63 or 255 characters,
depending on the number of bits used for this in the index. In
practice this limit has never been hit, but it definitely is an
implementation-specific limit. I remember seeing some CGIs fail a
long time ago because header fields were turned into environment
variables whose names were too long, for example!
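
Such a limit usually falls out of the storage layout rather than being
a deliberate choice, e.g. (hypothetical layout, not haproxy's real
index):

    /* Hypothetical index entry: the name length lives in a fixed-width
     * bit field, so names are capped at 255 octets with 8 bits, or at
     * 63 octets with 6 bits. */
    #include <stdint.h>

    struct hdr_idx_entry {
        uint32_t offset   : 18;  /* position of the field in the buffer */
        uint32_t name_len : 8;   /* caps header field names at 255 */
    };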

> 2. to establish minimums for the size of structures.  This is a fine
> addition, because the addition of structure to a header field can add to
> the memory commitment a recipient has to make.  Establishing an expectation
> that strings can be 1024 octets is good, for example.

In fact, one thing that made HTTP succeed is that everyone applies
their own limits depending on local requirements. Maybe we could
define a few "profiles" with a set of reasonable values. HTTP
messages observed in enterprise environments are much larger than the
ones you see at home, with many large cookies and authentication
tokens for example. Those you see from your browser when connecting
to the net are much larger than the ones used by scripts, bots or IoT
devices, where storing 1 kB for a string might already appear huge.
Maybe 3 profiles would be enough: application, internet, iot.
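
Sketched as a table of presets (all numbers purely illustrative, not
proposed values):

    /* Hypothetical profile presets; every figure is illustrative. */
    #include <stddef.h>

    struct sh_limits {
        const char *profile;
        size_t max_string;  /* octets per string item */
        size_t max_field;   /* octets per header field */
        size_t max_total;   /* octets for the whole header block */
    };

    static const struct sh_limits profiles[] = {
        { "application", 4096, 16384, 65536 },  /* enterprise traffic */
        { "internet",    1024,  8192, 16384 },  /* browser traffic */
        { "iot",          256,  1024,  4096 },  /* constrained devices */
    };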

Just my two cents,
Willy
