Re: How big can the protocol elements in HTTP be?

Hi Martin,

On Fri, Oct 25, 2024 at 09:44:40PM +1100, Martin Thomson wrote:
> I'm in the process of writing yet another implementation of RFC 9292 (this
> time with incremental decoding) and I realized that in this setting I need to
> start caring about size limits in a more granular fashion.  It's one thing to
> parse a block of memory that you are given, but it's a different matter if
> you are dealing with streams of indeterminate length.

As usual, unfortunately: many of us ended up with tunables to raise the
limits beyond what's commonly found, just for rare special cases.

> We recommend that URLs of at least 8000 characters are accepted [1], so it
> might be reasonable to limit that to 8k.  But then, in HTTP/2 and HTTP/3 and
> RFC 9292, the URL is split.
> 
> How big can a URI scheme be?  RFC 3986 doesn't say.  8k for that seems a bit
> much.  The registry has some long ones though, so maybe I could be
> conservative and say 256 bytes.

I'd argue that individual components made of a single word should be
writable on their own line in the spec. If we can't write a method or
a scheme on a line of 72 characters or so, then for sure there's a
problem waiting to happen.
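
To make that concrete, here is a minimal sketch (Python, purely
illustrative, not taken from any real implementation) of how an
incremental decoder could enforce such a cap per component. The 72-byte
default is just the line-length rule of thumb above, and the names
BoundedField and FieldTooLarge are hypothetical.

```python
class FieldTooLarge(Exception):
    """Raised when a protocol element exceeds its configured cap."""


class BoundedField:
    """Accumulates one protocol element (e.g. a method or scheme) from a stream.

    Hypothetical helper: the limit is checked on every chunk, so an
    oversized element is rejected before it is fully buffered.
    """

    def __init__(self, name: str, limit: int = 72) -> None:
        self.name = name
        self.limit = limit
        self._buf = bytearray()

    def feed(self, chunk: bytes) -> None:
        # Refuse the chunk if accepting it would go past the cap.
        if len(self._buf) + len(chunk) > self.limit:
            raise FieldTooLarge(f"{self.name} exceeds {self.limit} bytes")
        self._buf += chunk

    def value(self) -> bytes:
        return bytes(self._buf)
```

The limit stays a constructor argument so that it remains a tunable for
the rare special cases.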

> How big can an authority be?  I know that hostnames aren't domain names, but
> maybe I don't care about other types of name.  Domain names can't be more
> than 255 bytes and a port number can always fit in 5 bytes, so is the limit
> 261?  Or does the prospect of having to carry IDN lead to a need for more
> than that in corner cases?

I think it depends on the protocol and address family. If we consider
that HTTP is standardized over TCP and QUIC on IPv4 and IPv6, then host
names are longer than the longest numeric address, so they are what sets
the limit. Unix sockets are used quite a bit, though they generally don't
appear in URIs (hence not in authorities), and their paths can be up to
108 or 255 chars, so that's similar. Maybe other address families exist
between applications, I don't know.
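
As a rough check of that arithmetic, here is a small sketch in Python.
The 255 and 5 figures come from the quoted text; the 45-character IPv6
length is my own assumption (longest textual form, without a zone ID),
and userinfo is ignored entirely. All names are hypothetical.

```python
# Assumptions (not normative): figures from the discussion above plus
# the longest textual IPv6 form; zone IDs and userinfo are ignored.
MAX_DOMAIN_NAME = 255                 # upper bound for a domain name, per the quoted text
MAX_IPV6_TEXT = 45                    # longest textual IPv6 form (IPv4-mapped notation)
MAX_IP_LITERAL = MAX_IPV6_TEXT + 2    # "[" and "]" around the address in a URI
MAX_PORT_DIGITS = 5                   # "65535"

# The host name is longer than any numeric literal, so it sets the limit.
MAX_HOST = max(MAX_DOMAIN_NAME, MAX_IP_LITERAL)
MAX_AUTHORITY = MAX_HOST + 1 + MAX_PORT_DIGITS   # host ":" port


def authority_fits(authority: bytes) -> bool:
    """Return True if the authority is within the computed cap."""
    return len(authority) <= MAX_AUTHORITY
```

Under these assumptions the cap comes out at 261 bytes, i.e. the value
you suggested.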

> The path seems like an easy choice: 8k.
> 
> How big can a method be?  RFC 9110 doesn't say.  Even 32 bytes seems generous
> when the registry never hits 20, but is that really what we want?

Same as above: a method should fit on a single line.

> Should minimums for these be standardized, or are we all comfortable with rolling the dice?

I've long been annoyed by the lack of minimums, but I'd say that things
have settled down quite cleanly over the years. When haproxy had a 4kB
buffer to hold a whole request, it was sufficient except in a few very
rare cases. Then it was increased to 8kB because some users started to
have large cookies. Then we raised the minimum to 16kB for H2. But even
at 8kB for the whole request (method+uri+version+headers), it was
extremely rare to see a user change the size (for example those dealing
with Kerberos tickets, or bogus cookies that kept appending to
themselves on each round trip).

Also, forcing a minimum will never work well with IoT and scripts. I've
seen HTTP servers running in super small devices, that's not pretty...
I think the current status quo, which consists of looking at what the
neighbor is doing and applying the same, tends to work reasonably well
in the end, and allows everyone to adjust independently of others so
that there's no "big bang" the day we need to raise a limit again.

Just my two cents,
Willy

Received on Friday, 25 October 2024 14:38:21 UTC