- From: Ryan Sleevi <ryan-ietf@sleevi.com>
- Date: Wed, 3 Feb 2021 14:00:23 -0500
- To: Willy Tarreau <w@1wt.eu>
- Cc: Ryan Sleevi <ryan-ietf@sleevi.com>, Martin Thomson <mt@lowentropy.net>, Poul-Henning Kamp <phk@phk.freebsd.dk>, "ietf-http-wg@w3.org Group" <ietf-http-wg@w3.org>
- Message-ID: <CAErg=HHYGYzWS6K-LS1i+An9B=_BP-PjCTmOS-7kdeXRPWvOzg@mail.gmail.com>
On Wed, Feb 3, 2021 at 1:17 PM Willy Tarreau <w@1wt.eu> wrote:

> > Do you have other examples?
>
> Examples of what, trailers in real use ? I seldom see them in user reports.
>

<snip>

Yes, although this matches my experiences as well. Granted, we see a lot more poorly written intermediaries (which is the longer way of spelling antivirus), and so perhaps it's not surprising that trailers aren't as widely used when transiting multiple networks/administrative zones.

<snip>

> > I do have to agree with PHK here: this sort of merging is a state machine
> > security nightmare, especially when thinking about the interaction with
> > resources, caching, and the overall semantics of HTTP.
>
> For me, the main ennemy of trailers is the fact that they were considered
> as the part of the same namespace as headers, which is what causes this
> merging nightmare. But just like some fields are forbidden in trailers, we
> could state that they ought to be ignored in headers (in case of merging
> by an intermediary), and that's why I really do not want to see both use
> the same name. Once you stick to this, there's no merging nightmare nor
> security issue anymore: if the trailer is present, you know what to cache.
> If it's absent (and being in the header is counted as absent), you don't
> cache, period. It's only suboptimal. But suboptimality is what encourages
> improvements in products. Breakage encourages fragmentation.
>

I'm not sure this is entirely correct, in the example you describe below (and more response there). That is, I think you've described a scenario where they're still part of the same logical namespace, in practice, even if they're not meant to be.

<snip>

> > but I
> > wouldn't be terribly excited for trailers precisely because it necessitates
> > careful re-review of every state machine, end to end, to make sure new
> > issues and surprises aren't introduced by such semantics. So even if, in
> > the abstract, it's good and useful, that sort of complexity may preclude
> > implementation.
>
> What class of issue would you envision with a field which only has semantics
> in trailers and which must be ignored in headers ? I mean, say the server
> emits this:
>
> HTTP/1.1 200 OK
> Transfer-encoding: chunked
> Cache-control: no-cache; trailers
>
> b
> 0123456789
> 0
> Cache-Post-Body-Status: public; max-age=86400
>
> It could be relayed as-is by compliant intermediaries. It could be relayed
> like this by those compliant as well but which merge trailers and headers:
>
> HTTP/1.1 200 OK
> Transfer-encoding: chunked
> Cache-control: no-cache; trailers
> Cache-Post-Body-Status: public; max-age=86400
>
> b
> 0123456789
> 0
>
> In this case the response is not cached. Transfer-encoding could even be
> translated to content-length by the way, the principle remains. The trailer
> could also be silently dropped on the path:
>
> HTTP/1.1 200 OK
> Transfer-encoding: chunked
> Cache-control: no-cache; trailers
>
> b
> 0123456789
> 0
>
> It wouldn't be cached either.
>
> Then there are those which choke on trailers because they stop after 0 CRLF,
> they wouldn't cache either. What I like with this approach is that a degraded
> message cannot be restored later in the chain to become correct again. This
> wouldn't be the case if using the same field name in both parts.
>

From an implementation state machine complexity, this ends up still being messy. For example, in Chrome at least, we use a multi-process architecture such that there is a network process, a browser process, and a renderer process. The renderer process requests a resource from the network process, and that result is then sent back via IPC through a fixed-size circular buffer.

Using your example, the header says "don't cache", the trailer says "do cache".
Currently, Chrome's implementation uses the header-defined value to determine whether it needs to "tee" (in the *nix sense) the response to both the disk cache and the renderer process, or whether to send it straight through to the renderer process. In order to ensure backpressure is properly handled, if the disk IO of the cache activity slows, it naturally slows the transmission rate to the renderer process. In effect, we only have a fixed amount of memory in use by a resource at a time.

Using your example, the need to cache after the fact in the trailer wouldn't be possible, not without buffering the entire response either in memory or through some temporary file, in order to know what to do with the response. The same challenge applies in the inverse (where the header says cache and the trailer says don't cache), which would have us invalidating a cache entry even though it wasn't necessary or appropriate for the semantics.

That's why I said above that it sounds like they're still in the same namespace, in as much as it allows trailers to affect the semantics of headers, and vice-versa, even if their naming isn't identical. This sort of problem equally exists with the "body first, headers-as-trailers" suggestion.

These aren't just "correctness" issues, but we view them as security relevant, because they can allow for things like denial of service (via resource exhaustion), or interesting attacks in cross-origin timing scenarios (through probing of any dynamic limits used to mitigate the exhaustion issue).

From a server security standpoint, a number of services draw security boundaries between "headers" (which should be trusted/controlled by server admin) and "bodies" (which can be controlled by hosted code/untrusted parties), and the introduction of "trailers" and any semantics would have to try to preserve/respect those assumptions.
That is, trailers that redefine the semantics of headers, even if separately named, would and could be abused, which equally makes trailer bits unexciting.

I agree it's better for intermediates to use the semantics you described. That feels similar to the set of mitigations adopted by HSTS/HPKP - namely, that they only accepted the 'first' occurrence of a header and relied on the headers not being mergeable, in order to try to preserve the separation I mentioned above. But it doesn't feel terribly exciting from an implementation perspective, and feels like it could easily lead to new classes of security bugs if generally available.

This isn't so much a "hard oppose" to the use case; I think cases like gRPC show that "novel" uses of the HTTP/2 framing can and do exist, and at a protocol level, these can enable new and interesting cases. Even the Fastly example makes general sense for why, and I'm not trying to dismiss that. However, for the general resource semantics and loading, it feels like there is the possibility of years of security surprises in store, and so it's not a particularly exciting or likely thing to implement in a general-purpose client that tries to be security-opinionated.
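To make the buffering point about the "header says don't cache, trailer says do cache" case concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Chrome's code: the function names, the simplified Cache-Control check, and the list standing in for the disk cache are all hypothetical. It contrasts a relay that decides at header time whether to tee (holding one chunk at a time) with one that must hold the whole body in order to honour a caching decision that only arrives in the trailer.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def header_allows_caching(headers: Dict[str, str]) -> bool:
    """Hypothetical, simplified policy: no-cache/no-store means "do not tee"."""
    value = headers.get("cache-control", "").lower()
    return "no-cache" not in value and "no-store" not in value


def relay_streaming(
    headers: Dict[str, str],
    chunks: Iterable[bytes],
    trailers: Dict[str, str],
) -> Tuple[List[bytes], Optional[bytes], int]:
    """Decide at header time whether to tee; hold one chunk at a time.

    Returns (chunks delivered to the consumer, cached body or None,
    peak number of body bytes held by the relay at once).
    """
    tee = header_allows_caching(headers)
    delivered: List[bytes] = []
    cache_parts: List[bytes] = []  # stands in for the disk cache
    peak = 0
    for chunk in chunks:
        peak = max(peak, len(chunk))
        if tee:
            cache_parts.append(chunk)  # written out while streaming
        delivered.append(chunk)
    # `trailers` only becomes available here, after every chunk has been
    # forwarded. If tee was False the body is gone, so a trailer saying
    # "cache" cannot be honoured; it is deliberately ignored.
    return delivered, (b"".join(cache_parts) if tee else None), peak


def relay_buffering(
    headers: Dict[str, str],
    chunks: Iterable[bytes],
    trailers: Dict[str, str],
) -> Tuple[List[bytes], Optional[bytes], int]:
    """Honour a trailer-only caching decision by holding the whole body."""
    held: List[bytes] = []
    held_bytes = 0
    for chunk in chunks:
        held.append(chunk)
        held_bytes += len(chunk)  # grows with the size of the response
    # Hypothetical trailer field name taken from the quoted example.
    status = trailers.get("cache-post-body-status", "").lower()
    cached = b"".join(held) if "public" in status else None
    return held, cached, held_bytes


if __name__ == "__main__":
    hdrs = {"cache-control": "no-cache; trailers"}
    trl = {"cache-post-body-status": "public; max-age=86400"}
    body = [b"01234", b"56789"]
    print(relay_streaming(hdrs, body, trl))
    print(relay_buffering(hdrs, body, trl))
```

In the streaming variant the amount of body data held is bounded by the chunk size, but the trailer arrives too late to change the outcome. In the buffering variant the trailer can be honoured, at the cost of holding an amount of data that grows with the response, which is the resource-exhaustion concern raised above.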
Received on Wednesday, 3 February 2021 19:00:51 UTC