- From: Willy Tarreau <w@1wt.eu>
- Date: Mon, 25 Sep 2023 07:26:05 +0200
- To: Glenn Strauss <gs-lists-ietf-http-wg@gluelogic.com>
- Cc: Martin Thomson <mt@lowentropy.net>, Lucas Pardue <lucaspardue.24.7@gmail.com>, HTTP Working Group <ietf-http-wg@w3.org>
Hi Glenn,

On Sun, Sep 24, 2023 at 11:33:03PM -0400, Glenn Strauss wrote:
> I am all for clarification. However, I ask that we please avoid
> overspecification of implementation unless there is both really good
> reason and confidence that the specified implementation is the
> one-right-implementation. If not, and there are security concerns
> with possible implementer mistakes, the RFC should instead call those
> out in Security Considerations to highlight concerns for implementers.

Agreed!

> Speaking as an implementer of the specification, I was able to reuse
> much of lighttpd's HTTP header parsing and security policy code for
> both HTTP/1.x and HTTP/2.

It was the same for our very first H2 implementation in haproxy; in fact
it basically converted H2 to H1 and all HTTP processing was done there.
Only later did we implement an internal representation that also conveys
semantics, and now H1/H2/H3 differ at a much lower layer.

> My understanding is that I could treat all HTTP request header errors
> received as HTTP/2 requests as h2 stream error PROTOCOL_ERROR.

That's fine, but not everyone has this option. H2 is special in that
it mixes transport and representation of semantics. At the same layer
you can have framing issues (e.g. SETTINGS on a stream) and semantic
issues (e.g. what if you find an LF character in a :authority header).
As previously said, some implementations might be unable to produce a
valid HTTP error, because these errors need to be decorated with headers
reporting a unique request ID or whatever, and no HTTP request was ever
produced there.

I would even go as far as suggesting that certain subtle H2 errors
should not be allowed to produce HTTP contents. Imagine for example
that you receive a DATA frame on a new stream. The spec says that you
must respond with a stream error. Given that the stream was never
really created (no HEADERS frame to create it), it could possibly be
problematic to send a 400 response there. E.g.
imagine the following sequence:

      client                          server
        DATA (id=1)    ----------->
                       <----------  :status 400
        HEADERS (id=1) ----------->
                       <----------  :status 200

It might very well be possible that the 400 is taken as the response to
the HEADERS frame. Sending RST_STREAM on framing issues like this avoids
such problems. However I consider that any error that is detected after
HTTP decoding could be eligible for a 400.

> If that
> is what is desired by the RFC authors, please issue an update or errata
> and I'll change that code in lighttpd.
>
>
> However, I currently am unable to understand why h2 PROTOCOL_ERROR
> (or H3_GENERAL_PROTOCOL_ERROR) is somehow better than 400 Bad Request.
>
> Here is another reason why I think an HTTP error code may be preferable:
> An HTTP/1.0 client might make a request to a proxy which makes an HTTP/2
> or HTTP/3 request. 400 Bad Request is an application level code which
> should be transmitted all the way back to the client. An h2 stream
> error PROTOCOL_ERROR sent to an intermediary might not make it back to
> the HTTP/1.0 client in a form that conveys the error clearly to the
> end-user. (One may argue that the intermediary should have detected
> the malformed request, but the origin server might implement a stricter
> security policy and is permitted to return PROTOCOL_ERROR.)

That's a good point. If the intermediary is certain that it does not
produce bad frames, it may consider that in this case the server rejected
its request on the grounds of the HTTP contents, and it could very well
translate this RST_STREAM(protocol_error) into an HTTP/1.0 400 bad req.

> I can see where h2 specification around pseudo-headers is specific to
> the h2 protocol. However, an implementation "could" HPACK-decode the
> entire HEADERS frame into a single string that looks almost identical
> to HTTP/1.1 request headers with the exception of the addition of
> pseudo-headers. It could then be loosely parsed as HTTP/1.x request
> headers.
> An implementation might need to fully parse the HEADERS frame
> before being able to determine that a required pseudo-header is missing.
> If it has already gotten this far parsing the HTTP request, why should
> the RFC disallow the implementation from returning an HTTP error code?

I generally agree with you, and it matches what I mentioned earlier: we
should make the effort of reporting at the highest possible level. You'll
often find that when you add H2 to an existing H1 server, your H2 stack
looks almost like an intermediary there, so you could consider to some
extent that it blindly passes what it decoded because it trusts the
server to do the necessary HTTP checks. The server, which is in fact the
application code, will likely produce 400 more often than the H2 layer
will produce RST. But at least framing errors that cannot safely be
recovered from (like the example above) should definitely lead to an RST
(and sometimes even a connection error).

> Aside: I fully agree that if an intermediary is going to rewrite an h2
> request to HTTP/1.x, then it should more strictly validate the h2
> protocol requirements before rewriting the request to HTTP/1.x or else
> there might be ways to slip unexpected characters through the protocol
> translation.

Absolutely, but it's also when you develop intermediaries that you hear
"after I deployed your stuff here my application stopped working", and
when you look closer, you discover dangerous chars in header fields and
stuff like that. Thus the amount of checks has to be... adaptable!

Cheers,
willy
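
The split discussed in this message (RST_STREAM for framing problems, a
400 for errors found after HTTP decoding, and an intermediary mapping an
upstream reset back to a status for an HTTP/1.x client) could be sketched
roughly as below. This is only an illustration under stated assumptions,
not haproxy or lighttpd code: the names ErrorLayer, Reaction,
react_to_error and status_for_h1_client are hypothetical, and the 502
fallback is an assumption that the message itself does not mention.

```python
from enum import Enum, auto


class ErrorLayer(Enum):
    FRAMING = auto()  # found before any HTTP request exists (e.g. DATA on a new stream)
    HTTP = auto()     # found after header decoding (e.g. LF in :authority)


class Reaction(Enum):
    RST_STREAM = auto()  # h2 stream error such as PROTOCOL_ERROR
    HTTP_400 = auto()    # application-level 400 Bad Request


def react_to_error(layer: ErrorLayer, headers_seen: bool) -> Reaction:
    """Choose how a server reacts to a bad H2 stream (hypothetical helper)."""
    # Without a HEADERS frame the stream never carried a request, so a 400
    # could later be mistaken for the response to a real request.
    if layer is ErrorLayer.FRAMING or not headers_seen:
        return Reaction.RST_STREAM
    # The request was decoded far enough to be answered at the HTTP level.
    return Reaction.HTTP_400


def status_for_h1_client(upstream_reset_code: str, proxy_framing_trusted: bool) -> int:
    """Map an upstream stream reset to a status for an HTTP/1.x client."""
    # If the proxy is certain its own frames were valid, the reset most
    # likely means the origin rejected the HTTP contents of the request.
    if upstream_reset_code == "PROTOCOL_ERROR" and proxy_framing_trusted:
        return 400
    # Assumption (not from the message): otherwise report a gateway error.
    return 502
```

With this sketch, a DATA frame arriving on a stream that never saw HEADERS
yields RST_STREAM, while a bad character found in a decoded header field
yields a 400.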
Received on Monday, 25 September 2023 05:26:29 UTC