- From: Glenn Strauss <gs-lists-ietf-http-wg@gluelogic.com>
- Date: Sun, 24 Sep 2023 23:33:03 -0400
- To: Martin Thomson <mt@lowentropy.net>
- Cc: Lucas Pardue <lucaspardue.24.7@gmail.com>, HTTP Working Group <ietf-http-wg@w3.org>
On Mon, Sep 25, 2023 at 12:30:08PM +1000, Martin Thomson wrote: > On Mon, Sep 25, 2023, at 11:25, Lucas Pardue wrote: > > For the content-length matter I mentioned above, my implementation > > generates a 4xx in H2 and H3. I don't plan to change that. > > Content-Length has always been a bit of an awkward fit in the specification. Maybe you are right to say that you want to signal that error at the HTTP layer, but would you say the same if you received a SETTINGS frame on a request stream or one of the many other errors specified in the RFC? > > It would be best if these sorts of divergences from what we specified were documented in the specification, don't you think? I am all for clarification. However, I ask that we please avoid overspecification of implementation unless there is both really good reason and confidence that the specified implementation is the one-right-implementation. If not, and there are security concerns with possible implementer mistakes, the RFC should instead call those out in Security Considerations to highlight concerns for implementers. Speaking as an implementer of the specification, I was able to reuse much of lighttpd's HTTP header parsing and security policy code for both HTTP/1.x and HTTP/2. Once end of headers is read in HTTP/1.1 ("\r\n\r\n") and once HEADERS (and any CONTINUATION frames) have been received for HTTP/2, much of the lighttpd request header parsing code is reused to parse headers into internal data structures. For HTTP/1.1, the request headers string is broken on "\r\n". For HTTP/2, each element of HEADERS is HPACK-decoded. Then, for each header, lighttpd applies a number of configurable security policies. For HTTP/2, there are additional checks for required pseudo-headers, that pseudo-headers precede non-pseudo-headers, and that header field names are all lowercase. For any failures during initial HTTP request header parsing, lighttpd issues an HTTP error code, often HTTP error code 400 Bad Request. My understanding is that I could treat all HTTP request header errors received as HTTP/2 requests as h2 stream error PROTOCOL_ERROR. If that is what is desired by the RFC authors, please issue an update or errata and I'll change that code in lighttpd. However, I currently am unable to understand why h2 PROTOCOL_ERROR (or H3_GENERAL_PROTOCOL_ERROR) is somehow better than 400 Bad Request. Here is another reason why I think an HTTP error code may be preferable: An HTTP/1.0 client might make a request to a proxy which makes an HTTP/2 or HTTP/3 request. 400 Bad Request is an application level code which should be transmitted all the way back to the client. An h2 stream error PROTOCOL_ERROR sent to an intermediary might not make it back to the HTTP/1.0 client in a form that conveys the error clearly to the end-user. (One may argue that the intermediary should have detected the malformed request, but the origin server might implement a stricter security policy and is permitted to return PROTOCOL_ERROR.) I can see where h2 specification around pseudo-headers is specific to the h2 protocol. However, an implementation "could" HPACK-decode the entire HEADERS frame into a single string that looks almost identical to HTTP/1.1 request headers with the exception of the addition of pseudo-headers. It could then be loosely parsed as HTTP/1.x request headers. An implementation might need to fully parse the HEADERS frame before being able to determine that a required pseudo-header is missing. If it has already gotten this far parsing the HTTP request, why should the RFC disallow the implementation from returning an HTTP error code? Aside: I fully agree that if an intermediary is going to rewrite an h2 request to HTTP/1.x, then it should more strictly validate the h2 protocol requirements before rewriting the request to HTTP/1.x or else there might be ways to slip unexpected characters through the protocol translation. Cheers, Glenn
Received on Monday, 25 September 2023 03:33:21 UTC