- From: Willy Tarreau <w@1wt.eu>
- Date: Mon, 16 Sep 2024 08:50:18 +0200
- To: ietf-http-wg@w3.org
Hello!

Over the last 3 months, we've received two different reports of H2
interoperability issues between haproxy and an origin server, the first
being Jetty and the second apparently being Apache Traffic Server (both
reported in the same issue: https://github.com/haproxy/haproxy/issues/2592).

The concern was about H2 requests in origin form ("lack of an :authority
pseudo header field in requests") sent by haproxy to an origin server,
in a setup where haproxy is an edge gateway between the internet and the
origin server and receives HTTP/1 requests. The scenario was the
following one:

            HTTP/1.1            HTTP/2
   client ----------> haproxy ---------> origin

When we wrote our H2 implementation, we tried to follow the spec as
closely as possible. At the time it was RFC7540, which explicitly stated
in section 8.1.2.3:

  | The ":authority" pseudo-header field includes the authority
  | portion of the target URI ([RFC3986], Section 3.2). The authority
  | MUST NOT include the deprecated "userinfo" subcomponent for "http"
  | or "https" schemed URIs.
  |
  | To ensure that the HTTP/1.1 request line can be reproduced
  | accurately, this pseudo-header field MUST be omitted when
  | translating from an HTTP/1.1 request that has a request target in
  | origin or asterisk form (see [RFC7230], Section 5.3). Clients
  | that generate HTTP/2 requests directly SHOULD use the ":authority"
  | pseudo-header field instead of the Host header field. An
  | intermediary that converts an HTTP/2 request to HTTP/1.1 MUST
  | create a Host header field if one is not present in a request by
  | copying the value of the ":authority" pseudo-header field.

Thus it's clearly forbidden to forge a :authority from Host, for
example, since pseudo headers are meant to reflect the components of the
request line (or the status line for responses). Doing so would
transform requests in origin form into absolute form.
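
To illustrate the RFC7540 rule quoted above, here is a minimal Python
sketch of the HTTP/1.1 to HTTP/2 translation. It is not haproxy's code:
the function name and the default_scheme parameter are hypothetical
(an origin-form request line carries no scheme, so the gateway has to
supply one), and CONNECT and other corner cases are left out.

```python
from urllib.parse import urlsplit


def h1_to_h2_pseudo_headers(method, target, default_scheme="https"):
    """Map an HTTP/1.1 request line to HTTP/2 pseudo-headers.

    Hypothetical helper following RFC 7540 section 8.1.2.3.
    default_scheme is an assumption: origin and asterisk forms
    do not carry a scheme on the request line.
    """
    if target == "*" or target.startswith("/"):
        # Asterisk or origin form: ":authority" MUST be omitted so that
        # the HTTP/1.1 request line can be reproduced accurately. The
        # Host field travels as a regular header field (not shown here).
        return [
            (":method", method),
            (":scheme", default_scheme),
            (":path", target),
        ]

    # Absolute form: the request line itself carries the authority.
    parts = urlsplit(target)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # Drop the deprecated "userinfo" subcomponent, if any.
    authority = parts.netloc.rpartition("@")[2]
    return [
        (":method", method),
        (":scheme", parts.scheme),
        (":authority", authority),
        (":path", path),
    ]
```

With this sketch, "GET /index.html" yields no :authority at all, while
"GET http://example.com/index.html" yields :authority = example.com,
which is exactly the distinction the two origin servers tripped over.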

To me, the newer version of the specs (RFC 911x) still says this, but it
may be less obvious since it is split over several documents, and I
suspect that the lack of an explicit "MUST be omitted" statement like
the one above might be why we're only receiving such reports now:

  - RFC9113 #8.3.1 says:

     | The ":authority" pseudo-header field conveys the authority
     | portion (Section 3.2 of [RFC3986]) of the target URI
     | (Section 7.1 of [HTTP]). The recipient of an HTTP/2 request
     | MUST NOT use the Host header field to determine the target URI
     | if ":authority" is present.

    At this point we start to see that :authority might be missing.

     | Clients that generate HTTP/2 requests directly MUST use the
     | ":authority" pseudo-header field to convey authority
     | information, unless there is no authority information to convey
     | (in which case it MUST NOT generate ":authority").

    Same here.

     (...)
     | An intermediary that forwards a request over HTTP/2 MUST
     | construct an ":authority" pseudo-header field using the
     | authority information from the control data of the original
     | request, unless the original request's target URI does not
     | contain authority information (in which case it MUST NOT
     | generate ":authority").

    Same here. But below:

     | Note that the Host header field is not the sole source of
     | this information; see Section 7.2 of [HTTP].

    This one seems to imply that Host might possibly be used to
    construct :authority, which, if true, would contradict RFC7540 above.

  - RFC9110 #6.2 about control data clearly says:

     | In HTTP/1.1 ([HTTP/1.1]) and earlier, control data is sent as
     | the first line of a message. In HTTP/2 ([HTTP/2]) and HTTP/3
     | ([HTTP/3]), control data is sent as pseudo-header fields with a
     | reserved name prefix (e.g., ":authority").

    So here there's no ambiguity in my opinion.

For someone who already knows the rule stated in 7540, I think that the
text above remains consistent with it.
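
On the receiving side, the first RFC9113 #8.3.1 excerpt above could be
sketched as follows. This is only an illustration under assumptions: the
function name and the dict-based representation of the fields are
hypothetical, and the fallback to Host when :authority is absent
reflects the reading defended in this message rather than a quoted
requirement.

```python
def effective_authority(pseudo_headers, headers):
    """Return the authority a recipient would use, or None.

    Hypothetical helper. pseudo_headers maps pseudo-header names to
    values; headers maps lowercase regular field names to values.
    """
    authority = pseudo_headers.get(":authority")
    if authority is not None:
        # ":authority" is present: the Host header field MUST NOT be
        # used to determine the target URI (RFC 9113 section 8.3.1).
        return authority
    # ":authority" is absent, e.g. an origin-form HTTP/1.1 request
    # relayed over HTTP/2. Assumption: fall back to Host, if any.
    return headers.get("host")
```

A recipient written this way accepts both forms: it never lets Host
override :authority, and it does not treat a missing :authority as a
malformed request.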

But I think that the note about Host can cause some confusion for those
who don't notice RFC9110#6.2, as it would imply that requests in origin
form may be turned into absolute form. For example, one of the
participants in the discussion in the ATS issue on this topic seems to
think such a request is malformed, which to me seems to indicate that
the new wording, even if more general and precise, might be a bit more
difficult to grasp:

   https://github.com/apache/trafficserver/issues/11765#issuecomment-2347015362

We've proposed workarounds for this that consist of rewriting the
request line from the Host part, but I wouldn't like to see this become
widespread. It's ugly and error-prone. Similarly, it seems that the
projects have also considered implementing an option to accept a request
in origin form, but this seems a bit convoluted to me in that it
requires more effort from their users, and it clearly raises the
question of the relevance of origin vs absolute form on the internet
nowadays.

I'm not sure what can/should be done at this point to limit the risk
that this issue becomes more common. In our case, we've gone to great
lengths to respect 911x as closely as possible, preserving both origin
and absolute forms across H1->H2->H1 transformations. This is important
for our users because many of them enforce routing or filtering rules
that apply to the URI, for example, and hence often expect an origin
form in HTTP/1 on some legacy configs. But is origin form still relevant
beyond HTTP/1, or should we all enforce absolute form everywhere by
default? Or, if origin form remains necessary (I think so), should we
try to improve the wording of the spec to make it clear that it's still
permitted?

It's hard for me to propose anything since the wording *is* correct, but
it is probably not as intuitive as the previous one when reading 9113
alone. Maybe it would be sufficient to insert in 9113 a paragraph close
to the one from RFC7540 above?

It could look a bit like this:

    Requests sent in origin form lack :authority and use Host instead.
    Requests in absolute form use :authority and MAY also have an
    optional Host header field that MUST match :authority. Clients
    SHOULD prefer the absolute form. Intermediaries converting HTTP/1.1
    requests to HTTP/2 MUST apply the same form as they received.

I'm interested in opinions and suggestions on this topic.

Thanks!
Willy

Received on Monday, 16 September 2024 06:50:24 UTC