- From: Tatsuhiro Tsujikawa <tatsuhiro.t@gmail.com>
- Date: Wed, 29 Jun 2022 18:20:54 +0900
- To: Willy Tarreau <w@1wt.eu>
- Cc: HTTP <ietf-http-wg@w3.org>
- Message-ID: <CAPyZ6=Kt3rb_qcM9AzvUfj2wSQNmqNo3rOPiCQFbFAyT7sbdeg@mail.gmail.com>
On Wed, Jun 29, 2022 at 2:52 PM Willy Tarreau <w@1wt.eu> wrote:

> Hi Tatsuhiro,
>
> On Wed, Jun 29, 2022 at 08:58:47AM +0900, Tatsuhiro Tsujikawa wrote:
> > RFC 7540 even says that an intermediary MUST omit :authority "when
> > translating from an HTTP/1.1 request that has a request target in
> > origin or asterisk form (see [RFC7230], Section 5.3)."
> >
> > Now RFC 9113 has this text:
> >
> >    An intermediary that forwards a request over HTTP/2 MUST construct
> >    an ":authority" pseudo-header field using the authority
> >    information from the control data of the original request, unless
> >    the original request's target URI does not contain authority
> >    information (in which case it MUST NOT generate ":authority").
> >    Note that the Host header field is not the sole source of this
> >    information; see Section 7.2 of [HTTP].
> >
> > This means :authority must be included if the host header field exists in
> > an HTTP/1.1 request.
>
> My understanding is that Host doesn't necessarily count as "control data"
> here, and that the goal was to accurately represent an HTTP/1.x request
> targeting an HTTP/1.0 server after being transported over HTTP/2. For
> example, let's say that a client passes this to a proxy:
>
>    GET http://example.com/ HTTP/1.0
>    Proxy-connection: keep-alive
>
> and nothing more. If instead it gets sent via a gateway that transports
> it over H2, it could make sense to consider that the scheme is "http",
> the authority is "example.com", that there's no host, hence the request
> would be passed as:
>
>    :method: GET
>    :scheme: http
>    :authority: example.com
>
> and that's all. Conversely, let's see the same HTTP/1.0 request sent
> directly to the origin server:
>
>    GET / HTTP/1.0
>
> There's no more authority nor host, so a gateway receiving that cannot
> invent one, unless it uses its own configured name corresponding to its
> own address, that it expects the client used to construct the request.
>
> With HTTP/1.1 there are fewer ambiguities since Host is mandatory, but
> the distinction between "proxy requests" and origin requests is still
> relevant, especially when you don't know whether or not the origin
> server supports HTTP/1.1 or only 1.0 (and may be confused by the
> presence of an authority in the request line). For example, if a
> client sends:
>
>    GET / HTTP/1.1
>    Host: example.com
>
> to an HTTP/1.0 server that parses Host, it will work. If it sends
>
>    GET http://example.com/ HTTP/1.1
>    Host: example.com
>
> to an HTTP/1.1 server, it will work as well, but it may fail with an
> HTTP/1.0 server (or worse, loop over itself if it supports proxying
> requests and resolves itself as example.com).
>
> If the first request is transported over H2, thus converted from H1 to
> H2 then back from H2 to H1, adding an authority that was not initially
> present would introduce exactly this problem. By not adding it and using
> Host only, the request representation is preserved, and the origin server
> can receive the same request that the client took care to encode, and not
> be confused. That's why I'm saying that in this case it's clearly visible
> that Host isn't part of the "control data" and must not appear in an
> authority that was not initially encoded.
>
> I know it's a bit complicated but we have to deal with history. What we're
> doing in haproxy is that both Host and :authority are used interchangeably
> after having been checked for proper matching, and are modified at the
> same time if needed, and we have a flag indicating if an authority was
> present in the incoming request to know if we have to produce one on
> output or not. That's in the end what seems to preserve the most accurate
> representation along a chain of multiple versions. This allows us to emit
> a Host field only if one was present, and an authority only if one was
> present, regardless of the HTTP version.
> I don't think that RFC9113 brings
> any changes regarding this, it might only be a matter of what constitutes
> "control data".
>

Thank you for the explanation. I reread the relevant section of RFC 9113,
and you are right that it has not changed on this.

Best,
Tatsuhiro Tsujikawa

> Hoping this helps,
> Willy
>
Received on Wednesday, 29 June 2022 09:21:19 UTC
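
The flag-based approach Willy describes (emit a Host field only if one was received, and an authority only if one was received) could be sketched roughly as follows. This is a minimal illustration and not haproxy's actual code: the `Request` class and the `parse_h1`, `to_h2`, and `to_h1` helpers are hypothetical names, the `default_scheme` parameter is an assumption, and the consistency check between Host and the authority that the message mentions is omitted.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple
from urllib.parse import urlsplit


@dataclass
class Request:
    """Version-neutral request representation (hypothetical, for illustration)."""
    method: str
    path: str
    scheme: Optional[str] = None
    host: Optional[str] = None       # Host header field value, if one was received
    authority: Optional[str] = None  # authority from the request target, if any


def parse_h1(request_line: str, headers: Dict[str, str]) -> Request:
    """Parse an HTTP/1.x request line plus headers (keys assumed lowercase)."""
    method, target, _version = request_line.split(" ")
    req = Request(method=method, path=target, host=headers.get("host"))
    if "://" in target:
        # Absolute-form target: the authority is part of the control data.
        parts = urlsplit(target)
        req.scheme = parts.scheme
        req.authority = parts.netloc
        req.path = parts.path or "/"
        if parts.query:
            req.path += "?" + parts.query
    return req


def to_h2(req: Request, default_scheme: str = "http") -> List[Tuple[str, str]]:
    """Emit HTTP/2 fields; ":authority" appears only if one was received."""
    fields = [(":method", req.method), (":scheme", req.scheme or default_scheme)]
    if req.authority is not None:
        fields.append((":authority", req.authority))
    fields.append((":path", req.path))
    if req.host is not None:
        fields.append(("host", req.host))
    return fields


def to_h1(req: Request, version: str = "HTTP/1.1") -> List[str]:
    """Emit an HTTP/1.x request; absolute-form only if an authority was received."""
    if req.authority is not None:
        target = f"{req.scheme}://{req.authority}{req.path}"
    else:
        target = req.path
    lines = [f"{req.method} {target} {version}"]
    if req.host is not None:
        lines.append(f"Host: {req.host}")
    return lines
```

With this sketch, `GET / HTTP/1.1` carrying `Host: example.com` is forwarded over H2 with a `host` field and no `:authority`, while `GET http://example.com/ HTTP/1.0` with no Host is forwarded with `:authority` and no `host`, so converting back to H1 reproduces the request line the client originally sent.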