- From: Stefan Eissing <stefan.eissing@greenbytes.de>
- Date: Thu, 19 Aug 2021 12:09:17 +0200
- To: HTTP Working Group <ietf-http-wg@w3.org>
- Message-Id: <AF47963F-0B13-4FA5-8AC4-B761AAAD630A@greenbytes.de>
Missed the shift key in my reply, sorry.

> Begin forwarded message:
>
> From: Stefan Eissing <stefan.eissing@greenbytes.de>
> Subject: Re: Subtle incompatibility between H2 and H1's :path
> Date: 19 August 2021 at 11:20:36 CEST
> To: Willy Tarreau <w@1wt.eu>
>
> Thanks for writing this up in such a nice way.
>
> I agree that we should have a common, interoperable definition. Divergent
> handling in implementations can result, and has resulted, in
> vulnerabilities. The main reason is that existing code for h1 protocol
> semantics was not reused for h2 semantics. While the reasons for that were
> mainly outside the scope of standards, different definitions in h1 and h2
> do not help.
>
> - Stefan
>
>> On 19.08.2021 at 07:59, Willy Tarreau <w@1wt.eu> wrote:
>>
>> Hello,
>>
>> after tightening up the :path parser in haproxy to strictly comply with
>> both RFC7540 and the latest draft, one user of a large hosting platform
>> reported breakage of at least one hosted site which contains a few HTML
>> links with the path beginning with two slashes, resulting from the
>> concatenation of a base URL ending with a slash and a prefix. E.g.:
>>
>>    <img src="https://site.example.org//static/image.jpg">
>>
>> At first I responded "that's expected as it is explicitly forbidden by
>> the H2 spec (RFC7540), which says":
>>
>>    "The ":path" pseudo-header field includes the path and query parts
>>    of the target URI (the "path-absolute" production and optionally a
>>    '?' character followed by the "query" production (see Sections 3.3
>>    and 3.4 of [RFC3986])."
>>
>> And RFC3986#3.3:
>>
>>    path-absolute   ; begins with "/" but not "//"
>>    path-absolute = "/" [ segment-nz *( "/" segment ) ]
>>    segment-nz    = 1*pchar
>>    segment       = *pchar
>>
>> Then I wondered why, before this change, the request was processed by the
>> HTTP/1.1 backend server: had it been too lenient, or was there a
>> difference in the protocol spec? The answer is the latter. In RFC7230
>> #2.7, a purposely different absolute-path is defined:
>>
>>    An "absolute-path" rule is defined for protocol elements that can
>>    contain a non-empty path component. (This rule differs slightly from
>>    the path-abempty rule of RFC 3986, which allows for an empty path to
>>    be used in references, and path-absolute rule, which does not allow
>>    paths that begin with "//".)
>>
>>    request-line   = method SP request-target SP HTTP-version CRLF
>>    request-target = origin-form
>>                   / absolute-form
>>                   / authority-form
>>                   / asterisk-form
>>
>>    origin-form    = absolute-path [ "?" query ]
>>    absolute-path  = 1*( "/" segment )
>>
>> And this version is the one that was adopted by the HTTP core spec, but
>> the H2 spec keeps its difference with path-absolute, which cannot start
>> with "//", even in the latest draft.
>>
>> This use of "path-absolute" was introduced into the H2 spec between draft
>> 04 and draft 05 when trying to make the definition of :path more precise.
>> And I think that at the time the difference between HTTP/1's
>> absolute-path and RFC3986's path-absolute was simply overlooked.
>>
>> Given that in the report above the browsers happily sent the request using
>> the HTTP definition of absolute-path and not RFC3986's definition of
>> path-absolute (thus violating RFC7540), that sites *are* written to rely
>> on this, that this seems to be how other H2 implementations are currently
>> handling it, and that the new HTTP spec defines the format of a
>> request-target in origin form as an absolute-path as well, I think we
>> should fix the latest H2 draft to adopt the common definition of
>> absolute-path (which explicitly permits "//") and stop keeping a
>> non-interoperable exception here.
>>
>> Does anyone disagree?
>>
>> Thanks,
>> Willy
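
The difference between the two productions quoted above can be checked mechanically. The following is only a rough sketch, assuming the ABNF is translated directly into regular expressions; the names PCHAR, PATH_ABSOLUTE, ABSOLUTE_PATH and the two helper functions are hypothetical and not taken from haproxy or any other implementation, and the optional "?" query part of :path is not handled.

```python
import re

# pchar per RFC 3986: unreserved / pct-encoded / sub-delims / ":" / "@"
PCHAR = r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})"

# RFC 3986 section 3.3: path-absolute = "/" [ segment-nz *( "/" segment ) ]
# The first segment, if present, must be non-empty, so "//" cannot match.
PATH_ABSOLUTE = re.compile(rf"/(?:{PCHAR}+(?:/{PCHAR}*)*)?")

# RFC 7230 section 2.7: absolute-path = 1*( "/" segment )
# Every segment may be empty, so a leading "//" is accepted.
ABSOLUTE_PATH = re.compile(rf"(?:/{PCHAR}*)+")


def is_path_absolute(path: str) -> bool:
    """True if path matches the RFC 3986 path-absolute production."""
    return PATH_ABSOLUTE.fullmatch(path) is not None


def is_absolute_path(path: str) -> bool:
    """True if path matches the RFC 7230 absolute-path production."""
    return ABSOLUTE_PATH.fullmatch(path) is not None


if __name__ == "__main__":
    for candidate in ("/static/image.jpg", "//static/image.jpg", "/"):
        print(candidate, is_path_absolute(candidate), is_absolute_path(candidate))
```

Under these assumptions, the path from the reported link, "//static/image.jpg", is rejected by the RFC 3986 rule and accepted by the RFC 7230 rule, while ordinary paths such as "/static/image.jpg" satisfy both.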
Received on Thursday, 19 August 2021 10:09:35 UTC