Re: Subtle incompatibility between H2 and H1's :path

From: Martin Thomson <mt@lowentropy.net>
Date: Thu, 19 Aug 2021 22:58:58 +1000
Message-Id: <6ae99a76-8390-42e8-b3ed-bfe5097632b4@www.fastmail.com>
To: ietf-http-wg@w3.org
See https://github.com/httpwg/http2-spec/pull/910 for the fix.  (Meaning that I agree with your analysis.)

On Thu, Aug 19, 2021, at 15:59, Willy Tarreau wrote:
> Hello,
> 
> after tightening up the :path parser in haproxy to strictly comply with
> both RFC7540 and the latest draft, one user of a large hosting platform
> reported breakage of at least one hosted site which contains a few HTML
> links with the path beginning with two slashes, resulting from the
> concatenation of a base URL ending with a slash and a prefix. E.g:
> 
>     <img src="https://site.example.org//static/image.jpg">
> 
> At first I responded "that's expected as it is explicitly forbidden by
> the H2 spec (RFC7540), which says":
> 
>      "The ":path" pseudo-header field includes the path and query parts
>       of the target URI (the "path-absolute" production and optionally a
>       '?' character followed by the "query" production (see Sections 3.3
>       and 3.4 of [RFC3986])."
> 
>    And RFC3986#3.3:
> 
>       path-absolute   ; begins with "/" but not "//"
>       path-absolute = "/" [ segment-nz *( "/" segment ) ]
>       segment-nz    = 1*pchar
>       segment       = *pchar
> 
> Then I wondered why, before this change, the request had been processed by
> the HTTP/1.1 backend server: had it been too lenient, or was there a
> difference in the protocol spec? The answer is the latter. RFC7230 #2.7
> purposely defines a different absolute-path:
> 
>   An "absolute-path" rule is defined for protocol elements that can
>   contain a non-empty path component.  (This rule differs slightly from
>   the path-abempty rule of RFC 3986, which allows for an empty path to
>   be used in references, and path-absolute rule, which does not allow
>   paths that begin with "//".)
> 
>      request-line   = method SP request-target SP HTTP-version CRLF
>      request-target = origin-form
>                     / absolute-form
>                     / authority-form
>                     / asterisk-form
> 
>      origin-form    = absolute-path [ "?" query ]
>      absolute-path = 1*( "/" segment )
> 
> This version is the one that was adopted by the HTTP core spec, but the
> H2 spec still uses path-absolute, which cannot start with "//", even in
> the latest draft.
> 
> This use of "path-absolute" was introduced into the H2 spec between draft
> 04 and draft 05 when trying to make the definition of :path more precise.
> I think that at the time the difference between RFC3986's path-absolute
> and HTTP/1's absolute-path was simply overlooked.
> 
> Given that in the report above the browsers happily sent the request using
> the HTTP definition of absolute-path and not RFC3986's definition of
> path-absolute (thus violating RFC7540), that sites *are* written to rely
> on this, that this seems to be how other H2 implementations are currently
> handling it, and that the new HTTP spec defines the format of a request-target
> in origin form as an absolute-path as well, I think we should fix the latest
> H2 draft to adopt the common definition of absolute-path (which explicitly
> permits "//") and stop keeping a non-interoperable exception here.
> 
> Does anyone disagree?
> 
> Thanks,
> Willy
> 
> 
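
For illustration, here is a minimal Python sketch of the two grammars quoted
above, showing how they diverge on a path that begins with "//". It is not
taken from haproxy or any other implementation; the function names
is_path_absolute and is_absolute_path are hypothetical, and the optional
"?" query part is deliberately left out.

```python
import re

# pchar per RFC 3986: unreserved / pct-encoded / sub-delims / ":" / "@"
PCHAR = r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})"

# RFC 3986 section 3.3: path-absolute = "/" [ segment-nz *( "/" segment ) ]
PATH_ABSOLUTE = re.compile(rf"/(?:{PCHAR}+(?:/{PCHAR}*)*)?")

# RFC 7230 section 2.7: absolute-path = 1*( "/" segment )
ABSOLUTE_PATH = re.compile(rf"(?:/{PCHAR}*)+")


def is_path_absolute(path: str) -> bool:
    """Hypothetical helper: True if path matches RFC 3986 path-absolute."""
    return PATH_ABSOLUTE.fullmatch(path) is not None


def is_absolute_path(path: str) -> bool:
    """Hypothetical helper: True if path matches RFC 7230 absolute-path."""
    return ABSOLUTE_PATH.fullmatch(path) is not None


if __name__ == "__main__":
    for p in ("/static/image.jpg", "//static/image.jpg", "/"):
        print(p, is_path_absolute(p), is_absolute_path(p))
```

Under these assumptions, "//static/image.jpg" is rejected by the
path-absolute check but accepted by the absolute-path check, which is the
mismatch described in the report.
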
Received on Thursday, 19 August 2021 12:59:48 UTC