- From: Thibault Meunier <ot-ietf@thibault.uk>
- Date: Fri, 19 Sep 2025 15:57:49 +0000
- To: Carsten Bormann <cabo@tzi.org>
- Cc: Working Group HTTP <ietf-http-wg@w3.org>
Thanks for the response. I'm still confused.

As quoted below, RFC 9421 Section 2.2.6 states

```
Path components are represented by their values before decoding any percent-encoded octets, as described in the simple string comparison rules provided in Section 6.2.1 of [URI]
```

As I understand, as an implementer, if my library receives a URL with the percent-encoded path `/%7Esmith`, I should keep it unmodified.

Let's take the example below

```
normalise_url_rfc9421(url: string) -> string
```

If the path is `/%7Esmith`, what is the normalised output?

1. "/~smith"
2. "/%7Esmith"

With RFC 9421 rule, I'd say 2 (before decoding any percent-encoded value). But with your response, which aligns with RFC 9110 Section 4.2.3, I'm tempted to say 1.

Thanks again, and apologies if this is redundant,
Thibault

On Friday, September 19th, 2025 at 4:59 PM, Carsten Bormann <cabo@tzi.org> wrote:

> On Sep 19, 2025, at 10:11, Thibault Meunier ot-ietf@thibault.uk wrote:
>
> > The two statements appear to be in conflict.
>
> I don’t think they are, see below.
>
> > I'm not sure which one applies with regard to percent-encoded octets.
> >
> > Let's take the example from [HTTP] section 4.2.3 `/%7Esmith`
> >
> > Which of the following should be provided when used as "@path":
> >
> > 1. "/~smith"
> > 2. "/%7Esmith"
> >
> > Most implementations seems to use 2.
>
> Please see RFC 3986, 2.2.
> To save some busywork, RFC 3986, 2.3 says:
>
> unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
>
> So ~ is definitely among the characters that normalize to themselves, i.e., it is normalized to ~, not %7E.
>
> Note that RFC 3986 2.4 para 2 also uses ~ as an example.
>
> Grüße, Carsten
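
To make the two candidate outputs concrete, here is a minimal Python sketch of both readings. The function names `keep_percent_encoding` and `decode_unreserved` are hypothetical and come from neither RFC. The sketch only illustrates what each reading would produce for a path; it does not claim to settle which one RFC 9421 intends.

```python
import re
import string

# RFC 3986 Section 2.3: unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

# Matches one percent-encoded octet such as %7E or %2f.
_PCT_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def keep_percent_encoding(path: str) -> str:
    """Option 2 (hypothetical helper): return the path exactly as received."""
    return path


def decode_unreserved(path: str) -> str:
    """Option 1 (hypothetical helper): decode only escapes of unreserved characters.

    Escapes of any other octet (for example %2F or %20) are left untouched.
    """
    def replace(match: re.Match) -> str:
        char = chr(int(match.group(1), 16))
        return char if char in UNRESERVED else match.group(0)

    return _PCT_ESCAPE.sub(replace, path)


if __name__ == "__main__":
    for path in ("/%7Esmith", "/a%2Fb", "/%41%20b"):
        print(path, keep_percent_encoding(path), decode_unreserved(path))
```

For `/%7Esmith`, the first helper returns the input unchanged and the second returns `/~smith`. The two helpers agree whenever the path contains no percent-encoded unreserved characters, so the ambiguity only affects inputs such as the one in this thread.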
Received on Friday, 19 September 2025 15:58:00 UTC