- From: Amos Jeffries <squid3@treenet.co.nz>
- Date: Sat, 20 Sep 2025 10:41:32 +1200
- To: ietf-http-wg@w3.org
On 20/09/25 03:57, Thibault Meunier wrote:
> Thanks for the response.
>
> I'm still confused. As quoted below, RFC 9421 Section 2.2.6 states
>
> ```
> Path components are represented by their values before decoding any percent-encoded octets, as described in the simple string comparison rules provided in Section 6.2.1 of [URI]
> ```
>
> As I understand, as an implementer, if my library receives a URL with the percent-encoded path `/%7Esmith`, I should keep it unmodified.
>

That depends on how you are _using_ the URL path.

* If you need to compare it to other URLs received from non-HTTP sources, you will need a normalized representation to compare with and against.

* If you are translating to a non-HTTP destination, you may need to normalize in order to correctly re-encode for the translation.

* If the URL is an HTTP message component you are relaying, then you can choose whether to normalize your output or leave it as received. The recipient should be following these same normalization-before-use rules and cope with both inputs.

> Let's take the example below
>
> ```
> normalise_url_rfc9421(url: string) -> string
> ```
>
> If the path is `/%7Esmith`, what is the normalised output?
>
> 1. "/~smith"
> 2. "/%7Esmith"
>
> With RFC 9421 rule, I'd say 2 (before decoding any percent-encoded value). But with your response, which aligns with RFC 9110 Section 4.2.3, I'm tempted to say 1.

Yes. Normalized form is *after* decoding the unnecessarily encoded characters.

So (1) is normalized. (2) is just one of many representations. There are also things like:

 3. "/~%73mi%74h"

which normalizes to (1), and so (1) == (2) == (3) when compared.

Cheers
AYJ
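
The equivalence of (1), (2) and (3) above can be illustrated with a minimal Python sketch. It assumes that "unnecessarily encoded" means the unreserved set from RFC 3986 (ALPHA, DIGIT, "-", ".", "_", "~"), and that the remaining percent-encodings only have their hex digits uppercased. The name `normalize_path` is hypothetical and does not come from any RFC or library, and the sketch does not handle dot-segment removal or other normalization steps.

```python
import re

# Unreserved characters per RFC 3986 Section 2.3 (assumed to be the
# set of "unnecessarily encoded" characters discussed in the thread).
UNRESERVED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)

_PCT = re.compile(r"%([0-9A-Fa-f]{2})")


def normalize_path(path: str) -> str:
    """Decode percent-encoded unreserved characters; uppercase the rest."""

    def repl(match: re.Match) -> str:
        ch = chr(int(match.group(1), 16))
        if ch in UNRESERVED:
            # Encoding was unnecessary, so emit the literal character.
            return ch
        # Keep reserved or non-ASCII octets encoded, with uppercase hex.
        return "%" + match.group(1).upper()

    return _PCT.sub(repl, path)


candidates = ["/~smith", "/%7Esmith", "/~%73mi%74h"]
# All three representations collapse to a single normalized form.
print({normalize_path(p) for p in candidates})
```

Comparing the outputs of `normalize_path` rather than the raw strings is what makes the three forms equal; an encoded reserved character such as `%2f` stays encoded and is only case-normalized.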
Received on Friday, 19 September 2025 22:41:39 UTC