Re: HTTP Message Signatures [RFC 9421] percent-encoded URLs normalisation

On 20/09/25 03:57, Thibault Meunier wrote:
> Thanks for the response.
> 
> I'm still confused. As quoted below, RFC 9421 Section 2.2.6 states
> 
> ```
> Path components are represented by their values before decoding any percent-encoded octets, as described in the simple string comparison rules provided in Section 6.2.1 of [URI]
> ```
> 
> As I understand, as an implementer, if my library receives a URL with the percent-encoded path `/%7Esmith`, I should keep it unmodified.
> 

That depends on how you are _using_ the URL path.

  * If you need to compare it to other URLs received from non-HTTP 
sources, you will need a normalized representation on both sides of 
the comparison.

  * If you are translating to a non-HTTP destination, you may need to 
normalize in order to correctly re-encode for the translation.

  * If the URL is an HTTP message component you are relaying, then you 
can choose whether to normalize your output or leave as-received. The 
recipient should be following these same normalization-before-use rules 
and cope with both inputs.


> Let's take the example below
> 
> ```
> normalise_url_rfc9421(url: string) -> string
> ```
> 
> If the path is `/%7Esmith`, what is the normalised output?
> 
> 1. "/~smith"
> 2. "/%7Esmith"
> 
> With RFC 9421 rule, I'd say 2 (before decoding any percent-encoded value). But with your response, which aligns with RFC 9110 Section 4.2.3, I'm tempted to say 1.


Yes. The normalized form is the one *after* decoding the unnecessarily 
encoded characters.

So (1) is normalized. (2) is just one of many representations.

There are also things like:

  3. "/~%73mi%74h"

which normalizes to (1) and so (1) == (2) == (3) when compared.
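
A rough sketch of that comparison, in case it helps. This is only an 
illustration, assuming the percent-encoding and case normalization rules 
of RFC 3986 Sections 6.2.2.1 and 6.2.2.2; the name `normalize_path` is 
made up for the example and is not taken from any RFC or library. It does 
not attempt dot-segment removal or any scheme-specific rules.

```python
import re

# Unreserved characters per RFC 3986 Section 2.3. Percent-encoding
# these is never required, so a normalizer may decode them.
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)

_PCT = re.compile(r"%([0-9A-Fa-f]{2})")


def normalize_path(path: str) -> str:
    """Hypothetical helper: decode only the unnecessarily encoded octets."""

    def fix(match: re.Match) -> str:
        hex_digits = match.group(1)
        char = chr(int(hex_digits, 16))
        if char in UNRESERVED:
            # Unnecessarily encoded: replace with the literal character.
            return char
        # Anything else stays encoded; only the hex digits are uppercased.
        return "%" + hex_digits.upper()

    return _PCT.sub(fix, path)


candidates = ["/~smith", "/%7Esmith", "/~%73mi%74h"]
normalized = {normalize_path(p) for p in candidates}
print(normalized)
```

All three inputs collapse to the single form "/~smith". A reserved 
character such as "%2f" is deliberately left encoded (only uppercased to 
"%2F"), since decoding it would change the meaning of the path.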


Cheers
AYJ

Received on Friday, 19 September 2025 22:41:39 UTC