Re: [whatwg/url] Should we unescape characters in path? (#606)

It sounds like a good idea to decode them, IMO. The [latest HTTP semantics draft spec](https://httpwg.org/http-core/draft-ietf-httpbis-semantics-latest.html#rfc.section.4.2.3) says:

> Scheme-based normalization (Section 6.2.3 of [RFC3986]) of "http" and "https" URIs involves the following additional rules:
> ...
> Characters other than those in the "reserved" set are equivalent to their percent-encoded octets: the normal form is to not encode them (see Sections 2.1 and 2.2 of [RFC3986]).

We already do the other HTTP-specific normalisations (removing default ports, using the root path instead of an empty one, lowercasing the host name), as well as other normalisations (e.g. exotic IP addresses), so I think it makes sense to do this, too. Some part of the system will have to do it eventually, so it is best done as early as possible, at the URL level, to avoid mismatches like those you’ve described.
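
To make the proposal concrete, here is a rough sketch of what decoding in the path could look like. It is not taken from the URL Standard or from any implementation: the names `UNRESERVED`, `PERCENT_TRIPLET` and `decode_unreserved` are made up for illustration, and it assumes that only the RFC 3986 "unreserved" set (ALPHA, DIGIT, `-`, `.`, `_`, `~`) is decoded, with every other escape left untouched.

```python
import re

# Assumption: only the RFC 3986 "unreserved" set is safe to decode.
UNRESERVED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)

# A percent sign followed by exactly two hex digits.
PERCENT_TRIPLET = re.compile(r"%([0-9A-Fa-f]{2})")


def decode_unreserved(path):
    """Decode percent-encoded triplets that stand for unreserved characters.

    Reserved characters (such as "/"), non-ASCII bytes and "%" itself stay
    encoded, so the result is never decoded a second time by accident.
    """

    def replace(match):
        char = chr(int(match.group(1), 16))
        return char if char in UNRESERVED else match.group(0)

    # re.sub scans left to right without overlapping matches, so "%257E"
    # is seen as "%25" followed by the literal text "7E".
    return PERCENT_TRIPLET.sub(replace, path)


if __name__ == "__main__":
    print(decode_unreserved("/%7Euser/%61bc"))  # /~user/abc
    print(decode_unreserved("/a%2Fb"))          # /a%2Fb
    print(decode_unreserved("/%257E"))          # /%257E
```

Leaving `%2F` and `%25` alone is deliberate in this sketch: decoding them would change how the path splits into segments or how a later decoder reads it, which is a different question from the equivalence the HTTP draft describes.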


-- 
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/url/issues/606#issuecomment-845877662

Received on Friday, 21 May 2021 11:15:36 UTC