Re: [whatwg/url] How should parser handle percent-encoded characters like `%66` U+0066 (f) in path segments? (#565)

+100

It's important to keep in mind that RFC 3986 says that URIs which differ only in the percent-encoding of unreserved characters should be considered equivalent. So even if browsers preserve encoding variations in the URL string, that doesn't necessarily mean they treat those URLs differently. Nor does identical parser output prove that the new standard avoids a major, breaking logic change. Correct me if I'm wrong, but I don't believe the WPT suite includes tests for equality.
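
To make the concern concrete, here is a minimal sketch of what a naive equality check sees, assuming a runtime whose `URL` class follows the WHATWG standard (recent Node.js or a browser). The `sameSerialization` helper is a name I made up for illustration; it is not part of any API.

```typescript
// Parse two spellings of the same path with the WHATWG URL API.
const encoded = new URL("https://example.com/%66oo");
const plain = new URL("https://example.com/foo");

// Hypothetical helper: the obvious way to compare two parsed URLs.
function sameSerialization(a: URL, b: URL): boolean {
  return a.href === b.href;
}

// The parser keeps "%66" as written instead of decoding it to "f",
// so the two serializations are different strings.
console.log(encoded.pathname); // "/%66oo"
console.log(plain.pathname); // "/foo"
console.log(sameSerialization(encoded, plain)); // false
```

Under RFC 3986 these two would be equivalent, since `f` is an unreserved character.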

Unfortunately, having differently-encoded URLs compare as unequal makes the new standard almost non-viable in a variety of contexts. It demands that users have intimate knowledge of the URL standard and of which percent-encode set applies in a given situation (including the fact that the set varies with the URL's scheme: query vs. special-query), and it appears to defeat the purpose of `encodeURIComponent` and similar utilities.

I strongly urge that the standard be amended to normalise percent-encoding during parsing, and not just for the path.
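
As a rough sketch of the kind of normalisation I mean, the following applies the two percent-encoding rules from RFC 3986 (sections 6.2.2.1 and 6.2.2.2) to an already-serialized URL: escapes of unreserved characters are decoded, and the hex digits of all other escapes are uppercased. `normalizePercentEncoding` and `equivalentUrls` are hypothetical names, and running this over the whole `href` after parsing is a simplifying assumption; a real change to the standard would do this per component inside the parser.

```typescript
// Unreserved characters per RFC 3986, section 2.3: ALPHA / DIGIT / "-" / "." / "_" / "~".
const UNRESERVED = /^[A-Za-z0-9\-._~]$/;

// Hypothetical helper: decode escapes of unreserved characters and
// uppercase the hex digits of every other escape.
function normalizePercentEncoding(input: string): string {
  return input.replace(/%([0-9A-Fa-f]{2})/g, (_match: string, hex: string) => {
    const ch = String.fromCharCode(parseInt(hex, 16));
    return UNRESERVED.test(ch) ? ch : "%" + hex.toUpperCase();
  });
}

// Hypothetical helper: compare two URLs after parsing and normalising.
function equivalentUrls(a: string, b: string): boolean {
  return (
    normalizePercentEncoding(new URL(a).href) ===
    normalizePercentEncoding(new URL(b).href)
  );
}

console.log(normalizePercentEncoding("/%66oo")); // "/foo"
console.log(normalizePercentEncoding("/a%2fb")); // "/a%2Fb"
console.log(equivalentUrls("https://example.com/%66oo", "https://example.com/foo")); // true
```

Note that reserved characters stay encoded, so `/a%2Fb` and `/a/b` remain distinct; only spellings that RFC 3986 already treats as equivalent are collapsed.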

Reply to this email directly or view it on GitHub:
https://github.com/whatwg/url/issues/565#issuecomment-805063048

Received on Tuesday, 23 March 2021 16:52:56 UTC