Re: [whatwg/url] Unicode normalization could change the structure of a URL (#626)

> Why would it not send back the serialization of what it parsed?

Because that's how OAuth2 works. It seems you are proposing a new standard to replace OAuth2.

> Also, who does the normalization from `＃` to `#`? That's not allowed.

Perhaps you assumed that normalization should never happen, but Unicode normalization is necessary to implement the Unicode standard efficiently and correctly, so it cannot be avoided. The problem is that certain normalization forms can alter the structure of a URL. `＃` (U+FF03 FULLWIDTH NUMBER SIGN) and `#` are "compatible" (compatibility equivalents) according to the Unicode standard, so NFKC and NFKD would map `＃` to `#`. (NFD could also cause problems in general, but it is probably safe in this case.)
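To make the point above concrete, here is a minimal Python sketch using the standard `unicodedata` module. It assumes the character in question is U+FF03 FULLWIDTH NUMBER SIGN. The `naive_host` helper is hypothetical and deliberately simplified; it is not the WHATWG URL parser, only a stand-in to show how the apparent host can change once a compatibility normalization is applied.

```python
import unicodedata

FULLWIDTH_HASH = "\uff03"  # U+FF03 FULLWIDTH NUMBER SIGN (assumed character)


def naive_host(url: str) -> str:
    """Hypothetical, simplified host extraction (not the WHATWG parser).

    Takes everything after "://", cuts at the first ASCII "/", "?" or "#",
    then drops any userinfo before the last "@".
    """
    authority = url.split("://", 1)[1]
    for delimiter in "/?#":
        authority = authority.split(delimiter, 1)[0]
    return authority.rsplit("@", 1)[-1]


# Apply each of the four normalization forms to the single character.
forms = {
    form: unicodedata.normalize(form, FULLWIDTH_HASH)
    for form in ("NFC", "NFD", "NFKC", "NFKD")
}

url = "https://whatwg.org" + FULLWIDTH_HASH + "@evil.com"
host_before = naive_host(url)
host_after = naive_host(unicodedata.normalize("NFKC", url))

print(forms)
print(host_before, host_after)
```

Under these assumptions, only the compatibility forms (NFKC and NFKD) turn the fullwidth character into an ASCII `#`, and the host seen by the simplified helper differs before and after normalization.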

I am fully aware of W3C's general recommendation that NFC (and perhaps percent-encoding related stuff) should be used, but most people are not aware of the subtle differences between normalization forms, and security bugs have been reported for many major frameworks, as shown above.

> It's also not clear how this relates to STD3 as that is purely about ASCII code points and doesn't do normalization of non-ASCII code points as far as I know.

"STD3 Rules" and "STD3" are different. I will review and revise my original proposal in case my wording was confusing. STD3 is a standard about ASCII code points. STD3 **Rules** (not STD3 itself) are to prevent dangerous non-ASCII code points which could generate ASCII code points disallowed by the STD3 after certain normalization. Please take a look at https://unicode.org/reports/tr46/#STD3_Rules and also the IDNA mapping table.

> (And you cannot outlaw `https://evil.com#@google.com` as that's a perfectly valid URL string; identical to `https://evil.com/#@google.com`.)

I only proposed to outlaw `https://whatwg.org＃@evil.com` on the basis that `＃` could be normalized to `#`. I am sorry for giving the wrong impression that I intended to outlaw both. By "dangerous/problematic" I was only referring to `＃` or other characters with similar properties.
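One possible way to identify "characters with similar properties" is sketched below in Python. This is an illustration only, not the UTS #46 algorithm and not part of the proposal text: the `risky_code_points` function and the `URL_DELIMITERS` set are hypothetical names, and the choice of delimiters is an assumption. The idea is to flag any non-ASCII character whose NFKC form contains an ASCII character that is structural in a URL.

```python
import unicodedata

# Assumed set of ASCII characters that carry structural meaning in a URL.
URL_DELIMITERS = "/?#@:"


def risky_code_points(text, delimiters=URL_DELIMITERS):
    """Return (character, NFKC form) pairs for non-ASCII characters
    whose NFKC form contains one of the given ASCII delimiters."""
    findings = []
    for ch in text:
        if ch.isascii():
            continue  # ASCII delimiters are already visible to the parser
        mapped = unicodedata.normalize("NFKC", ch)
        if any(d in mapped for d in delimiters):
            findings.append((ch, mapped))
    return findings


# U+FF03 FULLWIDTH NUMBER SIGN is flagged because NFKC maps it to "#".
findings = risky_code_points("https://whatwg.org\uff03@evil.com")
print(findings)
```

A check like this would leave `https://evil.com#@google.com` alone, since every character in it is ASCII, while flagging the variant that only becomes a fragment delimiter after normalization.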

-- 
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/url/issues/626#issuecomment-892695499

Received on Wednesday, 4 August 2021 14:14:59 UTC