Re: [whatwg/url] Unicode normalization could change the structure of a URL (#626)

> I have to say that I don't see the problem here. `＃` is not `#`. Are you suggesting that someone who looks at a URL, but does not use a parser to obtain the host might be misled? Because that's not a thing we try to protect against.

No. A standard-compliant parser is used both before and after the Unicode normalization. The problem is not (just) someone checking URLs with their own eyes and getting confused; one can already craft long, heavily percent-encoded URLs that baffle any human being. The security issue here is that things can still go wrong **even for standard-compliant parsers.**

I did not elaborate on the actual attacks because I intended to post only a summary here. Let me walk through one attack on OAuth2 that exploits this. In [CVE-2019-0654](https://msrc.microsoft.com/update-guide/en-US/vulnerability/CVE-2019-0654), for example, Microsoft IE/Edge dangerously normalized the content of the HTTP `Location` header (which a typical OAuth2 flow uses to redirect users back to the requesting website), so a crafted URL could be used to steal the OAuth2 token. More specifically, when the authorization server sees
```
https://evil.com＃@google.com
```
it would think the request comes from Google and might approve it for that reason. But when the redirect URL (with the token appended) is sent back to a user running an older version of Microsoft IE/Edge, the browser normalizes it to
```
https://evil.com#@google.com?code=xxxxx&session_state=...
```
and thus, because the host is now `evil.com` and everything after the `#` is only a fragment, the OAuth2 token `xxxxx` (the most important part of OAuth2) ends up with `evil.com` instead of `google.com`. At this point, the attack has succeeded.
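
To make the host change concrete, here is a minimal Python sketch. It assumes the look-alike character is U+FF03 FULLWIDTH NUMBER SIGN and that the normalization applied is NFKC. `naive_host` is a hypothetical helper that only mimics the authority split (it is not the WHATWG algorithm), and it is used for the "before" view because Python's own `urlsplit` may refuse such input.

```python
import unicodedata
from urllib.parse import urlsplit


def naive_host(url: str) -> str:
    """Hypothetical, simplified authority split (not the full WHATWG algorithm)."""
    rest = url.split("://", 1)[1]
    # The authority ends at the first ASCII '/', '?' or '#'.
    for i, ch in enumerate(rest):
        if ch in "/?#":
            rest = rest[:i]
            break
    # Everything before the last '@' is userinfo; the remainder is the host.
    return rest.rsplit("@", 1)[-1]


# Assumed look-alike: U+FF03 FULLWIDTH NUMBER SIGN, which is not a delimiter.
redirect = "https://evil.com\uff03@google.com"

# What the authorization server sees before any normalization.
host_before = naive_host(redirect)

# What a client sees if it applies NFKC after the code has been appended.
normalized = unicodedata.normalize("NFKC", redirect + "?code=xxxxx")
parts = urlsplit(normalized)
host_after = parts.hostname
fragment_after = parts.fragment

print(host_before)     # google.com
print(host_after)      # evil.com
print(fragment_after)  # @google.com?code=xxxxx
```

Both views come from a consistent parse of the string each party actually holds; the disagreement is caused entirely by the normalization step in between.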

My main point is that this is more serious than an inconvenience. Here are a few other similar CVEs already listed in the HostSplit presentation:
- [CVE-2019-0657](https://msrc.microsoft.com/update-guide/vulnerability/CVE-2019-0657): .NET and Visual Studio
- [CVE-2019-9636](https://bugs.python.org/issue36216) and [CVE-2019-10160](https://bugs.python.org/issue36742): Python
- [CVE-2019-2816](https://nvd.nist.gov/vuln/detail/CVE-2019-2816): Java

I am not a security expert, so there might be more creative ways to exploit this. Given how widespread URLs are in various standards, some directly maintained by WHATWG, I believe this is a serious concern the working group should address. I personally prefer forbidding these characters, but if the working group decides to continue allowing these dangerous characters in URLs, I urge the working group to at least include a vivid security warning in the standard.
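
As a rough illustration of what "forbidding these characters" could mean in practice, the sketch below flags any URL containing a character whose NFKC form would introduce one of the delimiters `/ ? # @ :`. The function name `introduces_delimiter` and the exact delimiter set are my assumptions for illustration, not wording proposed for the standard.

```python
import unicodedata

# Assumed set of structurally significant URL delimiters.
DELIMITERS = set("/?#@:")


def introduces_delimiter(url: str) -> bool:
    """Return True if NFKC-normalizing any character would add a new delimiter."""
    for ch in url:
        if ch in DELIMITERS:
            continue  # already a literal delimiter; normalization changes nothing
        if DELIMITERS & set(unicodedata.normalize("NFKC", ch)):
            return True
    return False


print(introduces_delimiter("https://evil.com\uff03@google.com"))  # True
print(introduces_delimiter("https://evil.c\u2100.example.com"))   # True (U+2100 becomes "a/c")
print(introduces_delimiter("https://google.com/path?x=1#frag"))   # False
```

A parser could reject flagged input outright, or a security note could recommend running such a check before any component normalizes a URL.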

Reply to this email directly or view it on GitHub:
https://github.com/whatwg/url/issues/626#issuecomment-892641568

Received on Wednesday, 4 August 2021 13:06:49 UTC