[whatwg/url] Percent encode NULLs in fragments (#440)

Currently the URL parser removes NULL characters from fragments. But is this really needed?
My tests show that most browsers percent-encode NULLs (I tested the URLs "`https://example.com/#abc\u0000xyz`" and "`non-spec://example.com/#abc\u0000xyz`"):

Browser  | Special URL's hash | Non-special URL's hash
---------|--------------------|----------------------
Chrome   | `#abcxyz`          | `#abc%00xyz`
Edge, IE | `#abc`             | `#abc`
Firefox  | `#abc%00xyz`       | `#abc%00xyz`
Safari   | `#abc%00xyz`       | `#abc%00xyz`

So Chrome removes NULLs only in special URLs. In Edge and IE, a NULL character
marks the end of the string.

I found one reason why Chrome removes NULLs in its source code:
https://chromium.googlesource.com/chromium/src/+/refs/tags/76.0.3803.1/url/url_canon_etc.cc#304
```c++
  for (int i = ref.begin; i < end; i++) {
    if (spec[i] == 0) {
      // IE just strips NULLs, so we do too.
      continue;
    }
```

But as the table shows, Chrome and IE (and Edge) do not actually handle NULLs the same way.

I suggest following Firefox and Safari and percent-encoding NULLs in fragments.
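
To make the difference concrete, here is a minimal sketch that models the three behaviours from the table as standalone functions. The function names are hypothetical and the code is not taken from any browser or from the URL Standard. It handles only NULL, while a real parser also percent-encodes other code points in the fragment.

```c++
#include <string>

// Hypothetical helpers; each one models a single column of the table.
// Only the NULL character is handled here.

// Chrome, special URLs: drop every NULL.
std::string StripNulls(const std::string& fragment) {
  std::string out;
  for (char c : fragment) {
    if (c != '\0') out += c;
  }
  return out;
}

// Edge and IE: treat the first NULL as the end of the string.
std::string TruncateAtNull(const std::string& fragment) {
  // find() returns npos when there is no NULL, so substr() keeps everything.
  return fragment.substr(0, fragment.find('\0'));
}

// Firefox and Safari (the behaviour suggested here): emit "%00" for each NULL.
std::string PercentEncodeNulls(const std::string& fragment) {
  std::string out;
  for (char c : fragment) {
    if (c == '\0') {
      out += "%00";
    } else {
      out += c;
    }
  }
  return out;
}
```

For the fragment `abc\u0000xyz`, built as `std::string("abc\0xyz", 7)` so that the embedded NULL is kept, these return `abcxyz`, `abc` and `abc%00xyz` respectively. Only the last one lets a consumer tell that the input contained a NULL.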


Received on Thursday, 23 May 2019 16:19:52 UTC