[whatwg/url] Percent encode NULLs in fragments (#440) from Rimas Misevičius on 2019-05-23 (public-webapps-github@w3.org from May 2019)

From: Rimas Misevičius <notifications@github.com>
Date: Thu, 23 May 2019 16:19:26 +0000 (UTC)
To: whatwg/url <url@noreply.github.com>
Cc: Subscribed <subscribed@noreply.github.com>
Message-ID: <whatwg/url/issues/440@github.com>

Currently the URL parser removes NULL characters from fragments. But is this really needed?
My tests show that most browsers percent-encode NULLs (I tested the URLs "`https://example.com/#abc\u0000xyz`" and "`non-spec://example.com/#abc\u0000xyz`"):

Browser  | Special URL's hash | Non-special URL's hash
---------|--------------------|----------------------
Chrome   | `#abcxyz`          | `#abc%00xyz`
Edge, IE | `#abc`             | `#abc`
Firefox  | `#abc%00xyz`       | `#abc%00xyz`
Safari   | `#abc%00xyz`       | `#abc%00xyz`

So only Chrome removes NULLs, and only in special URLs. In Edge and IE, a NULL character
denotes the end of the string.

I found one reason why Chrome removes NULLs in its source code:
https://chromium.googlesource.com/chromium/src/+/refs/tags/76.0.3803.1/url/url_canon_etc.cc#304
```c++
  for (int i = ref.begin; i < end; i++) {
    if (spec[i] == 0) {
      // IE just strips NULLs, so we do too.
      continue;
    }
```

But as the table above shows, Chrome and IE (and Edge) actually handle NULLs differently.

I suggest following Firefox and Safari and percent-encoding NULLs in fragments.
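
For comparison, here is a minimal sketch of the three behaviors from the table, applied to a fragment that has already been split off from the URL. The function names `StripNulls`, `TruncateAtNull` and `PercentEncodeNulls` are hypothetical; none of this is taken from a browser's source or from the URL Standard, and it handles only the NULL character, not the rest of the fragment percent-encode set.

```c++
#include <string>
#include <string_view>

// Hypothetical helpers for illustration only; they are not part of any
// browser's code base or of the URL Standard.

// Chrome's behavior for special URLs: every NULL is dropped.
std::string StripNulls(std::string_view fragment) {
  std::string out;
  for (char c : fragment) {
    if (c != '\0') out.push_back(c);
  }
  return out;
}

// Edge and IE behavior: the first NULL ends the string.
std::string TruncateAtNull(std::string_view fragment) {
  // find() returns npos when there is no NULL, so substr() keeps everything.
  return std::string(fragment.substr(0, fragment.find('\0')));
}

// Firefox and Safari behavior (the one suggested here): each NULL
// is replaced with "%00".
std::string PercentEncodeNulls(std::string_view fragment) {
  std::string out;
  for (char c : fragment) {
    if (c == '\0') {
      out += "%00";
    } else {
      out.push_back(c);
    }
  }
  return out;
}
```

Given the 7-character input `abc\0xyz` (passed with an explicit length so that the embedded NULL is kept), the three functions produce `abcxyz`, `abc` and `abc%00xyz` respectively. Only the last one keeps the NULL recoverable from the serialized URL.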

-- 
View this issue on GitHub:
https://github.com/whatwg/url/issues/440

Received on Thursday, 23 May 2019 16:19:52 UTC