[whatwg/url] Forbidden host code points (#214)

In https://github.com/whatwg/url/pull/185#issuecomment-274251275 @achristensen07 complained about the host code point restrictions I was adding for opaque hosts. I figured I'd explain the rules for code point restrictions in hosts in general once here and then we can either decide to agree or fiddle with the specifics.

Currently we have the following restrictions for non-opaque hosts. I listed the justification for each on the right:

* U+0000 (generally problematic)
* U+0009 (stripped when parsing URLs, would create reparsing issues)
* U+000A (stripped when parsing URLs, would create reparsing issues)
* U+000D (stripped when parsing URLs, would create reparsing issues)
* U+0020 (creates copy-and-paste issues, see #125)
* "#" (would create reparsing issues)
* "%" (would create reparsing issues due to host percent-decoding)
* "/" (would create reparsing issues)
* ":" (would create reparsing issues)
* "?" (would create reparsing issues)
* "@" (would create reparsing issues)
* "[" (would create reparsing issues)
* "\" (would create reparsing issues)
* "]" (would create reparsing issues)
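
To make the list concrete, here is a minimal sketch of the check as a set-membership test. The names `FORBIDDEN_HOST_CODE_POINTS` and `find_forbidden` are made up for illustration and do not come from the spec or any implementation; the set is just the fourteen code points listed above.

```python
# Forbidden code points for non-opaque hosts, transcribed from the list above.
# Hypothetical names; this is an illustration, not the spec's algorithm.
FORBIDDEN_HOST_CODE_POINTS = frozenset(
    "\u0000"   # NULL
    "\u0009"   # TAB
    "\u000A"   # LF
    "\u000D"   # CR
    "\u0020"   # SPACE
    "#%/:?@[\\]"
)


def find_forbidden(host):
    """Return the forbidden code points found in host, in order of appearance."""
    return [c for c in host if c in FORBIDDEN_HOST_CODE_POINTS]


if __name__ == "__main__":
    for candidate in ["example.com", "exa mple.com", "example.com/path"]:
        hits = find_forbidden(candidate)
        verdict = "ok" if not hits else "rejected, contains %r" % hits
        print("%r: %s" % (candidate, verdict))
```

A host parser would fail as soon as `find_forbidden` returns anything; returning the offending code points rather than a boolean is only there to make the output easier to read.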

Now for opaque hosts I took this list and removed the code points that were no longer problematic: "%" (opaque hosts have no percent decoding), "[" (no IPv6), "\" (no special backslash handling), and "]" (no IPv6).
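
As a rough sketch of that derivation, the opaque-host set is just the non-opaque set minus those four code points. The names below (`NON_OPAQUE_FORBIDDEN`, `NO_LONGER_PROBLEMATIC`, `OPAQUE_FORBIDDEN`, `opaque_host_is_acceptable`) are hypothetical and only illustrate the set arithmetic; they are not taken from the spec text.

```python
# Hypothetical names; the sets are transcribed from the lists in this comment.
NON_OPAQUE_FORBIDDEN = frozenset("\u0000\u0009\u000A\u000D\u0020#%/:?@[\\]")

# Code points that stop being problematic for opaque hosts:
# no percent-decoding, no IPv6 literals, no special backslash handling.
NO_LONGER_PROBLEMATIC = frozenset("%[\\]")

OPAQUE_FORBIDDEN = NON_OPAQUE_FORBIDDEN - NO_LONGER_PROBLEMATIC


def opaque_host_is_acceptable(host):
    """True if host contains none of the code points forbidden in opaque hosts."""
    return not any(c in OPAQUE_FORBIDDEN for c in host)


if __name__ == "__main__":
    print(sorted("U+%04X" % ord(c) for c in OPAQUE_FORBIDDEN))
    print(opaque_host_is_acceptable("a%20b[1]"))  # allowed under the relaxed set
    print(opaque_host_is_acceptable("a b"))       # space is still forbidden
```

That leaves ten forbidden code points for opaque hosts: the five C0/space code points plus "#", "/", ":", "?", and "@".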

The reason to be maximally liberal is given in https://github.com/whatwg/url/issues/159#issuecomment-271674428.

@achristensen07 does this help or do you have the same concerns still? And if so, what would you do?


Received on Monday, 23 January 2017 15:34:18 UTC