- From: Boris Zbarsky <notifications@github.com>
- Date: Wed, 05 Apr 2017 14:04:48 -0700
- To: whatwg/url <url@noreply.github.com>
- Cc: Subscribed <subscribed@noreply.github.com>
- Message-ID: <whatwg/url/issues/291@github.com>
The standard way, going back to at least the mid-90s, to mark up URLs in text is `<url>`. This, of course, relies on unescaped `>` not being allowed in URLs. This is clearly stated, with exactly this rationale, in RFC 1738 section 2.2.

The URL standard should have similar provisions. I don't know what that should mean for URL _parsing_, but in terms of serialization `>` should always be escaped in URLs, imo.

I just tested browser behavior, and:

* Firefox consistently escapes `>` in path, userinfo, query, fragment. `>` in host or port causes parsing failure.
* Safari escapes `>` in path, userinfo, query. It allows `>` unchanged in host and fragment. `>` in port causes parsing failure.
* Chrome escapes `>` in path, userinfo, query, host. It allows `>` unchanged in fragment. `>` in port causes parsing failure.
* Edge escapes `>` in path and host. It allows `>` unchanged in fragment and query. `>` in port causes parsing failure. Presence of userinfo causes parsing failure no matter what.

Testcase used:

    <pre><script>
    // One input per URL component: host, userinfo, fragment, query, port, path.
    var strs = [
      "http://test>test/foo\\bar",
      "http://a>b@test/foo\\bar",
      "http://test/foo\\bar/#a>b",
      "http://test/foo\\bar/?a=c>d",
      "http://test:2>3/foo\\bar",
      "http://test/foo>bar\\baz",
    ];
    for (var str of strs) {
      // Serialization via the href attribute getter.
      var a = document.createElement("a");
      a.setAttribute("href", str);
      var href;
      try { href = a.href; } catch(e) { href = "href getter threw"; }
      // Serialization via the URL constructor.
      var url;
      try { url = (new URL(str).href); } catch(e) { url = "constructor threw"; }
      document.writeln(str, " -- ", href, " -- ", url);
    }
    </script>

The `\\` bits are in there as a way to tell whether parsing failed in the href case: on a successful parse the backslash is serialized as `/`, while on failure the getter returns the input unchanged.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/url/issues/291
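A minimal sketch of the `<url>` delimiting concern described above, assuming a naive extractor that treats the first `>` after `<` as the closing delimiter. The helper names `extractDelimitedUrls` and `escapeGreaterThan` are hypothetical and do not come from the URL standard or any browser API; the sketch only illustrates why a raw `>` in a serialized URL is a problem.

```javascript
// Hypothetical helper: collect every <...>-delimited URL found in plain text.
function extractDelimitedUrls(text) {
  const urls = [];
  // "<", then any run of characters other than ">", then the closing ">".
  const pattern = /<([^>]*)>/g;
  let match;
  while ((match = pattern.exec(text)) !== null) {
    urls.push(match[1]);
  }
  return urls;
}

// Hypothetical helper: percent-encode ">" so the URL is safe inside <...>.
function escapeGreaterThan(serializedUrl) {
  return serializedUrl.replace(/>/g, "%3E");
}

const raw = "http://test/foo>bar";
const unsafeText = "See <" + raw + "> for details.";
const safeText = "See <" + escapeGreaterThan(raw) + "> for details.";

console.log(extractDelimitedUrls(unsafeText)); // URL is cut at the raw ">"
console.log(extractDelimitedUrls(safeText));   // URL survives intact
```

With the raw `>`, the extractor stops at `http://test/foo`; with `%3E` in its place, the full URL is recovered.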
Received on Wednesday, 5 April 2017 21:05:22 UTC