[whatwg/url] javascript: URL parsing (#307)

Noticed via https://github.com/tmpvar/jsdom/issues/1836 by @tsirolnik

Given

```
const s = 'javascript:window.location.replace("https://whatsoever.com/?a=b&c=5&x=y")';
```

then there are two separate problems:

- `new URL(s).pathname` gives `window.location.replace("https://whatsoever.com/`; everything after the `?` ends up in the query. This seems unexpected, but at least not fatal.
- `new URL(s).href` gives `javascript:window.location.replace("https://whatsoever.com/?a=b&c=5&x=y%22)`, i.e. the part of the string that was parsed as a query gets percent-encoded, which turns the closing quote into `%22`.
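Both results can be reproduced with a short script. This is a minimal sketch, assuming an environment with a global `URL` class that follows the URL Standard (a recent Node.js or a browser); the variable name `url` is only for illustration.

```javascript
// Assumes a global, spec-compliant URL class (recent Node.js or a browser).
const s = 'javascript:window.location.replace("https://whatsoever.com/?a=b&c=5&x=y")';
const url = new URL(s);

// Everything up to the first "?" is treated as the path.
console.log(url.pathname);

// The remainder is parsed as a query, where the double quote is percent-encoded.
console.log(url.search);

// The serialization therefore no longer round-trips the input string.
console.log(url.href);
console.log(url.href === s);
```

Comparing `url.href` with `s` is a quick way to check whether a given `javascript:` string survives parsing unchanged.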

The second of these is especially bad, because per [HTML's navigate algorithm](https://html.spec.whatwg.org/#navigate), `javascript:` URLs are executed by serializing them to a string and then stripping the leading `javascript:`. The script source would then contain `%22` where the closing quote used to be.
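To illustrate the consequence, here is a rough sketch that follows only the two steps described above (serialize, then strip the prefix). It is not the navigate algorithm itself: `scriptSourceFor` and `parses` are hypothetical helpers, and any further processing the real algorithm may perform is left out. `new Function` is used only to check whether the source compiles; nothing is executed.

```javascript
const s = 'javascript:window.location.replace("https://whatsoever.com/?a=b&c=5&x=y")';

// Hypothetical helper: serialize the parsed URL, then strip the scheme prefix.
function scriptSourceFor(input) {
  const serialized = new URL(input).href;
  return serialized.slice('javascript:'.length);
}

// Hypothetical helper: compile the source without running it,
// and report whether it is syntactically valid.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

const intended = s.slice('javascript:'.length); // what the author wrote
const actual = scriptSourceFor(s);              // what survives URL parsing

console.log(actual);
console.log(parses(intended));
console.log(parses(actual));
```

Under these assumptions the string literal in `actual` is never closed, because its closing quote was replaced by `%22`, so the source no longer compiles.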

I think `javascript:` URLs need to be special-cased in the URL parser, unfortunately.

I suspect a very similar bug report could apply to `data:` URLs, although I just rediscovered today that `data:` URLs are very underspecified, per https://simonsapin.github.io/data-urls/, so maybe that's a separate can of worms...

https://github.com/whatwg/url/issues/307

Received on Wednesday, 3 May 2017 15:50:11 UTC