Re: [whatwg/url] Use different percent-encode sets for simple/non-simple scheme fragment encoding (#597) from Timothy Gu on 2021-05-05 (public-webapps-github@w3.org from May 2021)

From: Timothy Gu <notifications@github.com>
Date: Wed, 05 May 2021 01:35:54 -0700
To: whatwg/url <url@noreply.github.com>
Cc: Subscribed <subscribed@noreply.github.com>
Message-ID: <whatwg/url/issues/597/832515214@github.com>

I tested some other non-browser URL parsers and here's what I found:

* Node.js's [legacy URL parser](https://nodejs.org/docs/latest-v16.x/api/url.html#url_legacy_url_api) follows the current spec
* Ruby's [URI](https://ruby-doc.org/stdlib-2.7.2/libdoc/uri/rdoc/URI.html) follows the current spec; in fact it doesn't even parse `# <>` or `#%20<>`
* Go's [net/url](https://golang.org/pkg/net/url/) follows the current spec
* curl's [URL API](https://everything.curl.dev/libcurl/url) always encodes the space in the fragment, but leaves `<>` and `%3C%3E` untouched in their original encoding (exactly the same as Firefox)

Notably, none of these parsers distinguish between special and non-special schemes.
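
To make the two behaviors in the list concrete, here is a minimal Python sketch. It assumes "follows the current spec" means applying the URL Standard's fragment percent-encode set (C0 controls, code points above U+007E, plus space, `"`, `<`, `>`, and `` ` ``), and it models the curl/Firefox behavior as "encode the space only". The helper names `encode_fragment` and `in_c0_control_set` are hypothetical, and this is not how any of the parsers above are actually implemented.

```python
from typing import AbstractSet

# Extra code points in the URL Standard's fragment percent-encode set,
# on top of the C0 control percent-encode set.
SPEC_FRAGMENT_EXTRAS = frozenset({" ", '"', "<", ">", "`"})

# Assumed model of the curl/Firefox behavior described above: only the
# space is encoded, "<" and ">" are left alone.
SPACE_ONLY_EXTRAS = frozenset({" "})


def in_c0_control_set(ch: str) -> bool:
    """C0 controls (U+0000..U+001F) and everything above U+007E."""
    return ord(ch) <= 0x1F or ord(ch) > 0x7E


def encode_fragment(fragment: str, extras: AbstractSet[str]) -> str:
    """Percent-encode each character that falls in the chosen set.

    "%" itself is in neither set, so existing escapes such as "%3C"
    pass through unchanged.
    """
    out = []
    for ch in fragment:
        if in_c0_control_set(ch) or ch in extras:
            out.extend(f"%{b:02X}" for b in ch.encode("utf-8"))
        else:
            out.append(ch)
    return "".join(out)


spec = encode_fragment(" <>", SPEC_FRAGMENT_EXTRAS)     # "%20%3C%3E"
space_only = encode_fragment(" <>", SPACE_ONLY_EXTRAS)  # "%20<>"
```

Under both sets an already-escaped input such as `%3C%3E` comes back unchanged, which matches the "original encoding untouched" observation for curl.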

Here are the relevant places in Firefox and Chrome that encode the fragment (I think):

- Firefox: https://searchfox.org/mozilla-central/rev/6371054f6260a5f8844846439297547f7cfeeedd/netwerk/base/nsSimpleURI.cpp#444-445
- Chrome: https://source.chromium.org/chromium/chromium/src/+/main:url/url_canon_pathurl.cc;l=65-75;drc=b0dcfd9b5f7d30aaa9b4789e4b4bb15750eca264

-- 
https://github.com/whatwg/url/issues/597#issuecomment-832515214

Received on Wednesday, 5 May 2021 08:36:07 UTC