Re: [whatwg/url] Decoding in paths (#339)

A stored host cannot contain non-ASCII characters anyway, so a percent-encoded host ends up in one of three ways after parsing: as plain ASCII, as Punycode, or rejected as invalid. For example, `github%2ecom` is parsed and stored as `github.com`, `%E4%B8%AD%E6%96%87.com` (`中文.com`) is stored as `xn--fiq228c.com`, and `github%00.com` is an invalid host. Because a stored host is guaranteed to contain only printable ASCII characters, the serialization algorithm does not need another percent-encoding step.
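
To make the three outcomes concrete, here is a rough Python sketch. It is not the spec's host parser: `parse_host_sketch` is a hypothetical name, the forbidden set is a hand-copied subset of the standard's forbidden host code points, and Python's built-in `idna` codec (IDNA 2003) stands in for the standard's domain-to-ASCII step.

```python
from urllib.parse import unquote

# Assumption: simplified, hand-copied subset of the forbidden host code points.
FORBIDDEN_HOST_CODE_POINTS = set("\x00\t\n\r #/:<>?@[\\]^|")


def parse_host_sketch(raw: str) -> str:
    """Percent-decode a host, reject forbidden code points, return ASCII."""
    decoded = unquote(raw)  # percent-decode, interpreting the bytes as UTF-8
    if any(c in FORBIDDEN_HOST_CODE_POINTS for c in decoded):
        raise ValueError(f"invalid host: {raw!r}")
    # Stand-in for domain-to-ASCII: non-ASCII labels become Punycode.
    return decoded.encode("idna").decode("ascii").lower()


print(parse_host_sketch("github%2ecom"))
print(parse_host_sketch("%E4%B8%AD%E6%96%87.com"))
try:
    parse_host_sketch("github%00.com")
except ValueError as err:
    print(err)
```

Whatever this function returns is already printable ASCII, which is why a serializer working from the stored value has nothing left to encode.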

Paths are percent-encoded during parsing and stored in that encoded form, so no encoding is needed when serializing them either. See [path state](https://url.spec.whatwg.org/#path-state) step 2.3:

> 3. UTF-8 percent encode *c* using the path percent-encode set, and append the result to *buffer*.
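
As an illustration of that step, here is a minimal Python sketch of UTF-8 percent-encoding against the path percent-encode set. The set below is copied by hand from the current URL Standard and may differ from the 2017 text; the function names are hypothetical.

```python
# Assumption: ASCII members of the path percent-encode set, copied by hand.
# C0 controls and everything above U+007E are handled by the range check.
PATH_PERCENT_ENCODE_SET = set(' "#<>?^`{}')


def utf8_percent_encode(c: str) -> str:
    """Percent-encode one code point if it falls in the path set."""
    cp = ord(c)
    if cp <= 0x1F or cp > 0x7E or c in PATH_PERCENT_ENCODE_SET:
        return "".join(f"%{b:02X}" for b in c.encode("utf-8"))
    return c


def encode_path_segment(segment: str) -> str:
    """Apply the per-code-point step to a whole path segment."""
    return "".join(utf8_percent_encode(c) for c in segment)


print(encode_path_segment("a b"))
print(encode_path_segment("中文"))
```

Since `%` is not in the set, running the function on its own output changes nothing, which matches the point that the stored path can be serialized as-is.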

https://github.com/whatwg/url/issues/339#issuecomment-316886429

Received on Friday, 21 July 2017 02:42:48 UTC