Re: [whatwg/url] Need an "unreserved" character set (and better define how to percent-encode arbitrary strings) (#369)

My personal take is that we should work to solve these issues in order of most practical to most theoretical, with corresponding urgency. Here I am using @mgiuca's enumeration in https://github.com/whatwg/url/issues/369#issuecomment-359614523. So I would suggest:

1. We should solve whatwg/html#3377 first, ASAP, since it's a mismatch with browsers and a potential interop problem. In doing so we'll define a new encode set, either in HTML or URL.
2. Then we can move the encode set into URL (if it's not there already), and give more general advice along the lines of "when inserting arbitrary strings into URLs, use this encode set".
3. Then we can delve into issues around URL processors (like servers), URL renderers (like browser URL bars), URL equivalence, and what their relationship to encoding should be. I don't feel like I understand this space well enough yet, but I think re-reading @mgiuca's comments would give me ideas for concrete steps here. Since it's a tricky area, we should leave it for last, once we've laid the right foundation.
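
To make the advice in (2) concrete, here is a rough sketch of what "use this encode set" could look like for someone inserting an arbitrary string into a URL. The set used here is only a stand-in: it leaves the RFC 3986 "unreserved" characters alone and percent-encodes every other UTF-8 byte, which is not necessarily the set we'd end up defining. The names `UNRESERVED` and `percent_encode` are made up for illustration.

```python
# Stand-in for the encode set: RFC 3986 "unreserved" characters are left
# as-is; every other byte is percent-encoded. The set the standard ends up
# defining may well differ from this one.
UNRESERVED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)


def percent_encode(text: str, keep: frozenset = UNRESERVED) -> str:
    """Percent-encode the UTF-8 bytes of `text`, except characters in `keep`."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch in keep:
            out.append(ch)
        else:
            # Uppercase hex, two digits per byte.
            out.append(f"%{byte:02X}")
    return "".join(out)
```

With that, something like `"https://example.com/search?q=" + percent_encode("a&b=c")` puts `a%26b%3Dc` in the query, so the `&` and `=` in the inserted string can't be mistaken for URL syntax.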

(3) is the only area where I think we would make normative changes to URL parsing, based on the principle that URLs should be equivalent if and only if they parse the same. (Which IMO is a very good principle.)

@mgiuca, @annevk, does this make any sense as an approach? Although I suppose @annevk has already pinged the other implementers for their take on (3), so maybe we're just going straight for solving everything at once :)

In general I appreciate @mgiuca's thinking about how this standard applies in a larger context, and think we should definitely work on incorporating such suggestions.

https://github.com/whatwg/url/issues/369#issuecomment-359698527

Received on Tuesday, 23 January 2018 07:21:12 UTC