[whatwg/url] Change query state slightly to better deal with non-UTF-8 encodings (#386)

If the URL parser is invoked with a non-UTF-8 encoding and its input contains code points that the encoding cannot represent, those code points end up in the query as numeric character references of the form &#...;.

The problem is that &, #, and ; are also URL separators, but the previous algorithm would only encode #. This ensures that & and ; are also encoded, as some browsers already do, but only if they came about as the result of the encode operation.
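
To illustrate the difference, here is a minimal Python sketch and not the specification text. The function names `encode_query_previous` and `encode_query_proposed` are hypothetical. The byte ranges are an assumption based on the query state's percent-encoding rules, and Python's `windows-1252` codec only approximates the Encoding Standard's table for that encoding.

```python
def _percent_encode_bytes(data: bytes) -> str:
    """Percent-encode bytes roughly as the query state does (assumed set)."""
    out = []
    for b in data:
        # Assumed set: below 0x21, above 0x7E, or one of " # < >
        if b < 0x21 or b > 0x7E or b in (0x22, 0x23, 0x3C, 0x3E):
            out.append(f"%{b:02X}")
        else:
            out.append(chr(b))
    return "".join(out)


def encode_query_previous(query: str, encoding: str) -> str:
    """Previous behaviour: unencodable code points become &#N; and only
    the # is percent-encoded afterwards."""
    data = query.encode(encoding, errors="xmlcharrefreplace")
    return _percent_encode_bytes(data)


def encode_query_proposed(query: str, encoding: str) -> str:
    """Proposed behaviour: the &, # and ; produced by the encode step are
    percent-encoded too, while literal & and ; in the input are kept."""
    out = []
    for ch in query:
        try:
            data = ch.encode(encoding)
        except UnicodeEncodeError:
            # Code point outside the encoding: emit %26%23 + decimal + %3B
            out.append(f"%26%23{ord(ch)}%3B")
            continue
        out.append(_percent_encode_bytes(data))
    return "".join(out)


if __name__ == "__main__":
    sample = "a=\u2603&b=1"  # U+2603 is not representable in windows-1252
    print(encode_query_previous(sample, "windows-1252"))
    print(encode_query_proposed(sample, "windows-1252"))
```

In the first result the generated & and ; are indistinguishable from real separators, which is the ambiguity this change removes. The & that was already present between the two parameters is left alone in both cases.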

Tests: [we need to make a number of test changes for this]
You can view, comment on, or merge this pull request online at:

  https://github.com/whatwg/url/pull/386

-- Commit Summary --

  * Editorial: avoid setting encoding multiple times
  * Change query state slightly to better deal with non-UTF-8 encodings

-- File Changes --

    M url.bs (69)

-- Patch Links --

https://github.com/whatwg/url/pull/386.patch
https://github.com/whatwg/url/pull/386.diff

Received on Wednesday, 9 May 2018 08:56:18 UTC