- From: Matt Giuca <notifications@github.com>
- Date: Wed, 17 Jun 2020 00:46:40 -0700
- To: whatwg/url <url@noreply.github.com>
- Cc: Subscribed <subscribed@noreply.github.com>
- Message-ID: <whatwg/url/pull/518/review/432128092@github.com>
@mgiuca approved this pull request.
LGTM.
I'm super happy with this change!! Thanks for taking the time to rework the algorithm based on @rmisev's comments and mine.
Note: I still have not had time to go through this with a fine-tooth comb and check that the members of the query percent-encode set are exactly right (for compatibility with the old version), but I checked over the rest of the changes. I trust you :)
>
<li><p>Let <var>output</var> be the empty string.</p></li>
- <li><p>For each <var>byte</var> of <var>bytes</var>, <a for=byte>percent-encode</a>
- <var>byte</var> and append the result to <var>output</var>.
+ <li>
+ <p>For each <var>byte</var> of <var>bytes</var>:
+
+ <ol>
+ <li><p>Let <var>isomorph</var> be a <a for=/>code point</a> whose <a for="code point">value</a>
+ is <var>byte</var>'s <a for=byte>value</a>.
+
+ <li><p>Assert: <var>percentEncodeSet</var> includes all non-<a>ASCII code points</a>.
Nit: Link to [Assert](https://infra.spec.whatwg.org/#assert)
> + <var>input</var>, and the <a>userinfo percent-encode set</a>
+ <td>U+0020
+ <td>"<code>%20</code>"
+ <tr>
+ <td>U+2261 (≡)
+ <td>"<code>%81%DF</code>"
+ <tr>
+ <td>U+203D (‽)
+ <td>"<code>%26%238253%3B</code>"
+ <tr>
+ <td><a for=string>Percent-encode after encoding</a> with <a>Shift_JIS</a>, <var>input</var>, the
+ <a>userinfo percent-encode set</a>, and true
+ <td>"<code>1+1 ≡ 2%20‽</code>"
+ <td>"<code>1+1+%81%DF+2%20%26%238253%3B</code>"
+ <tr>
+ <td rowspan=2><a for="code point">UTF-8 percent-encode</a> <var>input</var> using the
Nit: Perhaps move the UTF-8 examples above the Shift_JIS ones, since UTF-8 is by far the more common case.
> @@ -246,9 +305,28 @@ a <var>percentEncodeSet</var>, run these steps:
<td>"<code>‽%25%2E</code>"
<td>0xE2 0x80 0xBD 0x25 0x2E
<tr>
- <td><a for="code point">UTF-8 percent-encode</a> <var>input</var> using the
+ <td rowspan=3><a for="code point">Percent-encode after encoding</a> with <a>Shift_JIS</a>,
I'd like to see @rmisev's example added here, since it's a key case. All of the examples currently present would give the same result in your old algorithm and your new one. Your fixed algorithm differs only in the special case where some of the bytes that encode the code point are < 128, which seems to happen in ISO-2022-JP but not in UTF-8 or Shift_JIS.
So, the example is:
- Operation: Percent-encode after encoding with ISO-2022-JP, _input_, and the userinfo percent-encode set
- Input: U+00A5 (¥)
- Output: "`%1B(J\%1B(B`"
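
To make the special case concrete, here is a rough Python sketch of the per-byte loop as I read it from the diff. Several things are assumptions on my part: the function name `percent_encode_bytes` is hypothetical, the branch after the assert (encode the byte if it is non-ASCII or its isomorph is in the set, otherwise append the isomorph) is not quoted above, and `ILLUSTRATIVE_SET` is a made-up set of C0 controls, space, and DEL, not the userinfo percent-encode set. The input bytes are hard-coded from the example output rather than produced by a real ISO-2022-JP encoder.

```python
def percent_encode_bytes(data: bytes, encode_set: frozenset) -> str:
    """Sketch of the per-byte step; not the spec's exact text."""
    output = ""
    for byte in data:
        # "isomorph" is the code point whose value equals the byte's value.
        isomorph = chr(byte)
        # Assumed branch: non-ASCII bytes are always percent-encoded,
        # ASCII bytes only when their isomorph is in the set.
        if byte >= 0x80 or isomorph in encode_set:
            output += f"%{byte:02X}"
        else:
            output += isomorph
    return output


# Hypothetical set for illustration only: C0 controls, space, and DEL.
ILLUSTRATIVE_SET = frozenset(chr(c) for c in range(0x20)) | {" ", "\x7f"}

# Assumed ISO-2022-JP bytes for U+00A5: ESC ( J, 0x5C, ESC ( B.
YEN_ISO_2022_JP = b"\x1b(J\\\x1b(B"

print(percent_encode_bytes(YEN_ISO_2022_JP, ILLUSTRATIVE_SET))
print(percent_encode_bytes("\u203d".encode("utf-8"), ILLUSTRATIVE_SET))
```

Every byte in the ISO-2022-JP sequence is below 128, so each one is checked against the set individually, whereas the UTF-8 bytes for U+203D are all non-ASCII and are always encoded.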
--
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/url/pull/518#pullrequestreview-432128092
Received on Wednesday, 17 June 2020 07:46:53 UTC