Re: [whatwg/url] Editorial: make everything use percent-encode sets (#518)

@mgiuca approved this pull request.

LGTM.

I'm super happy with this change!! Thanks for taking the time to rework the algorithm based on @rmisev's and my comments.

Note: I still have not had time to go through this with a fine-tooth comb and check that the members of the query percent-encode set are exactly right (for compatibility with the old version), but I have checked over the rest of the changes. I trust you :)

>  
  <li><p>Let <var>output</var> be the empty string.</p></li>
 
- <li><p>For each <var>byte</var> of <var>bytes</var>, <a for=byte>percent-encode</a>
- <var>byte</var> and append the result to <var>output</var>.
+ <li>
+  <p>For each <var>byte</var> of <var>bytes</var>:
+
+  <ol>
+   <li><p>Let <var>isomorph</var> be a <a for=/>code point</a> whose <a for="code point">value</a>
+   is <var>byte</var>'s <a for=byte>value</a>.
+
+   <li><p>Assert: <var>percentEncodeSet</var> includes all non-<a>ASCII code points</a>.

Nit: Link to [Assert](https://infra.spec.whatwg.org/#assert)

> +   <var>input</var>, and the <a>userinfo percent-encode set</a>
+   <td>U+0020
+   <td>"<code>%20</code>"
+  <tr>
+   <td>U+2261 (≡)
+   <td>"<code>%81%DF</code>"
+  <tr>
+   <td>U+203D (‽)
+   <td>"<code>%26%238253%3B</code>"
+  <tr>
+   <td><a for=string>Percent-encode after encoding</a> with <a>Shift_JIS</a>, <var>input</var>, the
+   <a>userinfo percent-encode set</a>, and true
+   <td>"<code>1+1 ≡ 2%20‽</code>"
+   <td>"<code>1+1+%81%DF+2%20%26%238253%3B</code>"
+  <tr>
+   <td rowspan=2><a for="code point">UTF-8 percent-encode</a> <var>input</var> using the

Nit: Perhaps move the UTF-8 examples up above the Shift_JIS ones, since UTF-8 is by far the more common case.
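
For readers who want to check the Shift_JIS rows in the hunk above, here is a minimal Python sketch of "percent-encode after encoding". It is not the spec's algorithm text. The assumptions: Python's `shift_jis` codec stands in for the Encoding Standard's Shift_JIS encoder (their mappings differ for a few code points), `USERINFO_ASCII` is an approximation of the ASCII part of the userinfo percent-encode set, and encoding one code point at a time is only valid for a stateless encoding such as Shift_JIS. The function and constant names are hypothetical.

```python
# Approximation of the ASCII members of the userinfo percent-encode set
# (assumption: transcribed for illustration, check against the spec).
USERINFO_ASCII = set(' "#<>?`{}/:;=@[\\]^|')


def in_set(value):
    """True if the code point with this value is in the (approximated) set.

    C0 controls and everything above U+007E are always members, which also
    satisfies the assert that the set includes all non-ASCII code points.
    """
    return value < 0x20 or value > 0x7E or chr(value) in USERINFO_ASCII


def percent_encode_after_encoding(text, codec="shift_jis", space_as_plus=False):
    out = []
    for ch in text:
        if space_as_plus and ch == " ":
            out.append("+")
            continue
        try:
            # Stand-in for the Encoding Standard encoder (assumption).
            encoded = ch.encode(codec)
        except UnicodeEncodeError:
            # Unencodable code point: percent-encoded form of "&#NNNN;".
            out.append("%26%23{}%3B".format(ord(ch)))
            continue
        for byte in encoded:
            if in_set(byte):
                out.append("%{:02X}".format(byte))
            else:
                # The isomorph of the byte is appended unescaped.
                out.append(chr(byte))
    return "".join(out)


print(percent_encode_after_encoding("1+1 \u2261 2%20\u203d", space_as_plus=True))
```

Under these assumptions the sketch reproduces the four Shift_JIS outputs quoted in the table rows above.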

> @@ -246,9 +305,28 @@ a <var>percentEncodeSet</var>, run these steps:
    <td>"<code>‽%25%2E</code>"
    <td>0xE2 0x80 0xBD 0x25 0x2E
   <tr>
-   <td><a for="code point">UTF-8 percent-encode</a> <var>input</var> using the
+   <td rowspan=3><a for="code point">Percent-encode after encoding</a> with <a>Shift_JIS</a>,

I'd like to see @rmisev's example added here, since it's a key case: all of the examples currently present would work in both your old algorithm and your new one. Your fixed algorithm differs only in the special case where some of the bytes that encode the code point are < 128, which seems to happen in ISO-2022-JP but not in UTF-8 or Shift_JIS.

So, the example is:
- Operation: Percent-encode after encoding with ISO-2022-JP, _input_, and the userinfo percent-encode set
- Input: U+00A5 (¥)
- Output: "`%1B(J\%1B(B`"
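
To make the "bytes < 128" case concrete, here is a small Python sketch of the per-byte step only. The byte string is hard-coded to the ISO-2022-JP encoding of U+00A5 (escape to JIS X 0201 Roman, 0x5C, escape back to ASCII) so that no codec is involved. `percent_encode_bytes` and the `ascii_members` parameter are hypothetical names, and the sets passed in are illustrative rather than any set defined by the spec.

```python
# ISO-2022-JP bytes for U+00A5: ESC ( J, 0x5C, ESC ( B
YEN_ISO_2022_JP = b"\x1b(J\\\x1b(B"


def percent_encode_bytes(data, ascii_members):
    """Percent-encode each byte whose isomorph is in the set.

    The set is modelled as: C0 controls, everything above U+007E, plus the
    printable ASCII characters listed in ascii_members (hypothetical model).
    """
    out = []
    for byte in data:
        isomorph = chr(byte)
        if byte < 0x20 or byte > 0x7E or isomorph in ascii_members:
            out.append("%{:02X}".format(byte))
        else:
            out.append(isomorph)
    return "".join(out)


# A set with no printable ASCII members leaves "(", "J", "\" and "B" as-is.
print(percent_encode_bytes(YEN_ISO_2022_JP, set()))
# Adding "(" to the set escapes both escape-sequence bytes 0x28.
print(percent_encode_bytes(YEN_ISO_2022_JP, {"("}))
```

Every byte here is below 0x80, so each one is looked up in the set individually, and the result depends on exactly which ASCII code points the set contains.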

-- 
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/url/pull/518#pullrequestreview-432128092

Received on Wednesday, 17 June 2020 07:46:53 UTC