Re: [whatwg/url] [editorial] Rephrase encoding note to make the implications clearer. (PR #804) from Anne van Kesteren on 2025-03-25 (public-webapps-github@w3.org from March 2025)

From: Anne van Kesteren <notifications@github.com>
Date: Tue, 25 Mar 2025 00:37:11 -0700
To: whatwg/url <url@noreply.github.com>
Cc: Subscribed <subscribed@noreply.github.com>
Message-ID: <whatwg/url/pull/804/review/2712695931@github.com>

@annevk commented on this pull request.



> @@ -2038,8 +2038,9 @@ and <a>code points</a> in the range U+00A0 to U+10FFFD, inclusive, excluding <a>
 <!-- IRI also excludes the ranges U+E000 to U+F8FF, U+FFF0 to U+FFFD, and U+E0000 to U+E09FF, all
      inclusive. We don't to align with HTML. -->
 
-<p class=note>Code points greater than U+007F DELETE will be converted to
-<a lt="percent-encoded byte">percent-encoded bytes</a> by the <a>URL parser</a>.
+<p class=note>For historical reasons, rather than storing codepoints and [=byte/percent-encoding=]
+to ASCII for serialization, URLs instead store their value as ASCII internally, eagerly converting
+code points greater than U+007F DELETE to [=percent-encoded bytes=] during [=URL parser|parsing=].

Hmm. 1) It's not for "historical reasons". 2) This section is really about writing URLs; it isn't about their internal representation at all. That's covered in section 4.1, which already makes it clear that most components are ASCII strings.
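
For readers following along, the behavior the note describes can be observed through the `URL` API. This is a minimal sketch, assuming a runtime that exposes the WHATWG `URL` global (a browser, Node.js, or Deno). The helper name `serializedPath` and the `example.com` input are made up for illustration and do not come from the pull request.

```typescript
// Hypothetical helper for illustration: parse an input string and return
// the path as the parser stored it.
function serializedPath(input: string): string {
  return new URL(input).pathname;
}

// "é" is U+00E9, which is greater than U+007F, so the parser
// UTF-8 encodes it (bytes C3 A9) and percent-encodes each byte.
const parsed = new URL("https://example.com/é?q=é");

console.log(serializedPath("https://example.com/é")); // "/%C3%A9"
console.log(parsed.search);                           // "?q=%C3%A9"
console.log(parsed.href);                             // "https://example.com/%C3%A9?q=%C3%A9"
```

The input string contains a non-ASCII code point, yet every component read back from the parsed URL is an ASCII string.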

-- 
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/url/pull/804#discussion_r2011489073

Received on Tuesday, 25 March 2025 07:37:15 UTC