- From: Alwin Blok <notifications@github.com>
- Date: Fri, 14 Feb 2025 23:51:30 -0800
- To: whatwg/url <url@noreply.github.com>
- Cc: Subscribed <subscribed@noreply.github.com>
- Message-ID: <whatwg/url/issues/855@github.com>
alwinb created an issue (whatwg/url#855)

### What is the issue with the URL Standard?

This is a proposal to include a table, either as a clarification or (my preference) as a full replacement, for describing:

* Percent-encode sets,
* Valid vs. invalid individual code points per component, and
* Error-correction behaviour for the above,

all within a single small-ish table.

For each component of a URL that contains a percent-encoded string, we can describe _per code point_ its validity, error correction and encoding. A single code point is one of:

- v: Valid, and included verbatim in the output URL.
- E: (Escape) Valid, but nonetheless percent-encoded.
- T: (Tolerate) Invalid, but nonetheless left untouched by the parser, resulting in an invalid URL as output.
- F: (Fixed) Invalid, and fixed by the parser (and setters) by percent-encoding the occurrence.
- R: (Reject) Invalid, and causing a hard error, so that it does not end up in output URLs.

<img width="603" alt="Image" src="https://github.com/user-attachments/assets/31303696-587a-4aa1-a726-c34c01dc753a" />

Notes:

- 'Other control' here is control-c0 ∪ del-c1 ∪ surrogate ∪ non-char.
- The apostrophe in the query is special-cased for 'non-special' URLs, where it is left untouched (i.e. v: Valid), hence the superscript. The special query could also be broken out into a separate column.

(If there have been changes to these sets in the last year or so, the table might be slightly out of date.)
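
To make the five categories above concrete, here is a rough Python sketch of how a parser might act on a per-code-point classification. It is an illustration only: the names `Action`, `process` and `toy_special_query_classify` are hypothetical, and the toy classifier is not the table from the image. It covers just three code points that I believe match special-query behaviour and, as a simplification, treats everything else as valid.

```python
from enum import Enum


class Action(Enum):
    VALID = "v"      # valid, copied verbatim
    ESCAPE = "E"     # valid, but percent-encoded anyway
    TOLERATE = "T"   # invalid, copied verbatim (output stays invalid)
    FIXED = "F"      # invalid, repaired by percent-encoding
    REJECT = "R"     # invalid, hard error


def percent_encode(ch):
    # Percent-encode every UTF-8 byte of a single code point.
    return "".join(f"%{b:02X}" for b in ch.encode("utf-8"))


def process(text, classify):
    """Apply a per-code-point classification to one URL component.

    Returns (output, input_valid, output_valid).
    Raises ValueError on the first rejected code point.
    """
    out = []
    input_valid = True
    output_valid = True
    for ch in text:
        action = classify(ch)
        if action is Action.REJECT:
            raise ValueError(f"rejected code point U+{ord(ch):04X}")
        if action in (Action.TOLERATE, Action.FIXED):
            input_valid = False
        if action is Action.TOLERATE:
            output_valid = False
        if action in (Action.ESCAPE, Action.FIXED):
            out.append(percent_encode(ch))
        else:
            out.append(ch)
    return "".join(out), input_valid, output_valid


def toy_special_query_classify(ch):
    # Hypothetical, deliberately incomplete stand-in for one table column.
    toy = {
        " ": Action.FIXED,     # assumed: invalid, percent-encoded
        "'": Action.ESCAPE,    # assumed: valid, encoded in special queries
        "|": Action.TOLERATE,  # assumed: invalid, left untouched
    }
    return toy.get(ch, Action.VALID)  # simplification, not the real table


result = process("a b'c|d", toy_special_query_classify)
```

With a full table, `classify` would be a lookup keyed on component and code-point class rather than a three-entry dictionary.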
Received on Saturday, 15 February 2025 07:51:34 UTC