[whatwg/url] Hoist "forbidden domain code point" check into "domain to ASCII" (Issue #818)

### What is the issue with the URL Standard?

When reading https://url.spec.whatwg.org/#concept-domain-to-ascii in isolation from https://url.spec.whatwg.org/#concept-host-parser (and without first reading ICU4C's uts46.cpp), it's not at all apparent that 1) STD3 rules are really a post-processing step applied after UTS 46 mapping, even though UTS 46 makes them look like a pre-processing step, and that 2) the URL Standard's forbidden domain code point check is a _similar_ but different post-processing step that takes place _instead of_ STD3 post-processing.

The spec could be improved by hoisting the forbidden domain code point check out of https://url.spec.whatwg.org/#concept-host-parser and into https://url.spec.whatwg.org/#concept-domain-to-ascii, and by adding a note explaining that it is an ASCII filtering step that happens instead of STD3 filtering, for compatibility with (whatever it is for compatibility with).

It would be even better if the note listed the differences between STD3 filtering and "forbidden domain code point" filtering (16 rather surprising ASCII characters, by my manual check) and gave the rationale for them.
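
For anyone who wants to reproduce the comparison mechanically, here is a rough Python sketch. It rests on two assumptions that are mine, not the spec's wording: the forbidden sets are transcribed by hand from the URL Standard's "forbidden host code point" and "forbidden domain code point" definitions, and STD3 filtering is modelled simply as "ASCII letters, digits, and hyphen, with `.` as the label separator". The names (`FORBIDDEN_DOMAIN`, `STD3_ALLOWED`, `ascii_difference`) are made up for illustration. The count it prints depends on those transcriptions and is not guaranteed to match the manual figure above.

```python
import string

# Assumption: hand-transcribed from the URL Standard's
# "forbidden host code point" definition; check against the live spec.
FORBIDDEN_HOST = set("\x00\t\n\r #/:<>?@[\\]^|")

# "Forbidden domain code point": a forbidden host code point,
# a C0 control, U+0025 (%), or U+007F DELETE (same caveat as above).
C0_CONTROLS = {chr(c) for c in range(0x20)}
FORBIDDEN_DOMAIN = FORBIDDEN_HOST | C0_CONTROLS | {"%", "\x7f"}

# Assumption: STD3 filtering modelled as letters, digits, and hyphen.
# "." is kept because it separates labels; uppercase letters are kept
# because UTS 46 mapping lowercases them before any filtering applies.
STD3_ALLOWED = set(string.ascii_letters + string.digits + "-.")

ASCII = {chr(c) for c in range(0x80)}


def ascii_difference():
    """ASCII characters STD3 filtering would reject but the
    forbidden domain code point check lets through."""
    rejected_by_std3 = ASCII - STD3_ALLOWED
    return sorted(rejected_by_std3 - FORBIDDEN_DOMAIN)


if __name__ == "__main__":
    diff = ascii_difference()
    print(len(diff), " ".join(diff))
```

Under this model nothing that STD3 permits is forbidden by the URL Standard's check, so the difference runs in one direction only: the URL Standard check is the more permissive of the two.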

Received on Friday, 2 February 2024 13:29:39 UTC