- From: Anne van Kesteren <notifications@github.com>
- Date: Thu, 12 Jan 2023 08:29:35 -0800
- To: whatwg/url <url@noreply.github.com>
- Cc: Subscribed <subscribed@noreply.github.com>
- Message-ID: <whatwg/url/issues/733@github.com>
From https://www.unicode.org/reports/tr46/#UseSTD3ASCIIRules:

> There are a very small number of non-ASCII characters with the data file status disallowed_STD3_valid:
>
> U+2260 ( ≠ ) NOT EQUAL TO
> U+226E ( ≮ ) NOT LESS-THAN
> U+226F ( ≯ ) NOT GREATER-THAN
>
> Those characters are disallowed with UseSTD3ASCIIRules=true because the set of characters in their canonical decompositions are not entirely in the valid set ([Step 7](https://www.unicode.org/reports/tr46/#TableDerivationStep7) of the Table Derivation). However, they are allowed with UseSTD3ASCIIRules=false, because the base characters of their canonical decompositions, U+003D ( = ) EQUALS SIGN, U+003C ( < ) LESS-THAN SIGN, and U+003E ( > ) GREATER-THAN SIGN, are each valid under that option. If an implementation uses UseSTD3ASCIIRules=false but disallows any of these three ASCII characters, then it must also disallow the corresponding precomposed character for its negation.

We allow `=`, but `<` and `>` are forbidden.

All of the three non-ASCII code points listed above work fine in WebKit and I personally might not see the problem as strongly as UTS46 does.

I added tests for them in https://github.com/web-platform-tests/wpt/pull/37907. (The tests reflect the status quo.)

Thoughts?

cc @karwa @ricea @achristensen07 @valenting

-- 
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/url/issues/733
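
For reference, the relationship the quoted UTS46 passage relies on can be checked with a minimal Python sketch. It uses the standard `unicodedata` module to take the canonical (NFD) decomposition of each precomposed character and look at its base character. The `FORBIDDEN_ASCII` set and the helper names are assumptions made for illustration only: the set contains just the two characters this message calls forbidden, not the URL Standard's full list, and the check is not how any particular implementation works.

```python
import unicodedata

# Illustrative assumption: only the two ASCII characters that the message
# above describes as forbidden ("=" is allowed). Not the full forbidden list.
FORBIDDEN_ASCII = {"<", ">"}

# The three precomposed characters named in the UTS46 quote.
PRECOMPOSED = ["\u2260", "\u226E", "\u226F"]  # ≠, ≮, ≯


def base_character(ch: str) -> str:
    """Return the first code point of the canonical (NFD) decomposition."""
    return unicodedata.normalize("NFD", ch)[0]


def uts46_would_disallow(ch: str) -> bool:
    """Apply the quoted rule: disallow the precomposed character when its
    base character is itself disallowed (hypothetical helper)."""
    return base_character(ch) in FORBIDDEN_ASCII


for ch in PRECOMPOSED:
    print(
        f"U+{ord(ch):04X} has base {base_character(ch)!r}; "
        f"disallowed under the quoted rule: {uts46_would_disallow(ch)}"
    )
```

Under these assumptions the rule would reject U+226E and U+226F but not U+2260, which is the difference from the status quo that this issue asks about.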
Received on Thursday, 12 January 2023 16:29:47 UTC