- From: Mark Rogers <mark.rogers@powermapper.com>
- Date: Sat, 8 Nov 2014 13:59:34 -0600
- To: "www-validator@w3.org" <www-validator@w3.org>
- Message-ID: <1F68EA0E0CBFBE44A9A64274E1AC01A122F74256A7@DFW1MBX23.mex07a.mlsrvr.com>
Hi,

Is the Unicode character U+1F4A9 used in the conformance checker test suite for URLs really invalid? It’s marked as novalid in test suite files like:

    conformance-checkers/html/elements/a/href/userinfo-username-contains-pile-of-poo-novalid.html

In RFC 3987 this character is listed in the 10000-1FFFD range in the iuserinfo -> iunreserved -> ucschar production:

    iuserinfo   = *( iunreserved / pct-encoded / sub-delims / ":" )

    iunreserved = ALPHA / DIGIT / "-" / "." / "_" / "~" / ucschar

    ucschar     = %xA0-D7FF / %xF900-FDCF / %xFDF0-FFEF
                / %x10000-1FFFD / %x20000-2FFFD / %x30000-3FFFD
                / %x40000-4FFFD / %x50000-5FFFD / %x60000-6FFFD
                / %x70000-7FFFD / %x80000-8FFFD / %x90000-9FFFD
                / %xA0000-AFFFD / %xB0000-BFFFD / %xC0000-CFFFD
                / %xD0000-DFFFD / %xE1000-EFFFD

In the WHATWG URL standard it’s listed as a valid URL code point, and will be converted to percent encoding during the normalisation process, but won’t flag an error. See:

    https://url.spec.whatwg.org/#url-code-points
    https://url.spec.whatwg.org/#authority-state

Best Regards
Mark

Mark Rogers - mark.rogers@powermapper.com
PowerMapper Software Ltd - www.powermapper.com
Registered in Scotland No 362274 Quartermile 2 Edinburgh EH3 9GL
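
As a rough illustration of the point above, the following Python sketch checks whether U+1F4A9 falls inside the RFC 3987 ucschar ranges quoted in this message, and shows the percent-encoded UTF-8 form it would take after normalisation. The names UCSCHAR_RANGES and is_ucschar are made up for this example and are not taken from the validator or from either specification; the sketch only tests ucschar membership and is not a full IRI or URL validator.

```python
from urllib.parse import quote

# ucschar ranges from RFC 3987, as inclusive (low, high) code point pairs.
# Planes 1 through 13 each contribute xx0000-xxFFFD; plane 14 starts at E1000.
UCSCHAR_RANGES = (
    [(0xA0, 0xD7FF), (0xF900, 0xFDCF), (0xFDF0, 0xFFEF)]
    + [(plane << 16, (plane << 16) | 0xFFFD) for plane in range(0x1, 0xE)]
    + [(0xE1000, 0xEFFFD)]
)


def is_ucschar(ch: str) -> bool:
    """Return True if the single character ch matches the ucschar production."""
    cp = ord(ch)
    return any(low <= cp <= high for low, high in UCSCHAR_RANGES)


pile_of_poo = "\U0001F4A9"

# U+1F4A9 lies in the %x10000-1FFFD range.
print(is_ucschar(pile_of_poo))      # True

# Percent-encoding of the character's UTF-8 bytes.
print(quote(pile_of_poo, safe=""))  # %F0%9F%92%A9
```

Here quote() from the Python standard library stands in for the percent-encoding step; it is not the WHATWG URL parser itself.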
Received on Saturday, 8 November 2014 19:59:45 UTC