Unicode Character 'PILE OF POO' (U+1F4A9) and validator test suite

Hi

Is the Unicode character U+1F4A9, as used in the conformance checker test suite for URLs, really invalid? It’s marked as invalid (the "novalid" filename suffix) in test suite files such as:

conformance-checkers/html/elements/a/href/userinfo-username-contains-pile-of-poo-novalid.html

In RFC 3987 this character falls within the %x10000-1FFFD range of the ucschar production, which is reached via iuserinfo -> iunreserved -> ucschar:

iuserinfo      = *( iunreserved / pct-encoded / sub-delims / ":" )

iunreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~" / ucschar

ucschar        = %xA0-D7FF / %xF900-FDCF / %xFDF0-FFEF
               / %x10000-1FFFD / %x20000-2FFFD / %x30000-3FFFD
               / %x40000-4FFFD / %x50000-5FFFD / %x60000-6FFFD
               / %x70000-7FFFD / %x80000-8FFFD / %x90000-9FFFD
               / %xA0000-AFFFD / %xB0000-BFFFD / %xC0000-CFFFD
               / %xD0000-DFFFD / %xE1000-EFFFD
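
As a quick sanity check of the range claim, here is a minimal Python sketch that tests a character against the ucschar ranges quoted above. The names UCSCHAR_RANGES and is_ucschar are mine, for illustration only; they are not taken from the RFC or from the validator.

```python
# ucschar ranges copied from the RFC 3987 production quoted above.
# UCSCHAR_RANGES and is_ucschar are illustrative names, not part of any spec.
UCSCHAR_RANGES = [
    (0xA0, 0xD7FF), (0xF900, 0xFDCF), (0xFDF0, 0xFFEF),
    (0x10000, 0x1FFFD), (0x20000, 0x2FFFD), (0x30000, 0x3FFFD),
    (0x40000, 0x4FFFD), (0x50000, 0x5FFFD), (0x60000, 0x6FFFD),
    (0x70000, 0x7FFFD), (0x80000, 0x8FFFD), (0x90000, 0x9FFFD),
    (0xA0000, 0xAFFFD), (0xB0000, 0xBFFFD), (0xC0000, 0xCFFFD),
    (0xD0000, 0xDFFFD), (0xE1000, 0xEFFFD),
]


def is_ucschar(ch: str) -> bool:
    """Return True if the single character ch falls in any ucschar range."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in UCSCHAR_RANGES)


print(is_ucschar("\U0001F4A9"))  # True: 0x1F4A9 lies within 0x10000-0x1FFFD
```

This only covers the ucschar alternative; the ASCII alternatives of iunreserved (ALPHA, DIGIT and so on) are deliberately left out.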

In the WHATWG URL Standard it’s listed as a valid URL code point; it will be percent-encoded during the normalisation process, but that won’t flag an error. See:
https://url.spec.whatwg.org/#url-code-points

https://url.spec.whatwg.org/#authority-state
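
To make that concrete, here is a rough Python sketch of the two points, based on my reading of the URL code points definition (ASCII alphanumerics, a fixed set of ASCII punctuation, and U+00A0 to U+10FFFD excluding surrogates and noncharacters). The function names are hypothetical, and urllib.parse.quote is only a stand-in for the spec's UTF-8 percent-encoding of non-ASCII code points, not an implementation of the WHATWG parser.

```python
from urllib.parse import quote

# ASCII punctuation listed as URL code points (my reading of the definition).
_ASCII_URL_PUNCT = set("!$&'()*+,-./:;=?@_~")


def is_noncharacter(cp: int) -> bool:
    """Unicode noncharacters: U+FDD0..U+FDEF and any code point ending FFFE/FFFF."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def is_url_code_point(ch: str) -> bool:
    """Approximate check for a URL code point (illustrative helper)."""
    cp = ord(ch)
    if cp < 0x80:
        return (ch.isalnum()) or ch in _ASCII_URL_PUNCT
    if 0xD800 <= cp <= 0xDFFF:  # surrogates are excluded
        return False
    return 0xA0 <= cp <= 0x10FFFD and not is_noncharacter(cp)


poo = "\U0001F4A9"
print(is_url_code_point(poo))  # True
print(quote(poo, safe=""))     # %F0%9F%92%A9 (the UTF-8 bytes, percent-encoded)
```

So under this reading the character is accepted as-is and simply serialised as %F0%9F%92%A9.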


Best Regards
Mark

Mark Rogers - mark.rogers@powermapper.com
PowerMapper Software Ltd - www.powermapper.com
Registered in Scotland No 362274 Quartermile 2 Edinburgh EH3 9GL

Received on Saturday, 8 November 2014 19:59:45 UTC