Re: [whatwg/url] ContextJ (RFC 5892) is Security Theater (Issue #776)

> Even before getting to ContextJ, which affects U+200D from https://www.unicode.org/Public/emoji/16.0/emoji-zwj-sequences.txt , the mapping stage removes U+FE0F, which occurs in multiple sequences there.

The `FE0E` and `FE0F` variation selectors don't change the identity of the emoji, though. All emoji are uniquely identifiable regardless of variation selector placement. Those characters can (and probably should) be stripped for canonicalization.
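
As a rough illustration of that stripping step, here is a minimal Python sketch. The helper name `strip_variation_selectors` is made up for this example and is not taken from any spec or library; it only removes U+FE0E and U+FE0F and does nothing else that UTS-46 mapping would do.

```python
# Hypothetical helper (name is an assumption for illustration only).
# Removes the text/emoji presentation selectors U+FE0E and U+FE0F.
VARIATION_SELECTORS = {0xFE0E: None, 0xFE0F: None}


def strip_variation_selectors(text: str) -> str:
    """Return text with U+FE0E and U+FE0F removed; all other code points are kept."""
    return text.translate(VARIATION_SELECTORS)


# U+2764 U+FE0F (heart with emoji presentation) reduces to bare U+2764.
heart = strip_variation_selectors("\u2764\uFE0F")

# A ZWJ sequence keeps its U+200D; only the selector is dropped.
flag = strip_variation_selectors("\U0001F3F3\uFE0F\u200D\U0001F308")
```

Note that the ZWJ survives this step, which is why the joiner check still matters afterwards.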

Emoji domains are registered as Punycode, which is ASCII. However, URL parsing libraries decode the Punycode, run UTS-46 processing, spot a ZWJ, and then fail due to CheckJoiners (see screenshot above).
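
The failure path described here can be sketched in Python roughly as follows. This is a simplified illustration, not what any particular URL library does: it uses the standard library `punycode` codec, skips UTS-46 mapping and every other validity check, and implements only the ZWJ half of the RFC 5892 ContextJ rules (the preceding character must have canonical combining class Virama). The function names are assumptions made up for this example.

```python
import unicodedata

ZWJ = "\u200D"
VIRAMA_CCC = 9  # canonical combining class value assigned to Virama characters


def decode_ace_label(label: str) -> str:
    """Decode an 'xn--' label to Unicode; other labels are returned unchanged."""
    if not label.lower().startswith("xn--"):
        return label
    return label[4:].encode("ascii").decode("punycode")


def zwj_passes_contextj(label: str) -> bool:
    """Simplified ContextJ check: every ZWJ must directly follow a Virama."""
    for i, ch in enumerate(label):
        if ch == ZWJ:
            if i == 0 or unicodedata.combining(label[i - 1]) != VIRAMA_CCC:
                return False
    return True


# Build an ASCII label from an emoji ZWJ sequence (man, ZWJ, woman).
emoji = "\U0001F468\u200D\U0001F469"
ace = "xn--" + emoji.encode("punycode").decode("ascii")

decoded = decode_ace_label(ace)
print(ace.isascii(), decoded == emoji, zwj_passes_contextj(decoded))
```

The label is plain ASCII on the wire and round-trips cleanly; it is only the joiner check applied after decoding that rejects it, because the ZWJ follows an emoji rather than a Virama.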

Either CheckJoiners should be false outright, or it should at least be false when decoding a name that is already Punycode.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/url/issues/776#issuecomment-2522789308

Message ID: <whatwg/url/issues/776/2522789308@github.com>

Received on Friday, 6 December 2024 10:41:52 UTC