Re: [csswg-drafts] [css-fonts-4] Add `emoji` as a keyword to `unicode-range` (#4573) from Markus Scherer via GitHub on 2020-01-24 (public-css-archive@w3.org from January 2020)

From: Markus Scherer via GitHub <sysbot+gh@w3.org>
Date: Fri, 24 Jan 2020 17:49:14 +0000
To: public-css-archive@w3.org
Message-ID: <issue_comment.created-578231848-1579888153-sysbot+gh@w3.org>

Hi, I got cc'ed here...

As I think you found, ISO 15924 does not define which characters belong to which script. Use the Unicode properties sc=Script and scx=Script_Extensions for that. scx=Deva should be implemented as "the set of code points whose Script_Extensions *contain* Deva"; see UTS #18 (the Unicode regex spec).
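
To make the "contain" semantics concrete, here is a minimal Python sketch. The two tables are a tiny hand-written excerpt for illustration only (real values come from the UCD files Scripts.txt and ScriptExtensions.txt and vary by Unicode version), and the helper names `script_extensions`, `in_scx` and `in_sc` are hypothetical, not part of any library.

```python
# Illustrative, hand-written excerpt; real data comes from the UCD files
# Scripts.txt and ScriptExtensions.txt and varies by Unicode version.
SCRIPT = {
    0x0041: "Latn",  # LATIN CAPITAL LETTER A
    0x0915: "Deva",  # DEVANAGARI LETTER KA
    0x0964: "Zyyy",  # DEVANAGARI DANDA (Script=Common)
}

# Only code points with an explicit Script_Extensions entry appear here.
# The set for U+0964 is abbreviated for the example.
SCRIPT_EXTENSIONS = {
    0x0964: {"Beng", "Deva", "Gujr", "Guru"},
}


def script_extensions(cp):
    """Return scx for cp; by default it is the set holding just sc."""
    if cp in SCRIPT_EXTENSIONS:
        return SCRIPT_EXTENSIONS[cp]
    return {SCRIPT.get(cp, "Zzzz")}  # Zzzz = Unknown


def in_scx(cp, script):
    """scx=<script>: true when the Script_Extensions set contains script."""
    return script in script_extensions(cp)


def in_sc(cp, script):
    """sc=<script>: true when the single Script value equals script."""
    return SCRIPT.get(cp, "Zzzz") == script
```

With these tables, `in_scx(0x0964, "Deva")` is true while `in_sc(0x0964, "Deva")` is false, which is exactly the difference between matching on scx and matching on sc.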

For emoji, there are several properties you could look at: http://www.unicode.org/reports/tr51/#Emoji_Properties
(Unicode 13 will hoist all of these into the UCD proper.)
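
As a rough sketch of how those properties could feed a `unicode-range` value, the Python below parses a few lines in the UCD property-file format used by emoji-data.txt. The `SAMPLE` text is a short excerpt with simplified comments, and `parse_ranges` and `to_unicode_range` are hypothetical helper names; a real implementation would read the full data file for the Unicode version it targets.

```python
import re

# A few lines in the UCD property-file format: code point or range,
# then the property name, then a comment. Excerpted for illustration.
SAMPLE = """\
0023          ; Emoji                # number sign
0030..0039    ; Emoji                # digit zero..digit nine
1F600         ; Emoji_Presentation   # grinning face
1F3FB..1F3FF  ; Emoji_Modifier       # skin tone modifiers
"""

# Start code point, optional "..end", then "; PropertyName".
LINE = re.compile(r"^([0-9A-F]+)(?:\.\.([0-9A-F]+))?\s*;\s*(\w+)")


def parse_ranges(text, prop):
    """Collect (start, end) code point ranges that carry the property."""
    ranges = []
    for line in text.splitlines():
        m = LINE.match(line)
        if m and m.group(3) == prop:
            start = int(m.group(1), 16)
            end = int(m.group(2), 16) if m.group(2) else start
            ranges.append((start, end))
    return ranges


def to_unicode_range(ranges):
    """Format ranges the way a CSS unicode-range descriptor lists them."""
    parts = []
    for start, end in ranges:
        if start == end:
            parts.append(f"U+{start:X}")
        else:
            parts.append(f"U+{start:X}-{end:X}")
    return ", ".join(parts)
```

For the sample above, `to_unicode_range(parse_ranges(SAMPLE, "Emoji"))` gives `U+23, U+30-39`. Which property (Emoji, Emoji_Presentation, ...) a keyword should map to is the open question in this issue; the sketch does not settle it.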

Elsewhere in UTS 51 you can also find regexes for well-formed emoji sequences.

ICU has APIs to get the emoji character properties (per code point, or as a UnicodeSet).

FYI, I work on Unicode/CLDR/ICU and am the current ISO 15924 registrar.

-- 
GitHub Notification of comment by markusicu
Please view or discuss this issue at https://github.com/w3c/csswg-drafts/issues/4573#issuecomment-578231848 using your GitHub account

Received on Friday, 24 January 2020 17:49:15 UTC