- From: Phillips, Addison <addison@lab126.com>
- Date: Tue, 14 May 2019 00:38:33 +0000
- To: "Eric Prud'hommeaux" <eric@w3.org>
- CC: "public-i18n-core@w3.org" <public-i18n-core@w3.org>, "binji@google.com" <binji@google.com>
> Supplementary characters (that is, those beyond the BMP) are not an issue.
> However, isolated (that is, *unpaired*) surrogate code units are permitted in
> JavaScript strings. The question is how to deal with them (not allowing them
> would be fine by me--for security they are often replaced by U+FFFD). So
> the question is whether you're permitted to have a string like "\uD800
> ABCDEFG \uD800\uDC00\uD800" (which starts and ends with an unpaired
> surrogate, but has a valid surrogate pair in the middle).
>
> I'd say that
> [[
> Names are sequences of characters, which are scalar values as defined by
> Unicode (Section 2.4).
> ]]
> says no, but I can't lay my hands on tests to make sure implementations barf
> on it. (Part of the problem is that WASM tests input conditions are
> synthesized in a browser so it may be difficult to create such a string on some
> platforms.)

If they are synthesized using JavaScript, it should be as simple as:

String.fromCharCode(0xd800, 0xd800, 0xd800); // three isolated surrogates

Addison
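For anyone who wants to build such test inputs, here is a rough sketch of how the strings discussed above could be synthesized and checked in JavaScript. The helper names `replaceLoneSurrogates` and `hasLoneSurrogate` are made up for illustration and are not part of any spec or test suite; the sketch also assumes that all three code units in the isolated-surrogate string are meant to be 0xD800. The replacement step mirrors the common practice, mentioned above, of substituting U+FFFD for unpaired surrogates.

```javascript
// Hypothetical helper: copy the string, replacing each unpaired surrogate
// code unit with U+FFFD and leaving valid surrogate pairs untouched.
function replaceLoneSurrogates(s) {
  let out = "";
  for (let i = 0; i < s.length; i++) {
    const cu = s.charCodeAt(i);
    const isHigh = cu >= 0xd800 && cu <= 0xdbff;
    const isLow = cu >= 0xdc00 && cu <= 0xdfff;
    if (isHigh && i + 1 < s.length) {
      const next = s.charCodeAt(i + 1);
      if (next >= 0xdc00 && next <= 0xdfff) {
        // Valid pair: keep both code units and skip past the low surrogate.
        out += s[i] + s[i + 1];
        i++;
        continue;
      }
    }
    out += isHigh || isLow ? "\uFFFD" : s[i];
  }
  return out;
}

// Hypothetical helper: true if the string contains any unpaired surrogate.
function hasLoneSurrogate(s) {
  return replaceLoneSurrogates(s) !== s;
}

// Three isolated high surrogates (assumed values, see the note above).
const isolated = String.fromCharCode(0xd800, 0xd800, 0xd800);

// Starts and ends with an unpaired surrogate, valid pair in the middle.
const mixed = "\uD800 ABCDEFG \uD800\uDC00\uD800";

const isolatedHasLone = hasLoneSurrogate(isolated);
const mixedHasLone = hasLoneSurrogate(mixed);
const mixedReplaced = replaceLoneSurrogates(mixed);
```

Whether a WebAssembly implementation rejects such a name or replaces the offending code units is exactly the open question; the sketch only shows how the inputs can be produced and detected on the JavaScript side.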
Received on Tuesday, 14 May 2019 00:39:03 UTC