- From: Brendan Eich <brendan@mozilla.com>
- Date: Mon, 20 Feb 2012 08:20:07 -0800
- To: Allen Wirfs-Brock <allen@wirfs-brock.com>
- CC: Gavin Barraclough <barraclough@apple.com>, public-script-coord@w3.org, Anne van Kesteren <annevk@opera.com>, mranney@voxer.com, es-discuss discussion <es-discuss@mozilla.org>
Allen Wirfs-Brock wrote:
>> Last year we dispensed with the binary data hacking in strings use-case. I don't see the hardship. But rather than throw exceptions on concatenation I would simply eliminate the ability to spell code units with "\uXXXX" escapes. Who's with me?
>
> I think we need to be careful not to equate the syntax of ES string literals with the actual encoding space of string elements.

I agree, which is why I'm saying with the BRS set, we should forbid "\uXXXX" since that is not a code point rather a code unit.

> Whether you say "\ud800" or "\u{00d800}", or call a function that does full-unicode to UTF-16 encoding, or simply create a string from file contents you may end up with string elements containing upper or lower half surrogates.

I don't agree in the case of "\u{00d800}". That's simply an illegal code point, not a code unit (upper or lower half). We can reject it statically.

> Eliminating the "\uXXXX" syntax really doesn't change anything regarding actual string processing.

True, but not my point!

> What it might do, however, is eliminate the ambiguity about the intended meaning of "\uD800\uDc00" in legacy code.

And arising from concatenations, avoiding the loss of Gavin's distributive .length property.

> If "full unicode string mode" only supported \u{} escapes then existing code that uses \uXXXX would have to be updated before it could be used in that mode. That might be a good thing.

My point! ;-)

/be
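
The concatenation concern above can be sketched in code. This is only an illustration, assuming an ES2015-or-later engine, where strings remain sequences of 16-bit code units and "\u{...}" escapes exist; it does not model the proposed BRS mode itself. The helper `codePointLength` is a hypothetical name introduced here, not something from the thread. It stands in for what .length might report if strings were counted in code points.

```javascript
// Hypothetical helper: counts code points rather than code units.
// The ES2015 string iterator yields one element per code point,
// and treats a lone surrogate as a single element.
function codePointLength(s) {
  return [...s].length;
}

const hi = "\uD800";  // a lone high (lead) surrogate code unit
const lo = "\uDC00";  // a lone low (trail) surrogate code unit
const pair = hi + lo; // the two halves now encode U+10000

// Counted in code units, length distributes over concatenation.
console.log(pair.length === hi.length + lo.length); // true

// Counted in code points, it does not: the two halves fuse into one.
console.log(
  codePointLength(pair) === codePointLength(hi) + codePointLength(lo)
); // false

// The legacy spelling and the code point escape denote the same string.
console.log(pair === "\u{10000}"); // true
```

If "\uXXXX" could not be written in such a mode, the two halves could not be spelled as separate literals in the first place, which is one way to read the proposal to forbid that escape.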
Received on Monday, 20 February 2012 16:20:44 UTC