- From: Brendan Eich <brendan@mozilla.com>
- Date: Mon, 20 Feb 2012 08:20:07 -0800
- To: Allen Wirfs-Brock <allen@wirfs-brock.com>
- CC: Gavin Barraclough <barraclough@apple.com>, public-script-coord@w3.org, Anne van Kesteren <annevk@opera.com>, mranney@voxer.com, es-discuss discussion <es-discuss@mozilla.org>
Allen Wirfs-Brock wrote:
>> Last year we dispensed with the binary data hacking in strings use-case. I don't see the hardship. But rather than throw exceptions on concatenation I would simply eliminate the ability to spell code units with "\uXXXX" escapes. Who's with me?
>
> I think we need to be careful not to equate the syntax of ES string literals with the actual encoding space of string elements.
I agree, which is why I'm saying that with the BRS set, we should forbid
"\uXXXX", since that is not a code point but rather a code unit.
> Whether you say "\ud800" or "\u{00d800}", or call a function that does full-unicode to UTF-16 encoding, or simply create a string from file contents, you may end up with string elements containing upper or lower half surrogates.
I don't agree in the case of "\u{00d800}". That's simply an illegal code 
point, not a code unit (upper or lower half). We can reject it statically.
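A minimal sketch of what such a static rejection could look like, assuming a hypothetical helper named isValidCodePointEscape that is handed the hex digits found between the braces of a "\u{...}" escape. The name and shape are illustrative only and do not come from any spec draft or engine. The point is that the surrogate range is a fixed interval, so the check needs nothing beyond the literal text:

```javascript
// Hypothetical helper (illustration only): decide whether the hex digits of a
// "\u{...}" escape name a code point that a full-Unicode mode could accept.
function isValidCodePointEscape(hexDigits) {
  // Only hex digits may appear between the braces.
  if (!/^[0-9a-fA-F]+$/.test(hexDigits)) return false;

  const cp = parseInt(hexDigits, 16);

  // Beyond the last Unicode code point, U+10FFFF.
  if (cp > 0x10FFFF) return false;

  // U+D800..U+DFFF are reserved for UTF-16 surrogates; under the position
  // argued here they are illegal in "\u{...}" and can be rejected at parse time.
  if (cp >= 0xD800 && cp <= 0xDFFF) return false;

  return true;
}
```

Under these assumptions, isValidCodePointEscape("00d800") is false while isValidCodePointEscape("1F600") is true.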
> Eliminating the "\uXXXX" syntax really doesn't change anything regarding actual string processing.
True, but not my point!
> What it might do, however, is eliminate the ambiguity about the intended meaning of "\uD800\uDc00" in legacy code.
And the ambiguity arising from concatenations, which avoids the loss of Gavin's
distributive .length property.
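To illustrate the distributive property at stake, here is a rough sketch in present-day JavaScript. It assumes that counting code points with Array.from is a fair stand-in for what .length would report in a full-Unicode mode; that is an assumption made for illustration, not something the proposal specifies. The helper names are hypothetical:

```javascript
// Two lone surrogate code units, built without escape syntax.
const hi = String.fromCharCode(0xD800);
const lo = String.fromCharCode(0xDC00);

// Length as JavaScript reports it: a count of 16-bit code units.
const codeUnitLength = (s) => s.length;

// Hypothetical stand-in for a full-Unicode .length: a count of code points.
// Array.from iterates a string by code point, pairing adjacent surrogates.
const codePointLength = (s) => Array.from(s).length;

// "Distributive" here means len(a + b) === len(a) + len(b).
function isDistributive(lengthFn, a, b) {
  return lengthFn(a + b) === lengthFn(a) + lengthFn(b);
}

const unitsOk = isDistributive(codeUnitLength, hi, lo);
const pointsOk = isDistributive(codePointLength, hi, lo);
```

With code-unit counting the property holds for these two strings. With code-point counting it fails, because the halves fuse into a single supplementary code point on concatenation. If lone surrogates cannot be spelled in the first place, that case cannot be written as a literal.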
> If "full unicode string mode" only supported \u{} escapes then existing code that uses \uXXXX would have to be updated before it could be used in that mode.  That might be a good thing.
My point! ;-)
/be
Received on Monday, 20 February 2012 16:20:44 UTC