Re: New full Unicode for ES6 idea from Brendan Eich on 2012-02-28 (public-script-coord@w3.org from January to March 2012)

From: Brendan Eich <brendan@mozilla.com>
Date: Tue, 28 Feb 2012 12:49:39 +0100
To: Wes Garland <wes@page.ca>
CC: Norbert Lindenberg <ecmascript@norbertlindenberg.com>, "public-script-coord@w3.org" <public-script-coord@w3.org>, mranney@voxer.com, es-discuss <es-discuss@mozilla.org>
Message-ID: <4F4CBF53.6030007@mozilla.com>

Wes Garland wrote:
> If four-byte escapes are statically rejected in BRS-on, we have a 
> problem -- we should be able to use old code that runs in either mode 
> unchanged when said code only uses characters in the BMP.

We've been over this and I conceded to Allen that "four-byte escapes" 
(I'll use \uXXXX to be clear from now on) must work as today with 
BRS-on. Otherwise we make it hard to impossible to migrate code that 
knows what it is doing with 16-bit code units that round-trip properly.

> Accepting both 4 and 6 byte escapes is a problem, though -- what is 
> "\u123456".length?  1 or 3?

This is not a problem. We want .length to distribute across 
concatenation, so 3 is the only answer and in particular ("\u1234" + 
"\u5678").length === 2 irrespective of BRS.
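
To make the arithmetic concrete, here is a minimal sketch that runs in any current engine. The variable names are illustrative only and are not from the thread. It assumes nothing beyond the existing rule that \uXXXX consumes exactly four hex digits, so any digits after the fourth are ordinary characters.

```javascript
// "\u123456" is the escape \u1234 followed by the literal characters "5" and "6".
const sixDigits = "\u123456";
const left = "\u1234";
const right = "\u5678";
const joined = left + right;

// Three code units: U+1234, "5", "6".
console.log(sixDigits.length); // 3
console.log(sixDigits === "\u1234" + "56"); // true

// .length distributes across concatenation: 1 + 1 === 2.
console.log(joined.length === left.length + right.length); // true
console.log(joined.length); // 2
```

If a six-digit reading of the same escape yielded length 1, the same source text would change meaning depending on the switch, which is the ambiguity the reply rules out.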

> If we accept "\u1234" in BRS-on as a string with length 5 -- as we do 
> today in ES5 with "\u123".length===4 -- we give developers a way to 
> feature-test and conditionally execute code, allowing libraries to run 
> with BRS-on and BRS-off.

Feature-testing should be done using a more explicit test. API TBD, but 
I don't think breaking "\uXXXX" with BRS on is a good idea.

I agree with you that Roozbeh's case is rare, so it can take the hit of 
having to feature-test the BRS. The much more common case today is JS 
code that blithely ignores non-BMP characters that make it into strings 
as surrogate pairs, treating them blindly as two "characters" (ugh; must 
purge that "c-word" abusage from the spec).
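
As an illustration of the "blithely ignores" case, here is a hedged sketch of code that treats a surrogate pair as two separate "characters". The variable names are made up for this example. The contrast at the end uses Array.from, which was standardized in ES2015 after this message was written; it is shown only for comparison and is not the feature-test API left TBD above.

```javascript
// U+1F600 written as a surrogate pair using two ordinary \uXXXX escapes.
const face = "\uD83D\uDE00";
const text = "a" + face + "b";

// With 16-bit code units the non-BMP character counts as two.
const codeUnits = text.length;

// Naive per-"character" processing splits the pair into two halves...
const naive = text.split("");

// ...and reversing those halves leaves the surrogates in the wrong order.
const naiveReversed = text.split("").reverse().join("");

// Iterating by code point (ES2015 and later) keeps the pair together.
const codePoints = Array.from(text);

console.log(codeUnits);         // 4
console.log(naive.length);      // 4
console.log(codePoints.length); // 3
console.log(naiveReversed === "b" + face + "a"); // false
```

Code written this way keeps working unchanged as long as \uXXXX continues to mean one 16-bit code unit, which is why only pair-aware code would need to feature-test the switch.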

/be

Received on Tuesday, 28 February 2012 11:50:54 UTC