Re: New full Unicode for ES6 idea

Allen Wirfs-Brock wrote:
> I really don't think any Unicode semantics should be built into the 
> basic string representation.  We need to decide on a max element size 
> and Unicode motivates 21 bits, but it could be 32-bits.  Personally, 
> I've lived through enough address space exhaustion episodes in my 
> career to be skeptical of "small" values like 2^21 being good enough 
> for the long term.

This does not seem justified to me as a future-proofing step. Instead, 
it invites my corollary to Postel's Law:

"If you are liberal in what you accept, others will utterly fail to be 
conservative in what they send."

to bite us, hard.

We do not want implementations today to accept non-Unicode code points 
under the BRS (also [D800-DFFF], IMHO). If tomorrow, or on April 5, 2063 
when Vulcans arrive to make first contact, we need 32 bits, we can be 
liberal then. Old implementations will choke on Vulcan, Klingon, etc., 
but so they should! They cannot do better, and simply need to be upgraded.
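
To make the strictness concrete, here is a minimal sketch of the kind of 
check I mean. The helper names (isAcceptableCodePoint, 
stringFromCodePointsStrict) are hypothetical and not part of any proposal; 
the sketch assumes String.fromCodePoint is available, and it rejects the 
surrogate range only because of my IMHO above:

```javascript
// Hypothetical helper, not part of any ES6 proposal: accept only Unicode
// scalar values, i.e. integers in 0..0x10FFFF outside the surrogate range.
function isAcceptableCodePoint(cp) {
  if (!Number.isInteger(cp)) return false;
  if (cp < 0 || cp > 0x10FFFF) return false;      // beyond Unicode's 21-bit space
  if (cp >= 0xD800 && cp <= 0xDFFF) return false; // lone surrogates [D800-DFFF]
  return true;
}

// Hypothetical strict constructor: throw on bad input instead of letting
// it flow on to a receiving peer.
function stringFromCodePointsStrict(codePoints) {
  let out = "";
  for (const cp of codePoints) {
    if (!isAcceptableCodePoint(cp)) {
      throw new RangeError("not a Unicode scalar value: " + String(cp));
    }
    out += String.fromCodePoint(cp);
  }
  return out;
}
```

An implementation built this way chokes on 0x110000 today. If the range 
ever has to grow, the upper bound changes in one place and old peers fail 
loudly rather than guessing.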

OTOH if we are too liberal now, people will stuff non-Unicode code 
points into strings and it will be up to a receiving peer on the 
Internet to make it right (or wrong). Receiver-makes-it-wrong failed in 
the 80s RPC wars.

Postel's Law is not about allowing unknown new bits to flow into 
containers. It is about unexpected combinations at higher message and 
header/field levels. Note that IP had to pick 4-byte addresses; IPv6 
could not have been foreseen, and IPv4 could not have been usefully 
future-proofed for it by using wider fields without specific rules 
governing the use of the extra bytes.

/be

Received on Monday, 20 February 2012 19:03:59 UTC