
Re: New full Unicode for ES6 idea

From: Wes Garland <wes@page.ca>
Date: Mon, 20 Feb 2012 16:42:35 -0500
Message-ID: <CAHB0tE4viRSSbn+WkU3sSpQO=5fErfm4DoxY9ZtdkeJmbLQrRQ@mail.gmail.com>
To: Allen Wirfs-Brock <allen@wirfs-brock.com>
Cc: Brendan Eich <brendan@mozilla.com>, public-script-coord@w3.org, Anne van Kesteren <annevk@opera.com>, mranney@voxer.com, es-discuss discussion <es-discuss@mozilla.org>
On 20 February 2012 16:00, Allen Wirfs-Brock <allen@wirfs-brock.com> wrote:

> My sense is that there is a fairly large variety of string data types
> that could use the existing ES5 string type as a target type, and for which
> many of the String.prototype.* methods would function just fine. The
> reason is that most of the ES5 methods don't impose this sort of semantic
> restriction on string elements.

To pick one out of a hat, it might be nice to be able to use non-Unicode
encodings, like GB 18030 or BIG5, and be able to use regexp methods on them
when the BRS is on. (I'm struggling to find a really real real-world
use-case, though)

Observation -- disallowing otherwise "legal" Unicode strings because they
contain code points U+D800-U+DFFF has a very concrete implementation benefit:
it's possible to use UTF-16 to represent the String's backing store.
Without this concession, I fear it may not be possible to implement BRS-on
without using a UTF-8 or full code point backing store (or some
non-standard invention).
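
To make the concern concrete, here is a minimal sketch of the ambiguity. It assumes a modern engine that provides String.prototype.codePointAt, and hasLoneSurrogate is a hypothetical helper name, not part of any proposal in this thread. The point is that a UTF-16 store cannot tell two adjacent "code points" D83D and DE00 apart from the single code point U+1F600.

```javascript
// Hypothetical helper: returns true if `str` contains a UTF-16 surrogate
// code unit that is not part of a well-formed high+low pair.
function hasLoneSurrogate(str) {
  for (let i = 0; i < str.length; i++) {
    const unit = str.charCodeAt(i);
    if (unit >= 0xD800 && unit <= 0xDBFF) {
      // High surrogate: must be followed immediately by a low surrogate.
      const next = i + 1 < str.length ? str.charCodeAt(i + 1) : 0;
      if (next >= 0xDC00 && next <= 0xDFFF) {
        i++; // skip the low half of a well-formed pair
        continue;
      }
      return true;
    }
    if (unit >= 0xDC00 && unit <= 0xDFFF) {
      return true; // low surrogate with no preceding high surrogate
    }
  }
  return false;
}

// Two 16-bit units that a UTF-16 store can only read one way: as a pair.
const surrogatePair = "\uD83D\uDE00";
const decodedCodePoint = surrogatePair.codePointAt(0); // 0x1F600
```

If strings holding D83D and DE00 as two independent code points were legal, a UTF-16 backing store would hold exactly the same two units as it does for U+1F600, so the two strings could not be distinguished.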

Maybe the answer is to consider (shudder) adding String-like utility
functions to the TypedArrays?  FWIW, CommonJS tried to go down this path
and it turned out to be a lot of work for very little benefit (if any).
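
For a sense of what such a utility might look like, here is a rough sketch of an indexOf-style search over a Uint8Array. The name indexOfBytes and its signature are invented for illustration; nothing like it is specified for TypedArrays in this discussion, and the sample bytes are arbitrary double-byte data rather than any particular encoded text.

```javascript
// Hypothetical helper mirroring String.prototype.indexOf for byte arrays.
// Returns the index of the first occurrence of `needle` in `haystack`
// at or after `fromIndex`, or -1 if there is none.
function indexOfBytes(haystack, needle, fromIndex = 0) {
  const start = Math.max(fromIndex, 0);
  if (needle.length === 0) return Math.min(start, haystack.length);
  const last = haystack.length - needle.length;
  outer: for (let i = start; i <= last; i++) {
    for (let j = 0; j < needle.length; j++) {
      if (haystack[i + j] !== needle[j]) continue outer;
    }
    return i;
  }
  return -1;
}

// Arbitrary sample data: two 2-byte sequences.
const sampleBytes = new Uint8Array([0xA4, 0x40, 0xA4, 0x41]);
const foundAt = indexOfBytes(sampleBytes, new Uint8Array([0xA4, 0x41]));
```

Note that a byte-wise search like this knows nothing about character boundaries in a multi-byte encoding, which is part of why the approach is a lot of work for little benefit.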

> But with the BRS flipped it would have to censor C "strings" passed to JS
> to ensure that no unmatched surrogates are present.

Only if the C strings are wide-character strings.  8-bit char strings are
fine: each byte value maps directly onto the Latin-1 range (U+0000-U+00FF),
both as native Unicode code points and in the UTF-16 and UCS-2 encodings.
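
As a small illustration of the 8-bit case, the sketch below treats each byte as a Latin-1 code point and builds a JS string from it. The helper name latin1BytesToString is made up for this example, and a Uint8Array stands in for the C char buffer. Because every resulting code unit is at most 0xFF, none can fall in the surrogate range, so no censoring is needed.

```javascript
// Hypothetical helper: interpret each byte as a Latin-1 (ISO-8859-1)
// character, whose code point equals the byte value.
function latin1BytesToString(bytes) {
  let out = "";
  for (let i = 0; i < bytes.length; i++) {
    out += String.fromCharCode(bytes[i]); // code unit is 0x00-0xFF
  }
  return out;
}

// "café" as ISO-8859-1 bytes; 0xE9 is e-acute.
const cBytes = new Uint8Array([0x63, 0x61, 0x66, 0xE9]);
const jsString = latin1BytesToString(cBytes);
```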


Wesley W. Garland
Director, Product Development
PageMail, Inc.
+1 613 542 2787 x 102
Received on Monday, 20 February 2012 21:43:04 UTC
