
Re: New full Unicode for ES6 idea

From: Glenn Adams <glenn@skynav.com>
Date: Sat, 25 Feb 2012 10:26:11 -0700
Message-ID: <CACQ=j+duYnipkROuBXoiReiojwOL9w3mK0_xjruc27F+uT7xoA@mail.gmail.com>
To: Boris Zbarsky <bzbarsky@mit.edu>
Cc: public-script-coord@w3.org
On Sat, Feb 25, 2012 at 9:52 AM, Boris Zbarsky <bzbarsky@mit.edu> wrote:

> On 2/25/12 11:19 AM, Glenn Adams wrote:
>> To answer Anne, I concur that Unicode scalar values (that is, Unicode
>> code points other than surrogates), as opposed to encoded code units,
>> e.g., the 16-bit units of UTF-16, are the correct choice. Grapheme
>> clusters remain in the text processing (i.e., abstract character)
>> domain, and not the encoded character domain.
> I believe Anne's point is that we are in fact talking about text
> processing here, throughout this discussion, so grapheme clusters seem like
> the right thing to be talking about...

My apologies for not having followed this long thread (I just joined this
mailing list, in fact), but I did read the original posting [1], and it
appears to be about a simple idea: to transition from 16-bit code units to
Unicode scalar values as the access units for ES strings.
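
To make the distinction concrete, here is a minimal sketch of the two kinds
of access unit. It is an illustration only, not part of the original
proposal. It assumes a present-day engine with `\u{...}` escapes, the string
iterator, and `codePointAt`, none of which existed when this thread was
written; the sample string and variable names are mine.

```javascript
// U+1D306 lies outside the BMP, so UTF-16 encodes it as a surrogate pair.
const sample = "a\u{1D306}b";

// Access by 16-bit code unit: this is what length and charCodeAt expose.
const codeUnits = [];
for (let i = 0; i < sample.length; i++) {
  codeUnits.push(sample.charCodeAt(i));
}

// Access by code point: the string iterator yields one element per
// code point, so the surrogate pair is seen as a single value.
const scalarValues = Array.from(sample, (ch) => ch.codePointAt(0));

console.log(codeUnits.map((u) => u.toString(16)));    // 4 entries
console.log(scalarValues.map((v) => v.toString(16))); // 3 entries
```

With code-unit access the supplementary character occupies two positions;
with scalar-value access it occupies one.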


On its own, I support such a transition. However, I believe it would be
unwise to introduce graphemes or grapheme clusters into this transition.

The motivation for making a transition can simply be stated as a desire to
easily support all Unicode abstract characters in a simple string construct
without having to deal with surrogate pairs.
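
As a rough illustration of what "dealing with surrogate pairs" means when
strings are indexed by code unit, the sketch below decodes them by hand. The
helper name `codePointsOf` is hypothetical, and passing unpaired surrogates
through unchanged is my own choice for the example, not something the
proposal specifies.

```javascript
// Hypothetical helper: walk a string by 16-bit code unit and combine
// each well-formed surrogate pair into a single code point.
function codePointsOf(str) {
  const out = [];
  for (let i = 0; i < str.length; i++) {
    const hi = str.charCodeAt(i);
    if (hi >= 0xd800 && hi <= 0xdbff && i + 1 < str.length) {
      const lo = str.charCodeAt(i + 1);
      if (lo >= 0xdc00 && lo <= 0xdfff) {
        // Standard UTF-16 decoding arithmetic for a high/low pair.
        out.push(0x10000 + ((hi - 0xd800) << 10) + (lo - 0xdc00));
        i++; // skip the low surrogate just consumed
        continue;
      }
    }
    // BMP code unit, or an unpaired surrogate passed through as-is
    // (an assumption made for this sketch).
    out.push(hi);
  }
  return out;
}

console.log(codePointsOf("a\u{1D306}b").map((v) => v.toString(16)));
```

If scalar values were the access units, callers would not need this kind of
bookkeeping.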

Of course, a secondary motivation is to simplify (the domain and range of)
text processing functions, but that should be a second-order consideration.
Grapheme clusters certainly have a role at the text processing layer, but I
would suggest that they are best left out of the characterization of this
possible transition.
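
To show why grapheme clusters sit at a different layer from both code units
and scalar values, here is a small sketch that counts all three for the same
string. It assumes an engine that provides `Intl.Segmenter`, which is a much
later addition and plays no part in this proposal; the locale "en" and the
sample text are arbitrary choices of mine.

```javascript
// "e" + U+0301 COMBINING ACUTE ACCENT, followed by U+1D306 (non-BMP).
const text = "e\u0301\u{1D306}";

// Encoded layer: 16-bit code units.
const codeUnitCount = text.length;

// Code point layer: the string iterator yields whole code points.
const scalarValueCount = Array.from(text).length;

// Text processing layer: user-perceived characters (grapheme clusters).
const segmenter = new Intl.Segmenter("en", { granularity: "grapheme" });
const graphemeCount = Array.from(segmenter.segment(text)).length;

console.log({ codeUnitCount, scalarValueCount, graphemeCount });
```

The three counts differ (4, 3, and 2 here). Moving from code units to scalar
values changes the first number into the second; grapheme segmentation is a
separate step layered on top of either.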

Received on Saturday, 25 February 2012 17:26:59 UTC
