Re: Changes to DOM3 Events Key Identifiers

From: Doug Schepers <schepers@w3.org>
Date: Fri, 30 Oct 2009 14:38:35 -0400
Message-ID: <4AEB32AB.5060400@w3.org>
To: Mark Davis ☕ <mark@macchiato.com>
CC: www-dom@w3.org, www-international@w3.org

Hi, Mark-

Mark Davis ☕ wrote (on 10/30/09 12:22 PM):
> I want to point out that Unicode code points can go up to hex 10FFFF.
> The standard for \u is exactly 4 digits, so that one can intermix with
> characters and know where it terminates. There are a couple of schemes
> that are used to extend this to up to 6 digits, and still know where to
> terminate.
>
> \UXXXXXXXX - C++, ICU
> \UXXXXXX - C#
> \u{xxxxxx} - Ruby
>
> There needs to be some mechanism for extending to 6 digits. It would be
> best to use one of the above rather than a new one. (My personal
> favorite is Ruby's.)

The "\u" escape sequence was chosen because it is the native ECMAScript
escape notation, which browser-based applications can use directly
(i.e. they can inject it into the markup as a character).
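
As a small illustration of the 4-digit form (a sketch only, not text from
the spec; the variable names are made up for this example), a "\u" escape
always consumes exactly four hex digits, and a code point above U+FFFF has
to be written as two escapes:

```javascript
// "\u" takes exactly four hex digits, so the escape ends unambiguously
// even when more characters follow it.
const viaEscape = "\u0041BC"; // U+0041 followed by the literal "BC"

// A code point above U+FFFF needs two escapes (a UTF-16 surrogate pair).
const gClef = "\uD834\uDD1E"; // U+1D11E MUSICAL SYMBOL G CLEF

// String length counts UTF-16 code units, not code points.
const gClefLength = gClef.length;

// codePointAt (ES2015 and later) reads the pair back as one code point.
const gClefCodePoint = gClef.codePointAt(0);
```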

But, yes, this does have the cap of 4 digits, and I personally would
prefer to use a different escape mechanism... but only if at least one
of these two conditions holds:

1) DOM3 Events implementations also update their JavaScript engines to
be able to process the additional escape sequence (e.g. one of the ones
you mention above) in the same way they process the "\u" escape
sequence.  This is the better long-term solution, and I'd hope ECMA TC39
could be persuaded to add this to future ECMAScript specs.

2) Script authors could use a normalizing method (cf. convertKeyValue)
to "dumb down" the 6-digit escape sequence into the 4-digit format (by
converting to surrogate pairs when necessary).
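
To make condition 2 concrete, here is a minimal sketch of such a
normalizing step. It assumes the Ruby-style \u{...} notation as input, and
the function name toFourDigitEscapes is hypothetical; it is not
convertKeyValue and not anything defined by DOM3 Events:

```javascript
// Hypothetical helper: rewrites Ruby-style \u{X...} escapes (1 to 6 hex
// digits) as 4-digit \uXXXX escapes, using a surrogate pair for any code
// point above U+FFFF. It works on the escape text, not on decoded strings.
function toFourDigitEscapes(text) {
  // Format a 16-bit value as exactly four uppercase hex digits.
  const hex4 = (n) => n.toString(16).toUpperCase().padStart(4, "0");

  return text.replace(/\\u\{([0-9A-Fa-f]{1,6})\}/g, (match, digits) => {
    const codePoint = parseInt(digits, 16);

    // Values beyond U+10FFFF are not Unicode code points: leave as-is.
    if (codePoint > 0x10ffff) {
      return match;
    }

    // Code points up to U+FFFF fit in a single 4-digit escape.
    if (codePoint <= 0xffff) {
      return "\\u" + hex4(codePoint);
    }

    // Supplementary code points: split the 20-bit offset into a high
    // (lead) and low (trail) surrogate, as UTF-16 does.
    const offset = codePoint - 0x10000;
    const high = 0xd800 + (offset >> 10);
    const low = 0xdc00 + (offset & 0x3ff);
    return "\\u" + hex4(high) + "\\u" + hex4(low);
  });
}
```

With this sketch, toFourDigitEscapes("\\u{1D11E}") yields the text
\uD834\uDD1E, and toFourDigitEscapes("\\u{41}") yields \u0041.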

JavaScript is becoming increasingly important, and so is the need for
internationalized and localized language support.  With the new
font-linking enablers (including my favorite, WOFF [1]), and i18n domain
extension policy [2], we're going to see more use of languages I have no
chance of ever understanding, and I want DOM3 Events and ECMAScript to
be part of that.  I'd rather not introduce a not-very-good solution
(UTF-16) that we know would not meet all the needs of the world
community, just because of a (temporary?) circumstance with a vagary of
JavaScript.

But I also want this spec interoperably implemented... so any solution
needs the buy-in of the implementers.  Arguments on either side would
help us make a more informed decision.

BTW, you stated a preference for the Ruby-style delimited escaped 
characters... could you say why you prefer that?

[1] http://people.mozilla.com/~jkew/woff/woff-2009-09-16.html
[2] http://www.icann.org/en/announcements/announcement-30oct09-en.htm

Regards-
-Doug Schepers
W3C Team Contact, SVG and WebApps WGs
Received on Friday, 30 October 2009 18:38:48 GMT