- From: Mark Davis <mark.davis@jtcsv.com>
- Date: Tue, 29 Jul 2003 08:03:05 -0700
- To: "Jungshik Shin" <jshin@i18nl10n.com>, "Joseph Reagle" <reagle@w3.org>
- Cc: "Martin Duerst" <duerst@w3.org>, <w3c-ietf-xmldsig@w3.org>, <w3c-i18n-ig@w3.org>, <schererm@us.ibm.com>
My understanding is that the ECMAScript Unicode improvements have been effectively put on hold after the loss of one of the leading contributors in the Netscape layoffs. I'm cc'ing Markus Scherer in case he can add anything on that topic.

Mark
__________________________________
http://www.macchiato.com
► “Eppur si muove” ◄

----- Original Message -----
From: "Jungshik Shin" <jshin@i18nl10n.com>
To: "Joseph Reagle" <reagle@w3.org>
Cc: "Martin Duerst" <duerst@w3.org>; <w3c-ietf-xmldsig@w3.org>; <w3c-i18n-ig@w3.org>
Sent: Tuesday, July 29, 2003 06:19
Subject: Re: Request for clarification on Canonical XML

> On Mon, 28 Jul 2003, Joseph Reagle wrote:
>
> > [[[
> > Note: Canonical XML is an octet sequence resulting from characters, from the
> > UCS character domain, encoded in UTF-8. Creating a deterministic octet
>
> Two successive 'from ...'s just linked by a comma are a bit confusing,
> aren't they?
>
> ...
> > some applications might want a canonical form of XML in a different
> > encoding, or one that is simply a sequence of characters, without concern
> > for its encoding. For example, it may be appropriate to choose UTF-16
> > rather than UTF-8 as the encoding of an API in a programming language using
> > UTF-16 to represent Unicode strings, such as Java or Python. Or, one might
> ....
>
> Python's use of UTF-16 (actually UCS-2) for the internal string
> representation appears to be going away. See
> http://mail.nl.linux.org/linux-utf8/2003-07/msg00113.html
> Even if it's not going away, Python doesn't seem to be
> a typical case to take as an example.
> See also http://www.egenix.com/files/python/Unicode-EPC2002-Talk.pdf
> (page 15 and page 27). When ECMAScript is updated to deal with UTF-16
> (as it is implemented and specified, its support of UTF-16 as opposed to
> UCS-2 is at most patchy) as planned, it might be a good example.
>
> On the other hand, as is well known, there are widely used/popular APIs
> that (exclusively) use UTF-16 (Win32 W APIs and ICU) that may be cited
> if Java alone is considered too 'lonely' :-)
>
> Jungshik
>
> P.S. I'm humbled and grateful that Martin and Tex welcomed me to the list.
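
For readers less familiar with the UCS-2 versus UTF-16 distinction raised in the quoted message, here is a minimal sketch. It is an illustration only, not part of the original exchange, and it assumes a modern Python 3 interpreter rather than the 2003-era Python under discussion. The character U+10400 and the helper `is_ucs2_representable` are chosen purely for illustration.

```python
# U+10400 lies outside the Basic Multilingual Plane, so it needs a
# surrogate pair in UTF-16 and cannot be represented in UCS-2 at all.
ch = "\U00010400"

# The same character as UTF-8 octets and as UTF-16 (big-endian) octets.
utf8 = ch.encode("utf-8")
utf16 = ch.encode("utf-16-be")

# Split the UTF-16 octets into 16-bit code units.
units = [int.from_bytes(utf16[i:i + 2], "big") for i in range(0, len(utf16), 2)]


def is_ucs2_representable(s):
    """Hypothetical helper: True if every code point fits in one 16-bit unit."""
    return all(ord(c) <= 0xFFFF for c in s)


print(utf8.hex())                  # four octets in UTF-8
print([hex(u) for u in units])     # two code units: a surrogate pair
print(is_ucs2_representable(ch))   # False
```

An API that treats strings as arrays of 16-bit units without interpreting surrogate pairs behaves as UCS-2 for such characters, which is the sense in which support for UTF-16 can be "patchy".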
Received on Tuesday, 29 July 2003 11:03:08 UTC