Re: Request for clarification on Canonical XML

My understanding is that the ECMAScript Unicode improvements have been
effectively put on hold after one of the leading contributors was lost
in the Netscape layoffs. I'm cc'ing Markus Scherer in case he can add
anything on that topic.

Mark
__________________________________
http://www.macchiato.com
►  “Eppur si muove” ◄

----- Original Message ----- 
From: "Jungshik Shin" <jshin@i18nl10n.com>
To: "Joseph Reagle" <reagle@w3.org>
Cc: "Martin Duerst" <duerst@w3.org>; <w3c-ietf-xmldsig@w3.org>;
<w3c-i18n-ig@w3.org>
Sent: Tuesday, July 29, 2003 06:19
Subject: Re: Request for clarification on Canonical XML


>
> On Mon, 28 Jul 2003, Joseph Reagle wrote:
>
> > [[[
> > Note: Canonical XML is an octet sequence resulting from characters, from the
> > UCS character domain, encoded in UTF-8. Creating a deterministic octet
>
>   Two successive 'from ...'s just linked by a comma are a bit confusing,
> aren't they?
>
> ...
> > some applications might want a canonical form of XML in a different
> > encoding, or one that is simply a sequence of characters, without concern
> > for its encoding. For example, it may be appropriate to choose UTF-16
> > rather than UTF-8 as the encoding of an API in a programming language using
> > UTF-16 to represent Unicode strings, such as Java or Python. Or, one might
> ....
>
>   Python's use of UTF-16 (actually UCS-2) for the internal string
> representation appears to be going away.  See
> http://mail.nl.linux.org/linux-utf8/2003-07/msg00113.html
> Even if it's not going away, Python doesn't seem to be
> a typical case to use as an example.
> See also http://www.egenix.com/files/python/Unicode-EPC2002-Talk.pdf
> (page 15 and page 27). When ECMAScript is updated to deal with UTF-16
> (as it is implemented and specified, its support of UTF-16 as opposed to
> UCS-2 is at most patchy) as planned, it might be a good example.
> On the other hand, as is well known, there are widely used/popular APIs
> that (exclusively) use UTF-16 (Win32 W APIs and ICU) that may be cited
> if Java alone is considered too 'lonely' :-)
>
>   Jungshik
>
> P.S. I'm humbled and grateful that Martin and Tex welcomed me to the list.
>
>
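
The UTF-16 versus UCS-2 distinction raised in the quoted message can be
illustrated with a minimal Python sketch. It is not taken from the thread:
the helper names utf16_code_units and fits_in_ucs2 are hypothetical, and it
assumes a Python 3 interpreter. A character outside the Basic Multilingual
Plane needs two UTF-16 code units (a surrogate pair), which a UCS-2-only
API cannot represent as a single character.

```python
def utf16_code_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-be")  # big-endian, no byte order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def fits_in_ucs2(text):
    """Return True if every character is in the Basic Multilingual Plane.

    Hypothetical helper: it only checks the code point range and does not
    treat lone surrogates specially.
    """
    return all(ord(ch) <= 0xFFFF for ch in text)


bmp_char = "\u00e9"          # U+00E9, inside the BMP
astral_char = "\U00010400"   # U+10400, outside the BMP

print([hex(u) for u in utf16_code_units(bmp_char)])     # ['0xe9']
print([hex(u) for u in utf16_code_units(astral_char)])  # ['0xd801', '0xdc00']
print(fits_in_ucs2(bmp_char), fits_in_ucs2(astral_char))  # True False
```

An API that counts or indexes by 16-bit unit sees U+10400 as two units,
which is the kind of patchy support the message describes.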

Received on Tuesday, 29 July 2003 11:03:08 UTC