RE: Character Encoding Question from John Boyer on 2000-11-30 (w3c-ietf-xmldsig@w3.org from October to December 2000)

From: John Boyer <jboyer@PureEdge.com>
Date: Wed, 29 Nov 2000 17:18:03 -0800
To: "Paul Hoffman / IMC" <phoffman@imc.org>, <w3c-ietf-xmldsig@w3.org>
Message-ID: <BFEDKCINEPLBDLODCODKMELHCGAA.jboyer@PureEdge.com>

Paul,

Actually, the sentence directly after the one you cited reads, and I
quote:

"The Unicode 16-bit encoding form is identical to the ISO/IEC 10646
transformation format UTF-16."

As to 'badly' misreading the UTF-8 spec, perhaps you could define how this
differs in your mind from simply misreading.  Your characterization seems a
bit harsh considering I've already said that I don't have any access to
UCS-2 documentation, so I am having to guess from the shrouded
half-statements in the documents that I do have.  The examples in Section 4
do in fact have triplets of UCS-2 characters that represent 'something', and
I have no way of really knowing whether each triplet is considered a single
defined sequence as far as UCS-2 is concerned, whether it represents the
characters of a three-character word, or whether two of the three 16-bit
values represent a single thing.

It would be more helpful, since you seem to know, to tell us whether or not
UCS-2 == Unicode, which is the single most important bit of information we
need.  If UCS-2 != Unicode, does UCS-2 have the same representational power
as UCS-4?  This would be the second most important bit of information we need.
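
The question of whether two 16-bit values can ever represent a single thing
can be probed with a short sketch.  It is an illustration only, not taken
from any of the documents under discussion, and it assumes the usual
definitions: UCS-2 as a fixed-width 16-bit form limited to code points up to
U+FFFF, and UTF-16 as the form that spends a pair of 16-bit surrogate values
on each code point above that.  The helper names utf16_units and
fits_in_ucs2 are hypothetical.

```python
def utf16_units(ch):
    """Return the 16-bit code units of the big-endian UTF-16 encoding of ch."""
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def fits_in_ucs2(ch):
    """Assumption: UCS-2 holds exactly one 16-bit value per character, so only
    non-surrogate code points up to U+FFFF are representable."""
    cp = ord(ch)
    return cp <= 0xFFFF and not (0xD800 <= cp <= 0xDFFF)


bmp_char = "\u65e5"           # a Han character at or below U+FFFF
beyond_bmp = "\U00010000"     # the first code point above U+FFFF

for ch in (bmp_char, beyond_bmp):
    units = [format(u, "04X") for u in utf16_units(ch)]
    print("U+%04X" % ord(ch), "UTF-16 units:", units, "fits in UCS-2:", fits_in_ucs2(ch))
```

Under those assumptions, a character at or below U+FFFF occupies one 16-bit
value in either form, while a character above U+FFFF needs two 16-bit values
in UTF-16 and has no representation at all in UCS-2.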

John Boyer
Team Leader, Software Development
Distributed Processing and XML
PureEdge Solutions Inc.
Creating Binding E-Commerce
v: 250-479-8334, ext. 143  f: 250-479-3772
1-888-517-2675   http://www.PureEdge.com

Received on Wednesday, 29 November 2000 20:18:15 UTC