
RE: character encoding

From: Taki Kamiya <tkamiya@us.fujitsu.com>
Date: Thu, 19 Jun 2008 16:19:42 -0700
To: "'Melanie Stallings'" <ms.protrain@yahoo.com>, <public-exi@w3.org>
Message-ID: <80308C97736142CE9932C4378588435A@catarojp>

Hi Melanie,

Sorry for the belated response.

Section "7.1.10 String" of the spec describes how UCS code points
are encoded in EXI format.
http://www.w3.org/TR/exi/#encodingString
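
A minimal sketch of what that section describes, offered as a reading of the
spec rather than as normative text: a string is written as its length in
characters, followed by each character's UCS code point, and every one of
those numbers uses the EXI Unsigned Integer form (7 value bits per octet,
least significant group first, high bit set on every octet except the last).
The function names below are hypothetical, and the sketch assumes
byte-aligned output and ignores string tables and restricted character sets.

```python
def encode_unsigned_integer(value: int) -> bytes:
    """Encode a non-negative integer as 7-bit groups, least significant first.

    The high bit of each octet is a continuation flag: 1 means more octets
    follow, 0 marks the final octet.
    """
    if value < 0:
        raise ValueError("Unsigned Integer cannot be negative")
    out = bytearray()
    while True:
        group = value & 0x7F      # low 7 bits
        value >>= 7
        if value:
            out.append(group | 0x80)  # more groups follow
        else:
            out.append(group)         # last group, high bit clear
            return bytes(out)


def encode_string(text: str) -> bytes:
    """Length-prefixed sequence of code points (assumed simplification:
    no string-table lookup, no length adjustment, byte-aligned)."""
    code_points = [ord(ch) for ch in text]  # Python str iterates by code point
    out = bytearray(encode_unsigned_integer(len(code_points)))
    for cp in code_points:
        out += encode_unsigned_integer(cp)
    return bytes(out)


if __name__ == "__main__":
    print(encode_string("A").hex())        # ASCII: same octet as in UTF-8
    print(encode_string("\u00e9").hex())   # U+00E9: differs from UTF-8 (c3 a9)
```

Under these assumptions the result is not UTF-8: characters below U+0080
produce the same single octet as in UTF-8, but anything above that is laid
out differently, so UTF-8 text would need to be decoded to code points first.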

Please take a look at the section and let me know if you have
any questions.

Hope it helps,

-taki




--------------------------------------------------------------------------------
From: public-exi-request@w3.org [mailto:public-exi-request@w3.org] On Behalf Of Melanie Stallings
Sent: Friday, May 23, 2008 12:37 PM
To: public-exi@w3.org
Subject: RE: character encoding
>
>
> Taki,
>
> Thank you for your response.  I still have some questions.
>
> About the Universal Character Set, Wikipedia says: "The Universal Character Set
> (UCS) is defined by the ISO/IEC 10646 International Standard as a character set
> on which many encodings are based."
>
> My question revolves around the "on which many encodings are based" portion of
> that sentence.  As far as I can tell, UCS does not contain any information on how
> each character is to be encoded.
>
> About ISO 10646, Wikipedia says: "ISO 10646 defines several character encoding
> forms for the Universal Character Set."  Some examples listed are UCS-2, UCS-4,
> UTF-8 and UTF-16.
>
> In your email you referred to the Unicode Character Database, but not to a specific
> encoding scheme.  About Unicode, Wikipedia says: "Unicode can be implemented by
> different character encodings.  The most commonly used encodings are UTF-8
> (which uses 1 byte for all ASCII characters, which have the same code values as
> in the standard ASCII encoding, and up to 4 bytes for other characters)".
>
> I have to laugh at myself.  Wikipedia is probably not the best source
> for my "education" on character sets and character encodings.
>
> I had my fingers crossed that your answer to my original question would be
> "Ah, yes.  Use the UTF-8 encoding."  That's what we use internally in our product.
> Oh how convenient that would be!
>
> I hope I am effectively communicating my question.  Please help set me straight.
>
> Sincerely,
>
> Melanie
>
Received on Thursday, 19 June 2008 23:20:26 GMT
