- From: Taki Kamiya <tkamiya@us.fujitsu.com>
- Date: Wed, 14 May 2008 15:52:22 -0700
- To: "'Melanie Stallings'" <ms.protrain@yahoo.com>, <public-exi@w3.org>
Hi Melanie,

UCS is the set of characters defined by ISO/IEC 10646, and the characters in UCS are the building blocks of the text in any XML document. UCS has room for more than a million characters (code points range from 0 to 0x10FFFF), and each character is given a serial number called a "code point" that uniquely identifies that character.

As a starter, you can visit the following page to see what's in UCS and which code points are assigned to which characters.

http://www.fileformat.info/info/unicode/block/index.htm

For a more formal description of the characters in UCS, please take a look at the Unicode Character Database, which you can find at:

http://www.unicode.org/Public/5.0.0/ucd/UCD.html

The database itself is at:

http://www.unicode.org/Public/5.0.0/ucd/UnicodeData.txt

We'll add a reference to ISO/IEC 10646 to describe UCS in the spec for the next publication. Thanks for asking this question.

Thanks!

-taki

________________________________
From: Melanie Stallings [mailto:ms.protrain@yahoo.com]
Sent: Tuesday, May 06, 2008 7:13 AM
To: public-exi@w3.org
Subject:

Dear EXI working group:

My question is about character encoding. Section 7.1.10 String states "each character is represented by its UCS code point encoded as an Unsigned Integer". Can you please be more specific about what you mean by UCS code point?

I'm currently encoding and decoding in my own little world (because I know what encoding scheme I'm using), but the intent is universal. I want to be sure others can decode my EXI output, and that I can decode EXI from other sources.

Links to documentation on how to encode / decode a UCS code point would be helpful and appreciated.

Sincerely,
Melanie Stallings
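
To make the sentence quoted from Section 7.1.10 concrete, here is a minimal Python sketch that is not part of the thread itself. It assumes the EXI Unsigned Integer layout of 7-bit groups, least significant group first, with the high bit set on every octet except the last. The function names are hypothetical, and the sketch covers only the per-character code points, not the length prefix or string-table handling that a full EXI string needs.

```python
def encode_unsigned_integer(value: int) -> bytes:
    """Encode a non-negative integer as 7-bit groups, least significant first."""
    if value < 0:
        raise ValueError("Unsigned Integer cannot be negative")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # more octets follow: set the high bit
        else:
            out.append(low7)         # last octet: high bit stays clear
            return bytes(out)


def decode_unsigned_integer(data: bytes, offset: int = 0) -> tuple[int, int]:
    """Decode one integer starting at offset; return (value, next offset)."""
    value = 0
    shift = 0
    while True:
        octet = data[offset]
        offset += 1
        value |= (octet & 0x7F) << shift
        shift += 7
        if not octet & 0x80:
            return value, offset


def encode_code_points(text: str) -> bytes:
    """Encode each character of text as its code point (ord) in turn."""
    return b"".join(encode_unsigned_integer(ord(ch)) for ch in text)


def decode_code_points(data: bytes, count: int) -> str:
    """Decode count code points from data back into a string."""
    chars = []
    offset = 0
    for _ in range(count):
        code_point, offset = decode_unsigned_integer(data, offset)
        chars.append(chr(code_point))
    return "".join(chars)
```

The point of the sketch is that the code point is independent of any byte encoding such as UTF-8 or UTF-16: "A" is code point 0x41 whichever encoding the producer used internally, so two implementations that agree on code points can exchange strings safely.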
Received on Wednesday, 14 May 2008 22:53:09 UTC