
Re: Re: Query on character sets

From: by way of <simonleung@hyperoffice.com>
Date: Tue, 26 Sep 2000 17:16:25 +0900
Message-Id: <4.2.0.58.J.20000926171616.0313fc90@sh.w3.mag.keio.ac.jp>
To: www-international@w3.org
Thanks indeed, Misha.

It's much clearer now.
So if I encode a page containing Big5 characters using Unicode,
then I should use Unicode to decode the page, and the page will be
converted to contain Unicode characters?

Thanks,
Simon

On Tue, 26 Sep 2000 at 09:43:46 PM, "Martin J. Duerst" wrote:

 > Hello Simon,
 >
 > I think you have to distinguish between subsets with respect to the
 > character repertoire and subsets with respect to the character encoding.
 >
 > As an example, the repertoire (set of characters) that can be represented
 > by Big5 is a subset of the repertoire of UTF-8. You can therefore convert
 > a file from Big5 to UTF-8 without losing any characters.
 >
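A minimal sketch in Python of the conversion described above, assuming Python's built-in `big5` codec matches the encoding of the file in question. The function name `big5_to_utf8` and the sample string are illustrative only and do not come from this thread.

```python
def big5_to_utf8(data: bytes) -> bytes:
    """Convert Big5-encoded bytes to UTF-8 by way of the decoded characters."""
    text = data.decode("big5")    # interpret the bytes with the encoding they were written in
    return text.encode("utf-8")   # write the same characters out as UTF-8


sample = "中文字元"                     # hypothetical page content
big5_bytes = sample.encode("big5")     # what a Big5 file would contain
utf8_bytes = big5_to_utf8(big5_bytes)

# The characters survive the round trip even though the bytes differ.
round_trip = utf8_bytes.decode("utf-8")
```

The repertoire is preserved (`round_trip` equals `sample`), while the byte sequences of the two encodings are different.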
 > On the other hand, the Big5 encoding is completely different from the
 > UTF-8 encoding, and if you try to decode a Big5 file as UTF-8, you
 > may see garbage, but you actually should get an error message.
 >
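The mismatch can be reproduced with a short Python sketch, assuming the built-in `big5` and `utf-8` codecs. The helper name `decode_as_utf8` and the sample text are illustrative and not taken from this thread. A strict decoder reports an error, while a lenient one substitutes replacement characters, which is the "garbage" a browser may show.

```python
def decode_as_utf8(data: bytes) -> str:
    """Strictly decode bytes as UTF-8; raises UnicodeDecodeError on invalid input."""
    return data.decode("utf-8")  # errors="strict" is the default


sample = "中文"                      # hypothetical page content
big5_bytes = sample.encode("big5")  # bytes as they would appear in a Big5 page

# Strict decoding: the Big5 bytes for this sample are not valid UTF-8.
try:
    decode_as_utf8(big5_bytes)
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# Lenient decoding: invalid sequences become U+FFFD replacement characters.
garbage = big5_bytes.decode("utf-8", errors="replace")
```

Not every Big5 byte sequence is guaranteed to be invalid UTF-8, so whether an error or garbage appears depends on the data and on how forgiving the decoder is.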
 > Hope this helps.    Regards,   Martin.
 >
 > At 00/09/25 14:00 +0000, Misha Wolf wrote:
 > >Please respond to the questions below, copying both the list and
 > >(simonleung@hyperoffice.com).
 > >
 > >Thanks,
 > >Misha
 > >
 > >[This mail was written using voice recognition software]
 > >
 > >
 > > > Dear sir,
 > > >
 > > > I've got a question regarding the page
 > > > 'http://www.unicode.org/iuc/iuc10/languages.html'.
 > > >
 > > > Please advise whether the following statements are right or not.
 > > >
 > > > 1. Every page coded in ASCII can be viewed by browsers using a different
 > > > encoding scheme, since ASCII is a subset of all character sets.
 > > >
 > > > 2. So, does it mean that a page encoded using encoding scheme 'A' can be
 > > > viewed properly by browsers using encoding scheme 'B' if 'A' is a subset
 > > > of 'B'?
 > > >
 > > >
 > > > However, when I use UTF-8 to decode a page which uses the charset 'BIG5',
 > > > I can only observe garbage.
 > > >
 > > > Many thanks,
 > > > Simon
 > >
 > >
 > >
 >
 >
Received on Tuesday, 26 September 2000 04:21:37 GMT
