- From: Martin J. Duerst <duerst@w3.org>
- Date: Tue, 26 Sep 2000 13:43:46 +0900
- To: Misha Wolf <misha.wolf@reuters.com>, www-international@w3.org, simonleung@hyperoffice.com
Hello Simon,

I think you have to distinguish between subsets with respect to the character repertoire and subsets with respect to the character encoding.

As an example, the repertoire (set of characters) that can be represented by Big5 is a subset of the repertoire of UTF-8. You can therefore convert a file from Big5 to UTF-8 without losing any characters.

On the other hand, the Big5 encoding is completely different from the UTF-8 encoding, and if you try to decode a Big5 file as UTF-8, you may see garbage, but you actually should get an error message.

Hope this helps.

Regards, Martin.

At 00/09/25 14:00 +0000, Misha Wolf wrote:
>Please respond to the questions below, copying both the list and
>(simonleung@hyperoffice.com).
>
>Thanks,
>Misha
>
>[This mail was written using voice recognition software]
>
> > Dear sir,
> >
> > I've got a question regarding the page
> > 'http://www.unicode.org/iuc/iuc10/languages.html'.
> >
> > Please advise if the following statement are right or not.
> >
> > 1. Every page coded in ASCII can be viewed by the browers using different
> > encoding scheme since ASCII is the subset of all the character set.
> >
> > 2. So, does it mean that if page encoded using encoding scheme 'A' can be viewed
> > properly by the browers using encoding scheme 'B' if 'A' is a subset of 'B' ?
> >
> > However, when i use UTF-8 to decode the page which use the charset 'BIG5', i can
> > only observe the garbage.
> >
> > Many thanks,
> > Simon
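
The distinction between repertoire and encoding drawn in the reply above can be sketched in a few lines of Python. This is only an illustration, not part of the original exchange: the sample string and the variable names are assumptions chosen for the example, and it relies on Python's built-in `big5` and `utf-8` codecs.

```python
# Illustrative sample text (an assumption for this sketch): two common
# Traditional Chinese characters that exist in both Big5 and Unicode.
text = "中文"

# The same characters produce different byte sequences in each encoding.
big5_bytes = text.encode("big5")
utf8_bytes = text.encode("utf-8")

# Repertoire subset: converting Big5 -> UTF-8 goes through the characters,
# so nothing is lost on the round trip.
converted = big5_bytes.decode("big5").encode("utf-8")
round_trip_ok = converted.decode("utf-8") == text

# Encoding mismatch: interpreting Big5 bytes as UTF-8 is not a conversion.
# A strict decoder reports an error instead of producing text.
try:
    big5_bytes.decode("utf-8")
    strict_decode_failed = False
except UnicodeDecodeError:
    strict_decode_failed = True

# A lenient decoder substitutes U+FFFD replacement characters, which is
# one form of the "garbage" a browser may display.
garbage = big5_bytes.decode("utf-8", errors="replace")

# Pure ASCII bytes are the special case: both encodings represent them
# with the same byte values, so either decoder reads them correctly.
ascii_bytes = b"Hello"
ascii_same = ascii_bytes.decode("big5") == ascii_bytes.decode("utf-8")
```

The strict decode corresponds to the error message a conforming decoder should give, while the `errors="replace"` call mimics a browser that displays substitute characters instead of stopping.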
Received on Tuesday, 26 September 2000 00:56:26 UTC