
Re: Re: Query on character sets

From: Martin J. Duerst <duerst@w3.org>
Date: Tue, 26 Sep 2000 18:46:27 +0900
Message-Id: <4.2.0.58.J.20000926184542.00d1ad90@sh.w3.mag.keio.ac.jp>
To: Simon Leung <simonleung@hyperoffice.com> (by way of "Martin J. Duerst" <duerst@w3.org>), www-international@w3.org
Simon,

I'm not sure I understand you correctly. Can you point to an example
of your pages somewhere on the web? That would make sure we are
talking about the same thing.

Regards,   Martin.

At 00/09/26 17:16 +0900, Simon  Leung wrote:
>It's much clearer now.
>So if I encode a page containing Big5 characters using Unicode,
>then I should use Unicode to decode the page, and the page will be
>converted to contain Unicode characters?
>
>Thanks,
>Simon
>
>On Tue, 26 Sep 2000 at 09:43:46 PM, "Martin J. Duerst" wrote:
>
> > Hello Simon,
> >
> > I think you have to distinguish between subsets with respect to the
> > character repertoire and subsets with respect to the character encoding.
> >
> > As an example, the repertoire (set of characters) that can be represented
> > by Big5 is a subset of the repertoire of UTF-8. You can therefore convert
> > a file from Big5 to UTF-8 without losing any characters.
> >
> > On the other hand, the Big5 encoding is completely different from the
> > UTF-8 encoding, and if you try to decode a Big5 file as UTF-8, you
> > may see garbage, but you actually should get an error message.
> >
> > Hope this helps.    Regards,   Martin.
> >
> > At 00/09/25 14:00 +0000, Misha Wolf wrote:
> > >Please respond to the questions below, copying both the list and
> > >(simonleung@hyperoffice.com).
> > >
> > >Thanks,
> > >Misha
> > >
> > >[This mail was written using voice recognition software]
> > >
> > >
> > > > Dear sir,
> > > >
> > > > I've got a question regarding the page
> > > > 'http://www.unicode.org/iuc/iuc10/languages.html'.
> > > >
> > > > Please advise whether the following statements are right or not.
> > > >
> > > > 1. Every page coded in ASCII can be viewed by browsers using a different
> > > > encoding scheme, since ASCII is a subset of all the character sets.
> > > >
> > > > 2. So, does it mean that a page encoded using encoding scheme 'A' can be
> > > > viewed properly by browsers using encoding scheme 'B' if 'A' is a subset
> > > > of 'B'?
> > > >
> > > >
> > > > However, when I use UTF-8 to decode the page which uses the charset
> > > > 'BIG5', I can only observe garbage.
> > > >
> > > > Many thanks,
> > > > Simon
> > >
> > >
> > >
> >
> >
>
>
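The distinction drawn in the thread between character repertoire and character encoding can be illustrated with a minimal Python sketch. The sample strings and the helper name `decodes_as_utf8` are assumptions made for illustration and do not come from the original messages. The sketch shows that Big5 text converts to UTF-8 without loss, that raw Big5 bytes are rejected when decoded as UTF-8, and that pure ASCII bytes decode identically under several ASCII-compatible encodings.

```python
# Illustrative sketch only; the sample text is an assumption, not from the thread.
text = "中文"  # two Chinese characters that are in the Big5 repertoire

big5_bytes = text.encode("big5")
utf8_bytes = text.encode("utf-8")

# Repertoire: every Big5 character also exists in Unicode, so converting
# Big5 -> Unicode -> UTF-8 loses nothing and round-trips cleanly.
converted = big5_bytes.decode("big5").encode("utf-8")
assert converted == utf8_bytes
assert converted.decode("utf-8") == text

# Encoding: the byte sequences themselves are completely different.
assert big5_bytes != utf8_bytes


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if the bytes form valid UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Decoding Big5 bytes as UTF-8 fails with an error rather than silently
# producing the intended characters.
big5_is_valid_utf8 = decodes_as_utf8(big5_bytes)

# Pure ASCII bytes, by contrast, decode to the same string under these
# ASCII-compatible encodings.
ascii_bytes = "Hello".encode("ascii")
ascii_results = {enc: ascii_bytes.decode(enc) for enc in ("utf-8", "big5", "latin-1")}
```

A browser that shows garbage instead of reporting an error is typically using a lenient decoder; Python's strict default makes the mismatch explicit.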
Received on Tuesday, 26 September 2000 06:02:41 GMT
