Re: For review: Character encodings for beginners

From: Martin Duerst <duerst@it.aoyama.ac.jp>
Date: Tue, 11 Dec 2007 18:00:00 +0900
To: Douglas Bagnall <douglas@paradise.net.nz>, www-international@w3.org

At 08:29 07/12/07, Douglas Bagnall wrote:
>Frank Ellermann wrote:

>> You'd need a definition of the shorthand "charset" first,
>That could be at
>- Characters are grouped into a *character* *set* (also called a
>- *repertoire*),
>+ Characters are grouped into a *character* *set* (also called a
>+ *repertoire* or *charset*),

No, sorry, that's wrong: a "charset" includes the encoding mechanism
down to the bit/byte level, so it is not just another name for a
character set or repertoire.
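
To make the distinction concrete, here is a minimal sketch in Python.
The sample character and the three charset labels are illustrative
assumptions, not taken from the draft under review. It shows one
character from the repertoire turning into different byte sequences
depending on which charset is applied:

```python
# Illustrative only: the sample character and the charset labels below
# are assumptions chosen for the example, not part of the reviewed text.
ch = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE, a single abstract character

# Encode the same character under three different charsets.
encoded = {name: ch.encode(name) for name in ("iso-8859-1", "utf-8", "utf-16-be")}

# Print the resulting bytes in hex so the byte-level difference is visible.
for name, data in encoded.items():
    print(f"{name:12} {data.hex(' ')}")
```

The character is the same in all three cases; only the bytes differ,
and that byte-level mapping is what a "charset" label names.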

>> | Most Web pages use the UTF-8 encoding for Unicode text.
>> Are you sure about "most Web pages" (as of today) ?
>This evoked a double take from me, too.  I had to re-read to see that
>"for Unicode text" was making a much smaller claim than I first thought.
>In the sense in which it is meant, however (UTF-8 is more common than
>UTF-[7,16,32] variants), it seems very likely true.

I had a similar reaction. I'm sure that we can tweak the wording
so that the narrower claim is clear on first reading.

Regards,    Martin.

#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp     
Received on Tuesday, 11 December 2007 09:01:04 UTC
