Re: For review: Character encodings for beginners

From: Martin Duerst <duerst@it.aoyama.ac.jp>
Date: Tue, 11 Dec 2007 18:00:00 +0900
Message-Id: <6.0.0.20.2.20071211174425.0a12b3f0@localhost>
To: Douglas Bagnall <douglas@paradise.net.nz>, www-international@w3.org

At 08:29 07/12/07, Douglas Bagnall wrote:
>
>Frank Ellermann wrote:

>> You'd need a definition of the shorthand "charset" first,
>
>That could be at
>
>- Characters are grouped into a *character* *set* (also called a
>- *repertoire*),
>
>+ Characters are grouped into a *character* *set* (also called a
>+ *repertoire* or *charset*),

No, sorry, that is wrong: a "charset" is more than a repertoire.
It includes the coding mechanism down to the bit/byte level.
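
A minimal sketch of the difference, in Python. The sample character and
the three charset labels are assumptions picked only for illustration;
they are not taken from the draft under review. The character is in the
repertoire of all three, yet each charset maps it to different bytes:

```python
# One character, U+00E9 (LATIN SMALL LETTER E WITH ACUTE).
# It belongs to the repertoire of every charset listed below.
text = "\u00e9"

# Charset labels chosen for illustration only.
charsets = ("iso-8859-1", "utf-8", "utf-16-be")

# A charset fixes the byte-level representation, so the bytes differ
# even though the character itself is the same.
encoded = {name: text.encode(name) for name in charsets}

for name, data in encoded.items():
    print(f"{name:>10}: {data.hex(' ')}")
```

The three lines printed show three different byte sequences for the
same character, which is why "charset" cannot be used as a synonym
for "repertoire".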

>> | Most Web pages use the UTF-8 encoding for Unicode text.
>[...] 
>> Are you sure about "most Web pages" (as of today) ?
>
>This evoked a double take from me, too.  I had to re-read to see that
>"for Unicode text" was making a much smaller claim than I first thought.
>In the sense in which it is meant, however (UTF-8 is more common than
>UTF-[7,16,32] variants), it seems very likely true.

I had a somewhat similar reaction. I'm sure that we can tweak the
wording so that it's easier to read.

Regards,    Martin.


#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp     
Received on Tuesday, 11 December 2007 09:01:04 GMT
