
Re: For review: Character encodings for beginners

From: Douglas Bagnall <douglas@paradise.net.nz>
Date: Fri, 07 Dec 2007 12:29:47 +1300
To: www-international@w3.org
Message-id: <475885EB.30201@paradise.net.nz>

Frank Ellermann wrote:

> ... "represent é, щ, other characters, or no character at
> all depending on the charset".
> You'd need a definition of the shorthand "charset" first,

That definition could go here:

- Characters are grouped into a *character* *set* (also called a
- *repertoire*),

+ Characters are grouped into a *character* *set* (also called a
+ *repertoire* or *charset*),
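
To illustrate the quoted sentence, here is a minimal Python sketch of one byte value read under three encodings. The byte 0xE9 and the helper name decode_or_none are assumptions chosen for this example; they do not come from the draft under review.

```python
# One byte, three interpretations: which character it represents
# (if any) depends on the encoding used to read it.
raw = bytes([0xE9])


def decode_or_none(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding.

    Hypothetical helper, introduced only for this illustration.
    """
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


results = {
    name: decode_or_none(raw, name)
    for name in ("iso-8859-1", "iso-8859-5", "utf-8")
}

for name, text in results.items():
    print(name, repr(text))
```

Under ISO-8859-1 the byte is read as é, under ISO-8859-5 as щ, and under UTF-8 a lone 0xE9 is not a complete sequence, so it represents no character at all.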

> | Most Web pages use the UTF-8 encoding for Unicode text.
> Are you sure about "most Web pages" (as of today) ?

This evoked a double take from me, too.  I had to re-read it to see that
"for Unicode text" makes a much smaller claim than I first thought.
In the sense intended, however (UTF-8 is more common than the UTF-7,
UTF-16, and UTF-32 variants), it seems very likely to be true.

Received on Saturday, 8 December 2007 18:48:52 UTC
