
Re: For review: Character encodings for beginners

From: Douglas Bagnall <douglas@paradise.net.nz>
Date: Fri, 07 Dec 2007 12:29:47 +1300
To: www-international@w3.org
Message-id: <475885EB.30201@paradise.net.nz>

Frank Ellermann wrote:

> ... "represent é, щ, other characters, or no character at
> all depending on the charset".
> 
> You'd need a definition of the shorthand "charset" first,

That could be added where the term is first introduced:

- Characters are grouped into a *character* *set* (also called a
- *repertoire*),

+ Characters are grouped into a *character* *set* (also called a
+ *repertoire* or *charset*),
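
To make the quoted sentence concrete, here is a minimal sketch of one byte being read under several encodings. The byte value 0xE9 and the helper name `decode_or_none` are my own choices for illustration, not something taken from the article under review.

```python
def decode_or_none(raw: bytes, encoding: str):
    """Return the decoded text, or None if the bytes are invalid in that encoding.

    Hypothetical helper, named here only for illustration.
    """
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# A single byte, chosen for illustration.
raw = bytes([0xE9])

as_latin1 = decode_or_none(raw, "iso-8859-1")    # Latin-1: e with acute accent
as_cyrillic = decode_or_none(raw, "iso-8859-5")  # Latin/Cyrillic: shcha
as_ascii = decode_or_none(raw, "ascii")          # outside the 7-bit range
as_utf8 = decode_or_none(raw, "utf-8")           # lead byte with no continuation bytes

for name, value in [
    ("iso-8859-1", as_latin1),
    ("iso-8859-5", as_cyrillic),
    ("ascii", as_ascii),
    ("utf-8", as_utf8),
]:
    print(name, repr(value))
```

The same byte yields é, щ, or no character at all, which is the behaviour the sentence describes once "charset" has been defined.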

> | Most Web pages use the UTF-8 encoding for Unicode text.
[...] 
> Are you sure about "most Web pages" (as of today) ?

This made me do a double take, too.  I had to re-read it to see that
"for Unicode text" makes a much smaller claim than I first thought.
In the sense in which it is meant, however (UTF-8 is more common than
the UTF-7, UTF-16, and UTF-32 variants), it seems very likely to be true.


Douglas
Received on Saturday, 8 December 2007 18:48:52 GMT
