Re: For review: Character encodings for beginners

Frank Ellermann wrote:

> ... "represent é, щ, other characters, or no character at
> all depending on the charset".
> 
> You'd need a definition of the shorthand "charset" first,

That definition could go where character sets are first introduced:

- Characters are grouped into a *character* *set* (also called a
- *repertoire*),

+ Characters are grouped into a *character* *set* (also called a
+ *repertoire* or *charset*),
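
For readers who want to see the quoted sentence in action, here is a
minimal Python sketch. It assumes the byte in question has the value
0xE9 (the quoted text does not say), and the helper name decode_or_none
is made up for illustration. The same byte decodes to é in ISO-8859-1,
to щ in ISO-8859-5, and to no character at all in US-ASCII or as a lone
byte in UTF-8.

```python
# One byte with value 0xE9 (233). The value is an assumption chosen
# for illustration; the quoted draft does not name a specific byte.
raw = bytes([0xE9])


def decode_or_none(data, charset):
    """Return the decoded text, or None if the bytes are invalid in that charset."""
    try:
        return data.decode(charset)
    except UnicodeDecodeError:
        return None


# The same byte interpreted under four different charsets.
for charset in ("iso-8859-1", "iso-8859-5", "ascii", "utf-8"):
    print(charset, repr(decode_or_none(raw, charset)))
```

The last two lines print None because 0xE9 is outside the 7-bit ASCII
range and, in UTF-8, is only the first byte of an incomplete multi-byte
sequence.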

> | Most Web pages use the UTF-8 encoding for Unicode text.
[...] 
> Are you sure about "most Web pages" (as of today) ?

This made me do a double take, too.  I had to re-read it to see that
"for Unicode text" makes a much narrower claim than I first thought.
In the sense in which it is meant, however (UTF-8 is more common than
the UTF-7, UTF-16, and UTF-32 variants), it seems very likely true.


Douglas

Received on Saturday, 8 December 2007 18:48:52 UTC