Re: For review: Character encodings for beginners

| For example, in the character set called ISO 8859-1 (also known as
Latin1) […]

Isn’t ISO 8859-1 a character encoding rather than a character set? IMHO
these two terms should be clearly distinguished; the wrong usage leads
to many misunderstandings.

My suggestion: delete “ISO 8859-1”:

For example, in the character set called Latin1 […]


| The Devanagari character क, with codepoint 2325, can be represented by
two bytes (09 15), three bytes (E0 A4 95), or four bytes (00 00 09 15), […]

Suggestion: also give the codepoint as a hexadecimal number; that would
make the UTF-16 and UTF-32 encodings easier to understand:

The Devanagari character क, with codepoint 2325 (hexadecimal 915,
usually referred to as U+0915), can be represented by two bytes (09 15),
three bytes (E0 A4 95), or four bytes (00 00 09 15), […]
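
For anyone who wants to check these byte sequences, here is a minimal
Python sketch. The helper name hex_bytes is my own invention, and I am
assuming big-endian byte order without a byte-order mark (hence the
"-be" codec variants); it is only an illustration, not part of the
proposed text:

```python
# Show how one codepoint (U+0915) maps to different byte sequences.
ch = "\u0915"  # DEVANAGARI LETTER KA


def hex_bytes(text, encoding):
    """Return the encoded bytes of `text` as space-separated hex pairs."""
    return " ".join(f"{b:02X}" for b in text.encode(encoding))


codepoint = ord(ch)
print(codepoint, f"U+{codepoint:04X}")  # 2325 U+0915

# The "-be" codecs emit big-endian bytes and no byte-order mark.
print(hex_bytes(ch, "utf-16-be"))  # 09 15
print(hex_bytes(ch, "utf-8"))      # E0 A4 95
print(hex_bytes(ch, "utf-32-be"))  # 00 00 09 15
```

With the hexadecimal codepoint in view, it is easy to see that the
UTF-16 and UTF-32 forms are just 0915 padded to two and four bytes,
while UTF-8 spreads the same bits over three bytes.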

My .02 €,
Gunnar

Received on Friday, 25 January 2008 01:58:48 UTC