Re: For review: Character encodings for beginners from Gunnar Bittersmann on 2008-01-25 (www-international@w3.org from January to March 2008)

From: Gunnar Bittersmann <gunnar.bittersmann@web.de>
Date: Fri, 25 Jan 2008 02:56:38 +0100
To: www-international@w3.org
Message-ID: <479941D6.8060401@web.de>

| For example, in the character set called ISO 8859-1 (also known as
| Latin1) […]

Isn’t ISO 8859-1 a character encoding rather than a character set? IMHO
these two terms should be clearly distinguished; mixing them up leads
to many misunderstandings.

My suggestion: delete “ISO 8859-1”:

For example, in the character set called Latin1 […]
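
A minimal Python sketch of the distinction, assuming the sample character é and the variable names below (chosen here only for illustration; they are not from the draft under review): the character has a single code point, while the bytes that represent it depend on the encoding applied.

```python
# One character, one code point, but different bytes per encoding.
ch = "é"  # sample character, chosen only for illustration

code_point = ord(ch)                    # position of the character in the character set
latin1_bytes = ch.encode("iso-8859-1")  # one byte under ISO 8859-1
utf8_bytes = ch.encode("utf-8")         # two bytes under UTF-8

print(f"code point: {code_point} (U+{code_point:04X})")
print("ISO 8859-1 bytes:", " ".join(f"{b:02X}" for b in latin1_bytes))
print("UTF-8 bytes:     ", " ".join(f"{b:02X}" for b in utf8_bytes))
```

The code point is the same in both cases; only the byte sequence changes with the encoding.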


| The Devanagari character क, with codepoint 2325, can be represented by
| two bytes (09 15), three bytes (E0 A4 95), or four bytes (00 00 09 15), […]

Suggestion: also give the codepoint as a hexadecimal number; that would
make the UTF-16 and UTF-32 encodings easier to understand:

The Devanagari character क, with codepoint 2325 (hexadecimal 915,
usually referred to as U+0915), can be represented by two bytes (09 15),
three bytes (E0 A4 95), or four bytes (00 00 09 15), […]
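
The byte sequences quoted above can be checked with a short Python sketch. It assumes big-endian byte order without a byte order mark for UTF-16 and UTF-32, which is what the quoted bytes imply; the helper name `hex_bytes` is hypothetical and used only for illustration.

```python
# Encode U+0915 (DEVANAGARI LETTER KA) in three Unicode encodings.
ch = "\u0915"

def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)

code_point = ord(ch)            # 2325 decimal, 915 hexadecimal
utf16 = ch.encode("utf-16-be")  # two bytes: the code point itself
utf8 = ch.encode("utf-8")       # three bytes
utf32 = ch.encode("utf-32-be")  # four bytes: the code point, zero-padded

print(f"code point: {code_point} = U+{code_point:04X}")
print("UTF-16:", hex_bytes(utf16))
print("UTF-8: ", hex_bytes(utf8))
print("UTF-32:", hex_bytes(utf32))
```

With the hexadecimal value in view, it is easy to see that the UTF-16 and UTF-32 bytes are just the code point 0915 written out in two and four bytes, whereas the UTF-8 bytes follow a different scheme.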

My .02 €,
Gunnar

Received on Friday, 25 January 2008 01:58:48 UTC