Re: For review: Character encodings for beginners

From: Gunnar Bittersmann <gunnar.bittersmann@web.de>
Date: Fri, 25 Jan 2008 02:56:38 +0100
Message-ID: <479941D6.8060401@web.de>
To: www-international@w3.org

| For example, in the character set called ISO 8859-1 (also known as
Latin1) […]

Isn’t ISO 8859-1 a character encoding rather than a character set? IMHO
these two terms should be clearly distinguished; using them
interchangeably leads to many misunderstandings.

My suggestion: delete “ISO 8859-1”:

For example, in the character set called Latin1 […]
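
A minimal Python sketch of the distinction (the character é and the
variable names are arbitrary choices for illustration, not taken from the
draft): the character set assigns the character a number, while an
encoding decides which bytes represent that number.

```python
# Illustrative assumption: "é" is just a convenient character that exists
# in Latin1; it does not appear in the draft under review.
ch = "é"

# Character set view: the character has a number (its code point).
code_point = ord(ch)  # 233, i.e. U+00E9

# Encoding view: the same character becomes different byte sequences
# depending on which encoding is used.
latin1_bytes = ch.encode("iso-8859-1")  # one byte
utf8_bytes = ch.encode("utf-8")         # two bytes

print(f"U+{code_point:04X}", latin1_bytes.hex(), utf8_bytes.hex())
```

The code point stays the same in both cases; only the bytes differ.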


| The Devanagari character क, with codepoint 2325, can be represented by
two bytes (09 15), three bytes (E0 A4 95), or four bytes (00 00 09 15), […]

Suggestion: also give the codepoint as a hexadecimal number; that would
make the UTF-16 and UTF-32 encodings easier to understand:

The Devanagari character क, with codepoint 2325 (hexadecimal 915,
usually referred to as U+0915), can be represented by two bytes (09 15),
three bytes (E0 A4 95), or four bytes (00 00 09 15), […]
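
The byte values quoted above can be checked with a short Python sketch.
It assumes big-endian byte order without a byte order mark for UTF-16 and
UTF-32, which is what the byte sequences in the draft imply; `hex_bytes`
is a helper name made up for this example.

```python
ch = "\u0915"  # Devanagari letter KA

code_point = ord(ch)  # decimal 2325, hexadecimal 915


def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


# Big-endian variants are chosen so that no byte order mark is emitted.
utf16 = hex_bytes(ch.encode("utf-16-be"))
utf8 = hex_bytes(ch.encode("utf-8"))
utf32 = hex_bytes(ch.encode("utf-32-be"))

print(f"U+{code_point:04X}: UTF-16 {utf16} | UTF-8 {utf8} | UTF-32 {utf32}")
```

With the hexadecimal form U+0915 in view, the UTF-16 and UTF-32 bytes are
simply the code point padded to two and four bytes; only UTF-8 rearranges
the bits.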

My .02 €,
Gunnar
Received on Friday, 25 January 2008 01:58:48 GMT
