RE: ISO-8859-1 from Yves Arrouye on 2001-10-01 (www-international@w3.org from October to December 2001)

From: Yves Arrouye <yves@realnames.com>
Date: Mon, 1 Oct 2001 15:18:19 -0700
To: www-international@w3.org
Message-ID: <7FC3066C236FD511BC5900508BAC86FE1E6258@trestles.internal.realnames.com>

In HTML and XML, character encoding forms and character set (= Unicode) are
decoupled.
As a result, in any character encoding form, it is always possible to access
the whole range of Unicode characters.
For instance, with the iso-8859-1 encoding form, I can encode any Unicode
character by using a numeric character reference (NCR) such as &#xFF7D; for
the Japanese halfwidth katakana character ｽ.
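
As a rough illustration of the point above, here is a minimal Python sketch. The helper names are made up for this example, and the standard library's "xmlcharrefreplace" error handler stands in for whatever an authoring tool would do. It writes text as ISO-8859-1 bytes, falling back to decimal NCRs for anything outside that repertoire, and then resolves the NCRs again the way an HTML parser would.

```python
import html


def to_latin1_with_ncrs(text: str) -> bytes:
    """Encode text as ISO-8859-1; characters outside it become decimal NCRs."""
    return text.encode("iso-8859-1", errors="xmlcharrefreplace")


def from_latin1_with_ncrs(data: bytes) -> str:
    """Decode ISO-8859-1 bytes and resolve character references.

    Note: html.unescape also resolves named references such as &amp;,
    so this is only a faithful inverse for text without a literal "&".
    """
    return html.unescape(data.decode("iso-8859-1"))


# U+00E9 fits in ISO-8859-1; U+FF7D (halfwidth katakana SU) does not.
sample = "caf\u00e9 \uff7d"
encoded = to_latin1_with_ncrs(sample)
decoded = from_latin1_with_ncrs(encoded)
print(encoded)
print(decoded == sample)
```

The hexadecimal form &#xFF7D; and the decimal form emitted here refer to the same code point, so a parser treats them identically.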
[YA] Whilst every browser accepts this in HTML, converting non-transcodable
characters in form *submissions* to NCRs is an Internet Explorer
peculiarity. For instance, if I type Lad\u00E9d\u00E9 (where \u00E9
represents a lowercase “e” with acute accent) in a form expecting
Shift_JIS, IE will send Lad&#233;d&#233;, whilst Netscape may send, for
example, Lad?d?, where each “?” is the result of trying to convert \u00E9
to Shift_JIS.
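
The two outcomes described above can be imitated with Python's codec error handlers. This is only a sketch of the observable results, not of either browser's actual code: the function names are hypothetical, and it assumes Python's "shift_jis" codec, which has no mapping for U+00E9.

```python
def submit_with_ncr_fallback(text: str, charset: str = "shift_jis") -> bytes:
    """Imitate the IE-style result: unencodable characters become decimal NCRs."""
    return text.encode(charset, errors="xmlcharrefreplace")


def submit_with_replacement(text: str, charset: str = "shift_jis") -> bytes:
    """Imitate the lossy result: unencodable characters become '?'."""
    return text.encode(charset, errors="replace")


typed = "Lad\u00e9d\u00e9"
print(submit_with_ncr_fallback(typed))  # b'Lad&#233;d&#233;'
print(submit_with_replacement(typed))   # b'Lad?d?'
```

Note that the receiving server cannot tell an NCR produced this way from a user who literally typed "&#233;" into the field, and the "?" substitution discards the original character altogether.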
YA

Received on Monday, 1 October 2001 18:22:29 UTC