- From: Yves Arrouye <yves@realnames.com>
- Date: Mon, 22 Oct 2001 00:11:19 -0700
- To: "'Shigemichi Yazawa'" <yazawa@globalsight.com>, www-international@w3.org
> Yes, two wrong conversions make a right result. However, Cp1252
> doesn't always work this way. Cp1252 <-> Unicode mapping table
> includes 5 undefined entries. If you pass 0x81, for example, to a byte
> to char converter, it is converted to U+fffd (REPLACEMENT CHARACTER)
> and the round trip is not possible. Only ISO-8859-1 is the safe, round
> trippable encoding as far as I know.

Isn't ISO-8859-1 actually the one that has "holes" in C0/C1 that exhibit this very behavior? I thought that was the case, and that windows-1252 was the one that used C1 for platform-specific characters (see http://www-124.ibm.com/cvs/icu/charset/data/xml/windows-1252-2000.xml?rev=1.1&content-type=text/x-cvsweb-markup where apparently U+0081 is mapped to 0x81 in windows-1252).

YA
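One way to check the round-trip question empirically is to push every byte value through a decode/encode cycle and see which ones come back changed. The sketch below does that with Python's built-in `iso-8859-1` and `cp1252` codecs, which is an assumption on my part: the converters discussed in this thread (Java's Cp1252 and the ICU windows-1252 table) are different implementations and may map the unassigned bytes differently, as the ICU table linked above apparently does for 0x81. The helper name `lossy_bytes` is made up for illustration.

```python
def lossy_bytes(encoding):
    """Return the byte values that do not survive bytes -> str -> bytes."""
    lost = []
    for value in range(256):
        raw = bytes([value])
        # Undefined bytes decode to U+FFFD (REPLACEMENT CHARACTER)
        # instead of raising, mirroring the behavior described in the quote.
        text = raw.decode(encoding, errors="replace")
        # U+FFFD has no encoding in either charset, so it comes back as "?".
        if text.encode(encoding, errors="replace") != raw:
            lost.append(value)
    return lost


latin1_lost = lossy_bytes("iso-8859-1")
cp1252_lost = lossy_bytes("cp1252")

print([hex(b) for b in latin1_lost])   # []
print([hex(b) for b in cp1252_lost])   # ['0x81', '0x8d', '0x8f', '0x90', '0x9d']
```

With these particular codecs, ISO-8859-1 maps all 256 byte values one-to-one onto U+0000..U+00FF (C0 and C1 controls included), while cp1252 leaves five bytes undefined. A converter built from a table that assigns 0x81 to U+0081 would round-trip that byte, so the answer depends on which mapping table the converter uses.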
Received on Monday, 22 October 2001 03:15:37 UTC