- From: Øistein E. Andersen <html5@xn--istein-9xa.com>
- Date: Tue, 05 Jun 2007 16:11:35 +0200
On Jun 5, 2007, at 11:38, Kristof Zelechovski wrote:
> And why not:?
> 2c) If the declared encoding was ISO-8859-2, replace that
> character with the [corresponding] character [... from] Windows-1250.

On Jun 5, 2007, at 11:51, Henri Sivonen wrote:
> that's not what [browsers] do, so apparently it is not
> required for compatibility

A more fundamental reason is that the two encodings are incompatible.

Amongst the nine Windows-125* encodings, eight have ISO-8859-* counterparts, of which four are subsets of the corresponding Windows-125* encoding:

Windows-1250 vs. ISO-8859-2 (Eastern European):
    The range 0xC0--0xFF is the same in both encodings, but 0xA0--0xBF, which does include letters, is different.

Windows-1251 vs. ISO-8859-5 (Cyrillic):
    Completely incompatible. Most notably, Cyrillic letters from the modern Russian alphabet (32 uppercase and 32 lowercase) are shifted by 0x10.

Windows-1252 vs. ISO-8859-1 (Western European):
    Superset.

Windows-1253 vs. ISO-8859-7 (Greek):
    Almost compatible. Unfortunately, a few bytes in the range 0xA0--0xBF are assigned to different characters, and the accented capital Alpha is positioned differently.

Windows-1254 vs. ISO-8859-9 (Turkish):
    Superset.

Windows-1255 vs. ISO-8859-8 (Hebrew):
    Superset.

Windows-1256 vs. ISO-8859-6 (Arabic):
    Arabic consonants seem to have the same code points, but vowels have incompatible positions. Windows-1256 contains lowercase French accented characters and even the oe ligature, whereas ISO-8859-6 leaves many bytes undefined.

Windows-1257 vs. ISO-8859-13 (Baltic):
    Superset.

Windows-1258 (Vietnamese):
    No corresponding ISO-8859-* encoding.

-- 
Øistein E. Andersen
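
The comparisons above can be checked mechanically. What follows is only a minimal sketch, assuming that the codecs bundled with Python (`cp1250`, `iso8859_2`, and so on) are close enough to the encodings discussed in this thread; Python's mapping tables may differ in detail from the versions compared above, so the output should be read as a cross-check, not as a definitive result. The helper names `decode_byte` and `differing_bytes` are made up for this illustration. Only the range 0xA0--0xFF is compared, since ISO-8859-* assigns no graphic characters to 0x80--0x9F.

```python
# Windows/ISO pairs from the list above, using Python's codec names
# (an assumption: these tables may not match the thread's versions exactly).
PAIRS = [
    ("cp1250", "iso8859_2"),   # Eastern European
    ("cp1251", "iso8859_5"),   # Cyrillic
    ("cp1252", "latin_1"),     # Western European (ISO-8859-1)
    ("cp1253", "iso8859_7"),   # Greek
    ("cp1254", "iso8859_9"),   # Turkish
    ("cp1255", "iso8859_8"),   # Hebrew
    ("cp1256", "iso8859_6"),   # Arabic
    ("cp1257", "iso8859_13"),  # Baltic
]


def decode_byte(value, codec):
    """Return the character a single byte decodes to, or None if unassigned."""
    try:
        return bytes([value]).decode(codec)
    except UnicodeDecodeError:
        return None


def differing_bytes(windows_codec, iso_codec, start=0xA0, stop=0xFF):
    """List the bytes in start..stop (inclusive) that the two codecs
    decode differently; a byte unassigned in only one codec counts too."""
    return [
        b for b in range(start, stop + 1)
        if decode_byte(b, windows_codec) != decode_byte(b, iso_codec)
    ]


if __name__ == "__main__":
    for win, iso in PAIRS:
        diff = differing_bytes(win, iso)
        listing = " ".join(f"{b:02X}" for b in diff)
        print(f"{win} vs {iso}: {len(diff)} differing bytes: {listing}")
```

An empty list for a pair means the two codecs agree on every byte in 0xA0--0xFF; for Windows-1250 vs. ISO-8859-2, every byte reported should lie below 0xC0.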
Received on Tuesday, 5 June 2007 07:11:35 UTC