- From: Øistein E. Andersen <html5@xn--istein-9xa.com>
- Date: Sat, 02 Jun 2007 14:58:23 +0200
On 29 May 2007, at 11:13AM, Henri Sivonen wrote:

> Surely there are other ISO-8859 family encodings besides ISO-8859-1
> that require decoding using the corresponding Windows-* family decoder?

For the following reasons, this is not entirely obvious:

1) Several of the Windows-* encodings are more or less incompatible with (i.e., they are not a superset of) the corresponding ISO-8859-* encoding;
2) Only ISO-8859-1 enjoyed a privileged position as the standard HTML encoding;
3) Windows-1252 was registered with IANA later than the other Windows-* encodings.

On 1 Jun 2007, at 8:57AM, Henri Sivonen wrote:

> 2) 0x85 in ISO-8859-10 and in ISO-8859-16 is decoded as in Windows-1252
> (ellipsis) by Gecko.

I am unable to reproduce this in Firefox (1.5 Mac, 2.0 Unix, 3.0 Mac). However, C1 characters in ISO-8859-10 and ISO-8859-16 are /not/ converted to U+FFFD, and this may give the reported result with an incorrectly encoded font that contains the ellipsis glyph at Unicode code point U+0085. (Cf. http://html5.ouvaton.org/iso-8859-16.png for an example of this with accented small capitals as intruders.) Would this be the explanation?

On 29 May 2007, at 4:10PM, Maciej Stachowiak wrote:

> for all unicode encodings and numeric entity references compatibility requires
> interpreting this range of code points in the WinLatin1 way.

On 1 Jun 2007, at 8:57AM, Henri Sivonen wrote:

> 1) ISO-8859-1 is decoded as Windows-1252.
> 3) ISO-8859-11 is decoded as Windows-874.
> I suggest adding the ISO-8859-11 to Windows-874 mapping to the spec.

1) The C1-range characters defined in Windows-874 seem to be a subset of those defined in Windows-1252;
2) Safari and IE5.5/Mac treat C1 characters from all (supported) ISO-8859-* encodings as Windows-1252;
3) IE7 does the same for a number of selected ISO-8859-* encodings.

As suggested earlier [1], a simpler solution seems to be to treat C1 bytes and NCRs from /all/ ISO-8859-* and Unicode encodings as Windows-1252.
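
To make the suggestion concrete, here is a minimal Python sketch of what "treat C1 bytes and NCRs as Windows-1252" could look like. It is only an illustration, not what any browser actually does: the function names c1_as_windows_1252 and decode_iso_8859 are hypothetical, Python's "cp1252" codec stands in for the Windows-1252 table, and keeping the five bytes that Windows-1252 leaves undefined as plain C1 controls is an assumption of this sketch.

```python
def c1_as_windows_1252(code_point: int) -> str:
    """Return the character for a code point, reading the C1 range
    (0x80..0x9F) the way Windows-1252 does.

    Code points outside the C1 range are returned unchanged.
    Assumption: positions that the cp1252 codec leaves undefined
    (0x81, 0x8D, 0x8F, 0x90, 0x9D) stay as the C1 control of the
    same number.
    """
    if 0x80 <= code_point <= 0x9F:
        try:
            return bytes([code_point]).decode("cp1252")
        except UnicodeDecodeError:
            return chr(code_point)
    return chr(code_point)


def decode_iso_8859(data: bytes, encoding: str) -> str:
    """Decode bytes labelled with an ISO-8859-* encoding, remapping
    C1 bytes through Windows-1252 and leaving every other byte to the
    declared encoding (undefined bytes become U+FFFD)."""
    chars = []
    for byte in data:
        if 0x80 <= byte <= 0x9F:
            chars.append(c1_as_windows_1252(byte))
        else:
            chars.append(bytes([byte]).decode(encoding, errors="replace"))
    return "".join(chars)


if __name__ == "__main__":
    # Byte 0x85 in a page labelled ISO-8859-16 comes out as U+2026.
    print(repr(decode_iso_8859(b"a\x85b", "iso8859-16")))
    # The NCR &#133; names code point 133 (0x85) and gets the same treatment.
    print(repr(c1_as_windows_1252(133)))
```

With this approach no per-encoding table (such as ISO-8859-11 to Windows-874) is needed: the same remapping of 0x80..0x9F is applied whatever ISO-8859-* label the page carries, and the same function can be reused for numeric character references.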
[1] http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2006-November/007804.html

-- 
Øistein E. Andersen
Received on Saturday, 2 June 2007 05:58:23 UTC