- From: Ian Hickson <ian@hixie.ch>
- Date: Wed, 3 Jun 2009 21:58:08 +0000 (UTC)
On Sun, 12 Apr 2009, Øistein E. Andersen wrote:
> On 2 Sep 2008, at 06:06, Ian Hickson wrote:
>
> > On Wed, 30 Jul 2008, Øistein E. Andersen wrote:
> > >
> > > 1. Opera, Firefox and Safari all handle US-ASCII as Windows-1252.
> > > IE7, on the other hand, simply ignores the high bit (as it does for
> > > a few other 7-bit encodings, by the way). Perhaps this
> > > alias could be dropped from the other browsers.
> >
> > Ignoring the high bit seems like a dangerous security bug; dropping any
> > character with a high bit as U+FFFD seems unnecessarily drastic.
>
> According to a test I did using browsershots.org, IE8 actually seems to do
> this (8-bit characters are rendered as squares), which looks like an argument
> in favour of the more `drastic' option.
>
> > I've made the spec go with the O/F/S behaviour here.
>
> This has the advantage of not adding ASCII as a separate encoding, and
> Windows-1252 is presumably one of the encodings most often mislabelled as
> ASCII. However, IE has ignored the high bit at least since 5.01 (IE4 via
> browsershots.org treats it as CP1252, but this could well be
> locale-dependent), so there may not be that many mislabelled pages. Has
> anyone got a list of pages which are labelled as ASCII and contain 8-bit
> characters?
>
> This is probably not very important. U+FFFD is `purer', Windows-1252 has the
> potential of rescuing a few pages. It is however essential that 8-bit
> characters be considered not conforming since they do not in fact work (as
> Windows-1252 bytes) in IE5-IE8. This is currently the case, but I think Henri
> Sivonen has argued that `misinterpretation for compatibility' should not be
> considered a conformance error (which would probably be fairly harmless for
> other mappings).

I (and the spec) agree with you here, that these should be reported as
errors.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
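
For readers comparing the three behaviours discussed in this thread, here is a minimal Python sketch of what each would do with bytes labelled as US-ASCII. The function names are hypothetical and chosen only for illustration, and Python's `cp1252` codec is used as a stand-in for Windows-1252 (it leaves five bytes, such as 0x81, undefined, so it is only an approximation of what browsers do).

```python
def decode_as_windows_1252(data: bytes) -> str:
    """Treat the bytes as Windows-1252 (the Opera/Firefox/Safari behaviour).

    Assumption: Python's cp1252 codec stands in for Windows-1252; bytes it
    leaves undefined are replaced with U+FFFD here.
    """
    return data.decode("cp1252", errors="replace")


def decode_ignoring_high_bit(data: bytes) -> str:
    """Clear bit 7 of every byte, then decode as ASCII (the IE5-IE7 behaviour
    as described in the thread)."""
    return bytes(b & 0x7F for b in data).decode("ascii")


def decode_replacing_high_bit(data: bytes) -> str:
    """Map every byte above 0x7F to U+FFFD (the `purer' option)."""
    return data.decode("ascii", errors="replace")


if __name__ == "__main__":
    sample = b"caf\xe9"
    print(decode_as_windows_1252(sample))           # e-acute survives
    print(decode_ignoring_high_bit(sample))         # 0xE9 becomes 0x69, "i"
    print(ascii(decode_replacing_high_bit(sample))) # last char is U+FFFD

    # Why ignoring the high bit is risky: 0xBC and 0xBE collapse to the
    # ASCII bytes for "<" and ">", so filtered markup can reappear.
    print(decode_ignoring_high_bit(b"\xbcscript\xbe"))
```

The last call shows the security concern: a filter that sees the bytes 0xBC and 0xBE as harmless characters would let through input that a high-bit-stripping decoder turns into a `<script>` tag.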
Received on Wednesday, 3 June 2009 14:58:08 UTC