[whatwg] Superset encodings [Re: ISO-8859-* and the C1 control range]

On Sun, 12 Apr 2009, Øistein E. Andersen wrote:
> On 2 Sep 2008, at 06:06, Ian Hickson wrote:
> > On Wed, 30 Jul 2008, Øistein E. Andersen wrote:
> > > 
> > > 1. Opera, Firefox and Safari all handle US-ASCII as Windows-1252.
> > >    IE7, on the other hand, simply ignores the high bit (as it does for
> > >    a few other 7-bit encodings, by the way).  Perhaps this
> > >    alias could be dropped from the other browsers.
> > 
> > Ignoring the high bit seems like a dangerous security bug; dropping any
> > character with a high bit as U+FFFD seems unnecessarily drastic.
>
> According to a test I did using browsershots.org, IE8 actually seems to do
> this (8-bit characters are rendered as squares), which looks like an argument
> in favour of the more `drastic' option.
>
> > I've made the spec go with the O/F/S behaviour here.
>
> This has the advantage of not adding ASCII as a separate encoding, and
> Windows-1252 is presumably one of the encodings most often mislabelled as
> ASCII.  However, IE has ignored the high bit at least since 5.01 (IE4 via
> browsershots.org treats it as CP1252, but this could well be
> locale-dependent), so there may not be that many mislabelled pages.  Has
> anyone got a list of pages which are labelled as ASCII and contain 8-bit
> characters?
>
> This is probably not very important.  U+FFFD is `purer', Windows-1252 has the
> potential of rescuing a few pages.  It is however essential that 8-bit
> characters be considered not conforming since they do not in fact work (as
> Windows-1252 bytes) in IE5-IE8.  This is currently the case, but I think Henri
> Sivonen has argued that `misinterpretation for compatibility' should not be
> considered a conformance error (which would probably be fairly harmless for
> other mappings).
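
For reference, the three candidate treatments of 8-bit bytes in a page
labelled US-ASCII can be sketched roughly as follows.  This is only an
illustration, not what any browser actually runs: the function names are
made up for this example, and Python's cp1252 codec leaves five bytes
(0x81, 0x8D, 0x8F, 0x90, 0x9D) undefined, so they come out as U+FFFD here,
which need not match what browsers do with them.

```python
# Illustrative sketch only; the function names are hypothetical.

def decode_as_windows_1252(data: bytes) -> str:
    """O/F/S behaviour: treat the US-ASCII label as Windows-1252."""
    # Assumption: bytes that Python's cp1252 codec leaves undefined
    # become U+FFFD here; browsers may handle them differently.
    return data.decode("windows-1252", errors="replace")


def decode_ignoring_high_bit(data: bytes) -> str:
    """IE5-IE7 behaviour as described above: mask off the high bit."""
    return bytes(b & 0x7F for b in data).decode("ascii")


def decode_with_replacement(data: bytes) -> str:
    """The `drastic' option: every byte >= 0x80 becomes U+FFFD."""
    return data.decode("ascii", errors="replace")


if __name__ == "__main__":
    for sample in (b"caf\xe9", b"\xbcscript\xbe"):
        print(repr(sample))
        print("  windows-1252:  ", repr(decode_as_windows_1252(sample)))
        print("  high bit off:  ", repr(decode_ignoring_high_bit(sample)))
        print("  U+FFFD:        ", repr(decode_with_replacement(sample)))
```

The second sample shows why ignoring the high bit is worrying: 0xBC and
0xBE mask down to 0x3C and 0x3E, so bytes that a filter expecting
Windows-1252 would see as harmless fractions turn into `<script>'.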

I (and the spec) agree with you here, that these should be reported as 

Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Wednesday, 3 June 2009 14:58:08 UTC