Apologies for the delayed answer; I hope the following is helpful.

IE trims leading and trailing spaces from the encoding name, then does a lowercase match against the aliases listed in ie.encodings.txt (attached).

ie.encodings.txt lists all the encodings using the format:

<ui-label>,<alias>,<codepage>,<msdn-cp-identifier>

Where:
<ui-label> is the encoding name as reported in the Page-Encoding menu of IE8 RTM (en-US version).
<alias> is an encoding name that maps to <ui-label>; the mapping is a lowercase match of the input after trimming leading and trailing spaces.
<codepage> is the codepage number for this encoding.
<msdn-cp-identifier> is the description of the code page from http://msdn.microsoft.com/en-us/library/dd317756(VS.85).aspx

In addition, as it might be helpful for future spec work, I've also attached a flat version of the IANA character set assignments at http://www.iana.org/assignments/character-sets

The iana.charsets.map.txt file enumerates the charsets using the format:

<IANA-name>,<alias>

> -----Original Message-----
> From: public-html-comments-request@w3.org [mailto:public-html-comments-request@w3.org] On Behalf Of Anne van Kesteren
> Sent: Monday, August 17, 2009 12:33 PM
> To: Erik van der Poel
> Cc: Ian Hickson; public-html-comments@w3.org
> Subject: Re: charset name matching rules
>
> On Mon, 17 Aug 2009 17:26:52 +0200, Erik van der Poel <erikv@google.com> wrote:
> > I stopped testing when MSIE's tests said "script did not run" so
> > often. We probably need to test it differently, instead of relying on
> > a script.
>
> Or maybe update your IE to a newer version?
>
> http://krijnhoetmer.nl/irc-logs/whatwg/20090817#l-604 has the results for
> IE8. In summary it seems IE8 only does whitespace trimming at start and
> end and has ISO_8859-9 and ISO-8859_9 as alias but not ISO_8859_9. It also
> treats ISO-8859-9 as Windows-1254 which makes sense I suppose and we
> should probably require that.
>
> I think the main issue with following the IE/Gecko algorithm is that
> although it is much stricter it relies on more undefined aliases as well,
> such as ISO-8859_9. So getting documentation from the IE Team and Gecko
> guys on that would be good. (Have not checked whether Gecko actually
> recognizes that alias, fwiw.)
>
>
> --
> Anne van Kesteren
> http://annevankesteren.nl/
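
For anyone who wants to experiment with the matching rule described at the top of this message (trim, lowercase, then look up the alias), here is a minimal Python sketch. It is not IE's code. The names load_aliases, lookup and Encoding are made up for illustration, the sample rows are placeholders written in the <ui-label>,<alias>,<codepage>,<msdn-cp-identifier> format rather than lines copied from ie.encodings.txt, and using str.strip() for the trimming step is an assumption about exactly which characters get trimmed.

```python
from typing import Dict, Iterable, NamedTuple, Optional


class Encoding(NamedTuple):
    """One target encoding; field names mirror the file format (hypothetical type)."""
    ui_label: str
    codepage: int
    msdn_cp_identifier: str


def load_aliases(lines: Iterable[str]) -> Dict[str, Encoding]:
    """Build an alias -> Encoding table from <ui-label>,<alias>,<codepage>,<msdn-cp-identifier> rows."""
    table: Dict[str, Encoding] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # maxsplit=3 keeps any commas inside the last field intact.
        ui_label, alias, codepage, msdn_id = line.split(",", 3)
        table[alias.strip().lower()] = Encoding(ui_label, int(codepage), msdn_id)
    return table


def lookup(table: Dict[str, Encoding], name: str) -> Optional[Encoding]:
    """Trim leading/trailing whitespace, lowercase, then do an exact alias match."""
    # Assumption: str.strip() approximates the "leading and trailing spaces" rule.
    return table.get(name.strip().lower())


# Placeholder rows (NOT taken from the attachment); the UI label is invented.
SAMPLE_ROWS = [
    "Turkish (Windows),iso-8859-9,1254,windows-1254",
    "Turkish (Windows),iso_8859-9,1254,windows-1254",
    "Turkish (Windows),iso-8859_9,1254,windows-1254",
    "Turkish (Windows),windows-1254,1254,windows-1254",
]

table = load_aliases(SAMPLE_ROWS)
print(lookup(table, "  ISO-8859-9  "))  # surrounding whitespace and case are ignored
print(lookup(table, "ISO_8859_9"))      # not listed as an alias, so no match
```

With these placeholder rows, the lookup reproduces the behaviour reported in the quoted message: ISO_8859-9 and ISO-8859_9 resolve to the same entry as ISO-8859-9, while ISO_8859_9 does not, because nothing is normalised beyond trimming and lowercasing.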