- From: Ian Hickson <ian@hixie.ch>
- Date: Sun, 30 Aug 2009 01:47:34 +0000 (UTC)
On Wed, 19 Aug 2009, Anne van Kesteren wrote:
>
> Today every browser implements their own encoding label matching
> algorithm, supports their own list of encodings, their own list of
> encoding label aliases, and everything sort of works, but not really.
>
> HTML5 solves part of this problem by defining exactly how to identify an
> encoding label alias in a text/html stream. It also defines which
> encoding label matching algorithm to use, UTS22, but we found out that
> this is incompatible with (existing) sites that specify EUC_JP at the
> HTTP level and actually want to be decoded per UTF-8 according to a
> <meta> in the text/html stream. This works fine if you have a strict
> encoding label matching algorithm, but with UTS22, EUC_JP and EUC-JP
> become the same thing, while only the latter is the actual encoding
> label.

I've backed off UTS22.

I think we need the IANA list updated, though, to include the aliases
browsers support. I understand you are working on this? I would like to
remove the table in the HTML5 spec that defines such mappings, once that
is done.

> Another problem HTML5 does not solve is giving a definitive list of
> encodings clients have to implement to be compatible with a large body
> of Web content. This means new clients will have to reverse engineer
> that list from existing clients which I think is bad.

If you can get browser vendors to agree on a comprehensive and accurate
list, I'm happy to add it to the spec. But unless a plurality of browser
vendors actually decide to standardise on a single set of encodings, I
don't know that it makes sense to spec something here.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Saturday, 29 August 2009 18:47:34 UTC
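
The EUC_JP versus EUC-JP problem described in the message can be illustrated with a small sketch. It assumes the charset alias matching rule from UTS #22 as I understand it (drop everything except ASCII letters and digits, lowercase, then drop each 0 that is not preceded by a digit) and contrasts it with a plain ASCII case-insensitive comparison. The function names `uts22_key` and `strict_match` are made up for this illustration and are not taken from HTML5 or from any browser.

```python
import string

# ASCII letters and digits are the only characters UTS #22 matching keeps.
_ALNUM = set(string.ascii_letters + string.digits)
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def uts22_key(label: str) -> str:
    """Reduce a label to a loose-matching key (hypothetical helper).

    Assumed rule: keep only ASCII alphanumerics, lowercase them, and
    drop any '0' that does not directly follow a kept digit.
    """
    out = []
    for ch in label:
        if ch not in _ALNUM:
            continue  # punctuation such as '-' and '_' is ignored
        ch = ch.translate(_ASCII_LOWER)
        if ch == "0" and not (out and out[-1] in string.digits):
            continue  # leading zero of a number
        out.append(ch)
    return "".join(out)


def strict_match(a: str, b: str) -> bool:
    """ASCII case-insensitive comparison; punctuation is significant."""
    return a.translate(_ASCII_LOWER) == b.translate(_ASCII_LOWER)


if __name__ == "__main__":
    # Loose matching treats both spellings as the same label.
    print(uts22_key("EUC_JP") == uts22_key("EUC-JP"))
    # Strict matching keeps them apart, so EUC_JP stays unrecognised.
    print(strict_match("EUC_JP", "EUC-JP"))
```

Under the strict comparison an HTTP-level EUC_JP is not a known label, so the UTF-8 declared in the <meta> wins, which is what the affected sites rely on. Under the loose key it collapses into EUC-JP and overrides the <meta>.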