- From: Ian Hickson <ian@hixie.ch>
- Date: Wed, 5 Aug 2009 00:01:59 +0000 (UTC)
On Wed, 29 Jul 2009, Aryeh Gregor wrote:
> On Wed, Jul 29, 2009 at 4:39 AM, Ian Hickson<ian at hixie.ch> wrote:
> >
> > Which others are needed for compatibility?
>
> I don't know, but there are certainly some. Otherwise, why would
> browsers support so many?

I'm pretty sure that character encoding support in browsers is more of a
"collect them all" kind of thing than really based on content that
requires it, to be honest.

> For instance, baidu.com is #9 on Alexa and serves gb2312 as far as I can
> tell. So does qq.com, which is #14. And sina.com.cn, #19.
> vkontakte.ru is #30 and serves Windows-1251. tudou.com (#60) uses gbk.
> rakuten.co.jp (#68) serves EUC-JP.
>
> This is just from a quick manual look at a few of the largest
> non-English sites. I'd think it would be fairly easy for someone (e.g.,
> Google) to come up with a rough summary of character encoding usage on
> the web by percentage, and for vendors to say which encodings they
> support, so a useful common list could be worked out.
>
> If browsers differ in which encodings they accept, that harms
> interoperability, so I'd think it would be ideal if HTML 5 would specify
> the exact list of encodings that must be supported and prohibited
> support for any others. The union of encodings supported by existing
> browsers would be a reasonable start, since supporting a new encoding is
> presumably pretty cheap. Unless this is viewed as outside the scope of
> HTML 5 -- e.g., if browsers tend to rely on the operating system for
> encoding support.

If someone can provide a firm list of encodings that they are confident
are required for a certain substantial percentage of the Web, I'm happy to
add the list to the spec.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Tuesday, 4 August 2009 17:01:59 UTC