[whatwg] Spec comments, sections 1-2

On Wed, 29 Jul 2009, Aryeh Gregor wrote:
> On Wed, Jul 29, 2009 at 4:39 AM, Ian Hickson <ian at hixie.ch> wrote:
> > 
> > Which others are needed for compatibility?
> 
> I don't know, but there are certainly some.  Otherwise, why would 
> browsers support so many?

I'm pretty sure that character encoding support in browsers is more of a 
"collect them all" exercise than a response to content that actually 
requires it, to be honest.


> For instance, baidu.com is #9 on Alexa and serves gb2312 as far as I can 
> tell.  So does qq.com, which is #14. And sina.com.cn, #19.  
> vkontakte.ru is #30 and serves Windows-1251. tudou.com (#60) uses gbk.  
> rakuten.co.jp (#68) serves EUC-JP.
> 
> This is just from a quick manual look at a few of the largest 
> non-English sites.  I'd think it would be fairly easy for someone (e.g., 
> Google) to come up with a rough summary of character encoding usage on 
> the web by percentage, and for vendors to say which encodings they 
> support, so a useful common list could be worked out.
> 
> If browsers differ in which encodings they accept, that harms 
> interoperability, so I'd think it would be ideal if HTML 5 would specify 
> the exact list of encodings that must be supported and prohibit 
> support for any others.  The union of encodings supported by existing 
> browsers would be a reasonable start, since supporting a new encoding is 
> presumably pretty cheap.  Unless this is viewed as outside the scope of 
> HTML 5 -- e.g., if browsers tend to rely on the operating system for 
> encoding support.

If someone can provide a firm list of encodings that they are confident 
are required for a certain substantial percentage of the Web, I'm happy to 
add the list to the spec.
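
As a rough illustration of the kind of survey suggested above, here is a 
minimal Python sketch that tallies declared encodings by percentage. It is 
only a sketch under stated assumptions: the Content-Type header strings in 
SAMPLE are made up for illustration (only the site/encoding pairings come 
from the quoted message), the function names are hypothetical, and it looks 
only at the HTTP header, not at <meta> declarations or the actual bytes.

```python
from collections import Counter
from email.message import Message

# Hypothetical sample of (site, Content-Type header) pairs. The encodings
# match those reported in the quoted message; the header strings themselves
# are invented for illustration and were not fetched from the sites.
SAMPLE = [
    ("baidu.com", "text/html; charset=gb2312"),
    ("qq.com", "text/html; charset=GB2312"),
    ("sina.com.cn", "text/html; charset=gb2312"),
    ("vkontakte.ru", "text/html; charset=windows-1251"),
    ("tudou.com", "text/html; charset=gbk"),
    ("rakuten.co.jp", "text/html; charset=EUC-JP"),
]


def declared_charset(content_type):
    """Return the lowercased charset parameter of a Content-Type value."""
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_content_charset()  # None when no charset is declared
    return charset if charset else "(none declared)"


def encoding_shares(pairs):
    """Map each declared encoding to its percentage of the sampled sites."""
    counts = Counter(declared_charset(ct) for _, ct in pairs)
    total = sum(counts.values())
    return {enc: 100.0 * n / total for enc, n in counts.most_common()}


if __name__ == "__main__":
    for enc, pct in encoding_shares(SAMPLE).items():
        print(f"{enc}: {pct:.1f}%")
```

With this six-site sample, gb2312 accounts for half of the entries, which 
says nothing about the Web as a whole. A usable list would need a far 
larger crawl and a check that the declared label matches the real encoding.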

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Tuesday, 4 August 2009 17:01:59 UTC