Re: HTML5 Issue 11 (encoding detection): I18N WG response...

On Aug 20, 2009, at 10:06, Phillips, Addison wrote:

> I think the world has changed significantly. In the past, setting a  
> default of UTF-8 in your browser produced mainly bad results. But,  
> at least according to some measures [1], UTF-8 is rapidly becoming  
> the most reasonable default encoding on the Web.
[...]
> [1] http://googleblog.blogspot.com/2008/05/moving-to-unicode-51.html

This shows an uptake in UTF-8, but it proves nothing without data on
how much of that content is labeled and how much is unlabeled. Uptake in
labeled UTF-8 is awesome, but it doesn't affect what makes sense as the
default processing for unlabeled data.
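
To make the labeled/unlabeled distinction concrete, here is a rough
sketch. It is a simplification for illustration, not the HTML 5 encoding
sniffing algorithm; the helper name pick_encoding and its parameters are
hypothetical, and windows-1252 is only an example fallback.

```python
from typing import Optional


def pick_encoding(http_charset: Optional[str],
                  meta_charset: Optional[str],
                  unlabeled_default: str) -> str:
    """Hypothetical helper: choose the encoding used to decode a page.

    Any explicit label wins. The default is consulted only when the
    page carries no label at all.
    """
    for label in (http_charset, meta_charset):
        if label:
            return label.strip().lower()
    return unlabeled_default


# A labeled UTF-8 page decodes the same way whatever the default is.
labeled_old_default = pick_encoding("UTF-8", None, "windows-1252")
labeled_new_default = pick_encoding("UTF-8", None, "utf-8")

# Only the unlabeled page is affected by the choice of default.
unlabeled_old_default = pick_encoding(None, None, "windows-1252")
unlabeled_new_default = pick_encoding(None, None, "utf-8")
```

In this sketch, growth in labeled UTF-8 never reaches the last line of
the function, which is why it says nothing about what the fallback
should be.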

> At the same time, I think UTF-8 is more than a politically correct  
> fig leaf. The more standards and implementations stress good  
> choices, the more likely people (users, content authors) are to take  
> them seriously. If you happen to have chosen UTF-8 as an encoding,  
> your pages are more likely to just work. Recommending UTF-8 as a  
> default probably will continue to establish itself as the right  
> choice as time progresses. Remember: this is the "all else fails"  
> result and is exposed to user intervention by nearly all user agents.

HTML 5 already recommends (labeled) UTF-8 as the default for authoring  
tools.

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/

Received on Thursday, 20 August 2009 07:15:28 UTC