[whatwg] Default encoding to UTF-8?

From: L. David Baron <dbaron@dbaron.org>
Date: Wed, 30 Nov 2011 22:00:10 -0800
Message-ID: <20111201060010.GA306@pickering.dbaron.org>

On Thursday 2011-12-01 14:37 +0900, Mark Callow wrote:
> On 01/12/2011 11:29, L. David Baron wrote:
> > The default varies by localization (and within that potentially by
> > platform), and unfortunately that variation does matter.
> In my experience this is what causes most of the breakage. It leads
> people to create pages that do not specify the charset encoding. The
> page works fine in the creator's locale but shows mojibake (garbage
> characters) for anyone in a different locale.
> 
> If the default was ASCII everywhere then all authors would see mojibake,
> unless it really was an ASCII-only page, which would force them to set
> the charset encoding correctly.

Sure, if the default were consistent everywhere we'd be fine.  If we
have a choice in what that default is, UTF-8 is probably a good
choice unless there's some advantage to another one.  But nobody's
figured out how to get from here to there.

(I think this is legacy from the pre-Unicode days, when the browser
simply displayed Web pages using the system character set, which
led to a legacy of incompatible Web pages in different parts of the
world.)
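
To make the failure mode concrete, here is a minimal sketch (not how
any browser actually implements its fallback) of one unlabeled page
being decoded under different locale defaults. The sample text, the
locale names, and the encoding assigned to each locale are assumptions
chosen for illustration only.

```python
# Hypothetical page content, saved by an author whose system charset is Shift_JIS.
page_text = "文字化け"
raw_bytes = page_text.encode("shift_jis")  # the bytes served, with no charset label

# Illustrative fallback encodings per locale (assumed; real defaults vary by
# localization and platform).
assumed_defaults = {
    "Japanese": "shift_jis",
    "Western European": "windows-1252",
    "Russian": "windows-1251",
}


def decode_with_default(data: bytes, encoding: str) -> str:
    """Decode unlabeled bytes with a locale's fallback encoding."""
    # errors="replace" substitutes U+FFFD for bytes the encoding cannot map.
    return data.decode(encoding, errors="replace")


results = {
    locale: decode_with_default(raw_bytes, encoding)
    for locale, encoding in assumed_defaults.items()
}

for locale, text in results.items():
    # Only the locale whose default matches the author's charset round-trips.
    print(f"{locale}: matches original = {text == page_text}")

# An ASCII-only page decodes identically under all three assumed defaults,
# which is why such pages never show the problem.
ascii_results = {
    locale: decode_with_default(b"plain ASCII page", encoding)
    for locale, encoding in assumed_defaults.items()
}
```

Only the reader whose default happens to match the author's system
charset sees the intended text; everyone else gets mojibake, and the
author never notices because the page looks right locally.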

-David

-- 
L. David Baron                         http://dbaron.org/
Mozilla                           http://www.mozilla.org/
Received on Wednesday, 30 November 2011 22:00:10 UTC
