On Thursday 2011-12-01 14:37 +0900, Mark Callow wrote:
> On 01/12/2011 11:29, L. David Baron wrote:
> > The default varies by localization (and within that potentially by
> > platform), and unfortunately that variation does matter.
> In my experience this is what causes most of the breakage. It leads
> people to create pages that do not specify the charset encoding. The
> page works fine in the creator's locale but shows mojibake (garbage
> characters) for anyone in a different locale.
> If the default was ASCII everywhere then all authors would see mojibake,
> unless it really was an ASCII-only page, which would force them to set
> the charset encoding correctly.

Sure, if the default were consistent everywhere we'd be fine.  If we
have a choice in what that default is, UTF-8 is probably a good
choice unless there's some advantage to another one.  But nobody's
figured out how to get from here to there.
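
To make the locale dependence concrete, here is a minimal sketch in Python.
It is not how any browser actually picks its fallback. The `fallbacks`
table, the locale keys, and the `render` helper are hypothetical names
chosen for illustration, and the encodings are only plausible stand-ins
for per-localization defaults.

```python
# A page whose author never declared a charset: only raw bytes are served.
text = "日本語"
raw = text.encode("shift_jis")  # what the author's editor wrote to disk

# Hypothetical fallback encodings, keyed by browser localization.
fallbacks = {
    "ja": "shift_jis",      # author's own locale: page looks fine
    "en": "windows-1252",   # a Western locale: mojibake
    "utf8": "utf-8",        # a uniform UTF-8 default: also broken here
    "ascii": "ascii",       # a uniform ASCII default: broken for everyone
}

def render(data, encoding):
    """Decode bytes the way a browser with this fallback would (sketch)."""
    return data.decode(encoding, errors="replace")

results = {locale: render(raw, enc) for locale, enc in fallbacks.items()}

for locale, shown in results.items():
    status = "ok" if shown == text else "mojibake"
    print(f"{locale}: {status}")
```

Only the "ja" entry round-trips. Under a uniform default the author would
see the breakage too, which is the point of the suggestion quoted above.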

(I think this is legacy from the pre-Unicode days, when the browser
simply displayed Web pages using the system character set, which
led to a legacy of incompatible Web pages in different parts of the


