[whatwg] Default encoding to UTF-8?

From: Glenn Maynard <glenn@zewt.org>
Date: Fri, 2 Dec 2011 11:29:17 -0500
Message-ID: <CABirCh-ssAXDXa=+J_2YXYzjSWt9sksRaNpb4VDLARSWQHz+YA@mail.gmail.com>
On Fri, Dec 2, 2011 at 10:46 AM, Henri Sivonen <hsivonen at iki.fi> wrote:

> Regarding your "(and 16)" remark, considering my personal happiness at
> work, I'd prioritize the eradication of UTF-16 as an interchange
> encoding much higher than eradicating ASCII-based non-UTF-8 encodings
> that all major browsers support. I think suggesting a solution to the
> encoding problem while implying that UTF-16 is not a problem isn't
> particularly appropriate. :-)

UTF-16 is definitely terrible for interchange (it's terrible for internal
use, too, but we're stuck with that), and I'm all for anything that
prevents its proliferation.

I don't think I'd call it a bigger problem, though, since it's
comparatively (even vanishingly) rare, whereas untagged legacy encodings are
a widespread problem that gets worse every day we can't think of a way to
curtail it.

I don't have any new ideas for doing that, either, though.

> I think in order to comply with the Support Existing Content design
> principle (even if it unfortunately means that support is siloed by
> locale) and in order to make plans that are game theoretically
> reasonable (not taking steps that make users migrate to browsers that
> haven't taken the steps), I think we shouldn't change the fallback
> encodings from what the HTML5 spec says when it comes to loading
> text/html or text/plain content into a browsing context.

And no browser vendor would ever do this, no matter what the spec says,
since nobody's willing to break massive swaths of existing content.

Glenn Maynard
Received on Friday, 2 December 2011 08:29:17 UTC
