W3C home > Mailing lists > Public > whatwg@whatwg.org > March 2012

[whatwg] API for encoding/decoding ArrayBuffers into text

From: Glenn Maynard <glenn@zewt.org>
Date: Mon, 19 Mar 2012 19:10:22 -0500
Message-ID: <CABirCh8T5Uz_PqKArE59=ZQg-y_4wYB9tG7FXXMEfGNs-RvRyg@mail.gmail.com>
On Mon, Mar 19, 2012 at 12:46 PM, Joshua Bell <jsbell at chromium.org> wrote:

> I have edited the proposal to base the list of encodings on
>
> http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html - is there any
> reason that would not be sufficient or appropriate? (this appears to be a
> superset of the validator.nu/?charset list, with only a small number of
> additional encodings)
>

There are many encodings in that list which browsers must support for
legacy text/html content, but which are probably unnecessary here.
People may be storing Shift-JIS text in ID3 tags, but I doubt they're
doing that with ISO-2022-JP.
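For concreteness, a decode-only path for a legacy-encoded tag field might
look like the sketch below. It assumes an API shaped like the proposal's
TextDecoder; the byte values are just "テスト" encoded as Shift-JIS, chosen
purely for illustration.

```javascript
// Bytes as they might appear in an ID3v1-era comment field,
// holding the Shift-JIS encoding of "テスト" (katakana "tesuto").
const bytes = new Uint8Array([0x83, 0x65, 0x83, 0x58, 0x83, 0x67]);

// Decode-only support for a legacy charset: no matching encoder needed.
const text = new TextDecoder("shift_jis").decode(bytes);
console.log(text); // "テスト"
```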

I'm undecided about legacy encodings in general, but that aside, I'd start
from just ["UTF-8"], and add to the list based on concrete use cases.
Don't start from the whole list and try to pare it down.
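A UTF-8-only starting point could be as simple as the sketch below. The
wrapper name decodeUTF8Only is hypothetical; the point is just that anything
outside the initial ["UTF-8"] list fails loudly, so real legacy needs
surface as concrete use cases instead of being silently supported.

```javascript
// Hypothetical wrapper: accept only the initial ["UTF-8"] list,
// and reject any other label rather than quietly supporting it.
function decodeUTF8Only(label, bytes) {
  if (label.toLowerCase() !== "utf-8") {
    throw new RangeError("unsupported encoding: " + label);
  }
  return new TextDecoder("utf-8").decode(bytes);
}

// 0xE2 0x9C 0x93 is the UTF-8 encoding of U+2713 CHECK MARK.
console.log(decodeUTF8Only("utf-8", new Uint8Array([0xe2, 0x9c, 0x93]))); // "✓"
```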

I wonder whether we can limit the damage of extending support to legacy
encodings.  We have a use case for decoding legacy charsets (ID3 tags), but
do we have any use cases for encoding to them?  If you're writing back
changed ID3 tags, you should be writing them back as ID3v2 (which is what
most tagging software writes now), which supports UTF-8.
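If the encoder side is UTF-8-only, writing the text for an ID3v2-style
UTF-8 frame needs nothing beyond the sketch below (again assuming an API
shaped like the proposal's TextEncoder; the expected bytes are simply the
UTF-8 encoding of "héllo"):

```javascript
// Encoding to UTF-8 only: "héllo" -> 68 c3 a9 6c 6c 6f.
const utf8Bytes = new TextEncoder().encode("h\u00e9llo");
console.log(Array.from(utf8Bytes).map(b => b.toString(16)).join(" "));
// "68 c3 a9 6c 6c 6f"
```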

On Mon, Mar 19, 2012 at 5:54 PM, Jonas Sicking <jonas at sicking.cc> wrote:

> Yes, I think we should enumerate the set of encodings supported.
> Ideally we'd for simplicity support the same set of enumerated
> encodings everywhere in the platform and over time try to shrink that
> set.
>

Shrinking the set supported for HTML will be much harder than keeping this
set small to begin with.

-- 
Glenn Maynard
Received on Monday, 19 March 2012 17:10:22 UTC
