[whatwg] API for encoding/decoding ArrayBuffers into text

On Tue, Mar 13, 2012 at 4:11 PM, Glenn Maynard <glenn at zewt.org> wrote:

> On Tue, Mar 13, 2012 at 5:49 PM, Jonas Sicking <jonas at sicking.cc> wrote:
>
> > Something that has come up a couple of times with content authors
> > lately has been the desire to convert an ArrayBuffer (or part thereof)
> > into a decoded string. Similarly, authors want to be able to encode a
> > string into an ArrayBuffer (or part thereof).
> >
>
> There was discussion about this before:
>
>
> https://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html
> http://wiki.whatwg.org/wiki/StringEncoding
>
> (I don't know why it was on the WebGL list; typed arrays are becoming
> infrastructural and this doesn't seem like it belongs there, even though
> ArrayBuffer was started there.)
>
> The API on that wiki page is a reasonable start.  For the same reasons that
> we discussed in a recent thread (
> http://lists.w3.org/Archives/Public/public-webapps/2011JulSep/1589.html),
> conversion errors should use replacement (e.g. U+FFFD), not throw
> exceptions.  The "any" arguments should be fixed.  Encoding to UTF-16
> should definitely not prefix a BOM, and UTF-16 having unspecified
> endianness is obviously bad.
>
> I'd also suggest that, unless there's serious, substantiated demand for
> it--which I doubt--only major Unicode encodings be supported.  Don't make
> it easier for people to keep using legacy encodings.
>
>
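As a rough illustration of the UTF-16 points quoted above (caller-chosen
endianness, no BOM prefix, replacement instead of exceptions), here is a
minimal TypeScript sketch. The function name encodeUtf16 and its
littleEndian parameter are hypothetical and do not come from the wiki
proposal; only DataView.setUint16 is an existing typed array API.

```typescript
// Hypothetical helper, not part of any proposal: encodes a string as UTF-16
// with an explicit byte order and without a leading BOM.
function isHighSurrogate(u: number): boolean {
  return u >= 0xd800 && u <= 0xdbff;
}

function isLowSurrogate(u: number): boolean {
  return u >= 0xdc00 && u <= 0xdfff;
}

function encodeUtf16(s: string, littleEndian: boolean): Uint8Array {
  // One 16-bit code unit per string element; no extra room for a BOM.
  const out = new Uint8Array(s.length * 2);
  const view = new DataView(out.buffer);
  for (let i = 0; i < s.length; i++) {
    let unit = s.charCodeAt(i);
    if (isHighSurrogate(unit)) {
      // A high surrogate must be followed by a low surrogate.
      const paired = i + 1 < s.length && isLowSurrogate(s.charCodeAt(i + 1));
      if (!paired) unit = 0xfffd;
    } else if (isLowSurrogate(unit)) {
      // A low surrogate must be preceded by a high surrogate.
      const paired = i > 0 && isHighSurrogate(s.charCodeAt(i - 1));
      if (!paired) unit = 0xfffd;
    }
    // The byte order is always stated by the caller, never left unspecified.
    view.setUint16(i * 2, unit, littleEndian);
  }
  return out;
}
```

With this sketch, encodeUtf16("A", true) yields the bytes 0x41 0x00 and
encodeUtf16("A", false) yields 0x00 0x41; neither is prefixed by FF FE or
FE FF.
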
Two other pieces of feedback I received from Adam Barth off list:

* take an ArrayBufferView as input, which both fixes "any" and simplifies the
API by eliminating byteOffset and byteLength
* support two versions of encode: one which takes a target ArrayBufferView,
and one which allocates and returns a new Uint8Array of the appropriate length.
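
The two encode variants above could look roughly like the following
TypeScript sketch, limited to UTF-8. The names encodeUtf8Into and
encodeUtf8, the helper functions, the byte-count return value, and the
behavior when the target is too small are all assumptions made for
illustration; none of them are taken from the wiki proposal. Unpaired
surrogates are written as U+FFFD rather than raising an error.

```typescript
// Converts UTF-16 code units to code points. Unpaired surrogates become
// U+FFFD instead of causing an exception.
function toCodePoints(s: string): number[] {
  const out: number[] = [];
  for (let i = 0; i < s.length; i++) {
    const hi = s.charCodeAt(i);
    if (hi >= 0xd800 && hi <= 0xdbff && i + 1 < s.length) {
      const lo = s.charCodeAt(i + 1);
      if (lo >= 0xdc00 && lo <= 0xdfff) {
        out.push(0x10000 + ((hi - 0xd800) << 10) + (lo - 0xdc00));
        i++; // consume the low surrogate as well
        continue;
      }
    }
    out.push(hi >= 0xd800 && hi <= 0xdfff ? 0xfffd : hi);
  }
  return out;
}

// Number of UTF-8 bytes needed for one code point.
function utf8Size(cp: number): number {
  return cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
}

// Variant 1 (hypothetical): write into a caller-supplied view. The view's own
// byteOffset and byteLength bound the write, so no separate offset or length
// arguments are needed. Returns the number of bytes written and stops before
// any code point that would not fit completely.
function encodeUtf8Into(s: string, target: ArrayBufferView): number {
  const bytes = new Uint8Array(target.buffer, target.byteOffset, target.byteLength);
  let n = 0;
  for (const cp of toCodePoints(s)) {
    const size = utf8Size(cp);
    if (n + size > bytes.length) break;
    if (size === 1) {
      bytes[n++] = cp;
    } else if (size === 2) {
      bytes[n++] = 0xc0 | (cp >> 6);
      bytes[n++] = 0x80 | (cp & 0x3f);
    } else if (size === 3) {
      bytes[n++] = 0xe0 | (cp >> 12);
      bytes[n++] = 0x80 | ((cp >> 6) & 0x3f);
      bytes[n++] = 0x80 | (cp & 0x3f);
    } else {
      bytes[n++] = 0xf0 | (cp >> 18);
      bytes[n++] = 0x80 | ((cp >> 12) & 0x3f);
      bytes[n++] = 0x80 | ((cp >> 6) & 0x3f);
      bytes[n++] = 0x80 | (cp & 0x3f);
    }
  }
  return n;
}

// Variant 2 (hypothetical): allocate and return a new Uint8Array of exactly
// the required length.
function encodeUtf8(s: string): Uint8Array {
  let total = 0;
  for (const cp of toCodePoints(s)) total += utf8Size(cp);
  const out = new Uint8Array(total);
  encodeUtf8Into(s, out);
  return out;
}
```

A caller that already owns a buffer would pass a view over the region it
wants filled, for example encodeUtf8Into(text, new Uint8Array(buf, 2, 3)),
while a caller with no buffer would simply use encodeUtf8(text).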



> > Shouldn't this just be another ArrayBufferView type with special
> > semantics, like Uint8ClampedArray? DOMStringArray or some such? And/or a
> > getString()/setString() method pair on DataView?
>
> I don't think so, because retrieving the N'th decoded/reencoded character
> isn't a constant-time operation.
>
> --
> Glenn Maynard
>
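
To make the constant-time point above concrete: in a variable-width
encoding such as UTF-8, finding where the N'th character starts requires
scanning from the beginning. The helper below is a hypothetical TypeScript
sketch that assumes well-formed UTF-8 input; it is not part of any
proposed API.

```typescript
// Hypothetical helper: returns the byte offset at which the n-th code point
// (0-based) starts in well-formed UTF-8, or -1 if the input holds fewer than
// n + 1 code points. The loop is O(length), unlike indexing into a
// Uint8Array, which is O(1).
function byteOffsetOfCodePoint(bytes: Uint8Array, n: number): number {
  let seen = 0;
  for (let i = 0; i < bytes.length; i++) {
    // Continuation bytes have the form 10xxxxxx; any other byte starts a
    // new code point.
    if ((bytes[i] & 0xc0) !== 0x80) {
      if (seen === n) return i;
      seen++;
    }
  }
  return -1;
}
```

An array-like view over decoded characters would have to run a scan like
this on every indexed access, or maintain a side index, which is not what
the other ArrayBufferView types do.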

Received on Tuesday, 13 March 2012 16:23:23 UTC