[whatwg] Endianness of typed arrays

On Wed, Mar 28, 2012 at 3:46 AM, Boris Zbarsky <bzbarsky at mit.edu> wrote:
> On 3/28/12 3:14 AM, Mark Callow wrote:
>
>> vertexAttribPointer lets you specify to WebGL the layout and type of
>> the data in the buffer object.
>
> Sure.  That works for the GPU, but it doesn't allow for the sort of
> on-the-fly endianness conversion that would be needed to make webgl still
> work on big-endian platforms if the JS-visible typed arrays were always
> little-endian.
>
>
>> The API follows OpenGL {,ES} for familiarity and reflects its
>> heritage of a C API avoiding use of structures.
>
> Yep.  I know the history.  I think this was a mistake, if we care about the
> web ever being usable on big-endian hardware.  Whether we do is a separate
> question.
>
>> But it works.
>
> Sort of, but maybe not; see below.
>
>
>> OpenGL {,ES} developers typically load data from a serialized form and
>> perform endianness conversion during deserialization. The serialized
>> form is what would be loaded into an ArrayBuffer via XHR. It is then
>> deserialized into 1 or more additional ArrayBuffers.
>
>
> The point is that developers are:
>
> 1)  Loading data in serialized forms that have nothing to do with WebGL
>     via XHR and then reading it using typed array views on the
>     resulting array buffer.
> 2)  Not doing endianness conversions, either for the use case in point
>     1 or indeed for WebGL.
>
> Again, I think we all agree how this would work if everyone using the typed
> array APIs were perfect in every way and had infinite resources. But they're
> not and they don't... The question is where we go from here.
>
> In practice, it sounds like a UA on a big-endian system has a few options:
>
> A)  Native-endianness typed arrays.  Breaks anyone loading data via XHR
> arraybuffer responses (whether for WebGL or not) and not doing manual
> endianness conversions.
>
> B)  Little-endian typed arrays.  Breaks WebGL, unless developers switch to a
> more "struct-based" API.  Makes the non-WebGL cases of XHR arraybuffer
> responses work.
>
> C)  Try to guess based on where the array buffer came from and have
> different behavior for different array buffers.  With enough luck (or good
> enough heuristics), would make at least some WebGL work, while also making
> non-WebGL things loaded over XHR work.
>
> In practice, if forced to implement a UA on a big-endian system today, I
> would likely pick option (C)....  I wouldn't classify that as a victory for
> standardization, but I'm also not sure what we can do at this point to fix
> the brokenness.

The top priority should be to implement DataView universally. DataView
is specifically designed for correct, portable manipulation of binary
data coming from or going to files or the network. Fortunately,
DataView is supported in nearly every actively developed UA; once
https://bugzilla.mozilla.org/show_bug.cgi?id=575688 is fixed, it
should be present in every major UA -- even the forthcoming IE 10! See
http://blogs.msdn.com/b/ie/archive/2011/12/01/working-with-binary-data-using-typed-arrays.aspx
.
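
As a minimal sketch of the kind of DataView usage meant here: the
record layout below (a big-endian uint32 count followed by a
little-endian float32 scale) and the name parseHeader are hypothetical,
chosen only to show that the byte order is stated explicitly on every
read and so does not depend on the host.

```javascript
// Hypothetical wire format, for illustration only:
//   bytes 0-3: uint32 record count, big-endian
//   bytes 4-7: float32 scale factor, little-endian
function parseHeader(buffer) {
  const view = new DataView(buffer);
  return {
    // The second argument selects the byte order: false = big-endian.
    count: view.getUint32(0, false),
    // true = little-endian.
    scale: view.getFloat32(4, true),
  };
}

// Stand-in for an XHR arraybuffer response, built byte by byte.
const headerBytes = new Uint8Array([
  0x00, 0x00, 0x01, 0x02, // 0x00000102 as big-endian uint32
  0x00, 0x00, 0x80, 0x3f, // 1.0 as little-endian float32
]);
const header = parseHeader(headerBytes.buffer);
```

The same reads give the same values on little-endian and big-endian
hosts, which is not true of a Uint32Array or Float32Array view laid
over the same bytes.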

Once DataView is available everywhere, the next priority should be
to write educational materials regarding binary I/O. It should be
possible to educate the web development community about correct
practices with only a few high-profile articles.

Changing the endianness of Uint16Array and the other multi-byte typed
arrays is not a feasible solution. Existing WebGL programs already
work correctly on big-endian architectures specifically because the
typed array views use the host's endianness. If the typed array views
were changed to be explicitly little-endian, it would be a requirement
to introduce new big-endian views, and all applications using typed
arrays would have to be rewritten, not just those which use WebGL.
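
To illustrate the distinction, here is a rough sketch (the helper names
hostIsLittleEndian and decodeFloat32LE are hypothetical, and the input
is assumed to be little-endian float32 data). The multi-byte views
expose the host's byte order, so data arriving in a fixed byte order is
decoded once with DataView into a native-order array, which can then be
handed to an API such as WebGL unchanged.

```javascript
// Probe the host byte order by writing a 16-bit value through one
// view and reading its first byte through another.
function hostIsLittleEndian() {
  const probe = new Uint16Array([0x0102]);
  return new Uint8Array(probe.buffer)[0] === 0x02;
}

// Decode `count` little-endian float32 values starting at byteOffset
// into a Float32Array stored in the host's native byte order.
function decodeFloat32LE(buffer, byteOffset, count) {
  const view = new DataView(buffer, byteOffset, count * 4);
  const out = new Float32Array(count);
  for (let i = 0; i < count; i++) {
    out[i] = view.getFloat32(i * 4, true); // true = little-endian input
  }
  return out;
}

// Stand-in for downloaded data: 1.0 and 2.0 as little-endian float32.
const fileBytes = new Uint8Array([
  0x00, 0x00, 0x80, 0x3f,
  0x00, 0x00, 0x00, 0x40,
]);
const vertices = decodeFloat32LE(fileBytes.buffer, 0, 2);
```

On a little-endian host the loop amounts to a copy; on a big-endian
host it performs the byte swap that a Float32Array view placed directly
over the downloaded bytes would silently skip.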

Finally, to reiterate one point: the typed array design was informed
by prior experience with the design and performance characteristics of
a similar API, specifically Java's New I/O (NIO) Buffer classes. NIO
merged the two distinct use cases of file and network I/O, and
interaction with graphics and audio devices, into one API. The result
was increased polymorphism at call sites, which defeated the Java VM's
optimizing compiler and led to 10x slowdowns in many common
situations. It was so difficult to fix these performance pitfalls that
they remained for many years, and I don't know how robust the
solutions are in current Java VMs. To avoid these issues the typed
array spec consciously treats these use cases separately. It is
possible to make incorrect assumptions leading to non-portable code,
but at some level this is possible with nearly any API that extends
beyond a small, closed world. I believe the focus should be on
educating developers about correct use of the APIs, developing
supporting libraries to ease development, and advancing the ECMAScript
language with constructs like struct types
(http://wiki.ecmascript.org/doku.php?id=harmony:binary_data).

-Ken

Received on Wednesday, 28 March 2012 12:08:50 UTC