
Re: What is the use-case for binary data in client-side script?

From: Preston L. Bannister <preston@bannister.us>
Date: Thu, 12 Nov 2009 00:49:58 -0800
Message-ID: <7e91ba7e0911120049m73d3895boc3438768ee04bc0d@mail.gmail.com>
To: Maciej Stachowiak <mjs@apple.com>
Cc: public-script-coord@w3.org
I think the main relevant aspects are introspection (which implies some
variant of serialization/deserialization) and opaque objects. Thinking about
the following seems to lead to a single conclusion (at the end).

On Wed, Nov 11, 2009 at 1:46 AM, Maciej Stachowiak <mjs@apple.com> wrote:

> On Nov 10, 2009, at 10:38 PM, Preston L. Bannister wrote:
> First, I have to admit to not tracking every prior possibly-relevant
> discussion. I do not have that much free time. So I may be missing some
> context. (Did search through past discussions, looking for context.) Yes, I
> am stepping late into this discussion.
> Javascript is a nice higher-order language. Web browsers have rich
> knowledge of objects exchanged across HTTP.  I would hope and expect
> Javascript in the web browser to "know" generally about the objects known to
> a web browser, and be able to manipulate those objects, within the
> capabilities of the web browser.
> On Wed, Nov 4, 2009 at 4:26 PM, Maciej Stachowiak <mjs@apple.com> wrote:
>> Many APIs being developed for the Web platform would benefit from a good
>> way to store binary data. It would be useful for this to be specified as
>> part of the ECMAScript language, but it's also plausible to make this a W3C
>> spec that's only intended for use with Web platform APIs. Here is an
>> overview of some of the APIs that could use such a data type, some notes on
>> requirements and design alternatives, and a strawman proposal.
> What is not clear to me is whether binary data has any place in client-side
> Javascript. Strictly speaking, there is no such thing as binary data. Binary
> data is just a serialized representation of an object. I would greatly
> prefer that object serialization and de-serialization occur in the native
> code of the web browser, and not in Javascript (both for efficiency, and
> brevity in script). If the object in question is one of a kind known to the
> web browser, I would hope to leverage the web browser code.
> That's an appealing approach, when it is practical. But there's two
> important limitations:

Changed the ordering below, hopefully for clarity.

> 3) Sometimes the goal is not to interpret binary data on the client side,
> but simply to transfer it elsewhere, perhaps after some minimal processing.
> For example, you may want to store a binary file in local storage, then chop
> it into pieces and upload to a server one chunk at a time.

An interesting point - but for this usage do we need introspection, or is an
opaque object sufficient? I read this as two use-cases:

   1. File upload/download - where introspection is on server-side, or in
   applications outside the web application.
   2. Storage of web application data in the client, with introspection on
   the server.

Seems as though an object opaque to script may be sufficient. We want to
send or receive chunks of data, but do not need to look at the contents of
the chunks.
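
As a rough sketch of the opaque-object idea, here is how chunking might look with the Blob interface that later browsers and Node.js expose. Blob itself postdates this message, and the helper name sliceIntoChunks is an assumption made up for illustration. Script handles only sizes and offsets, never the bytes.

```javascript
// Hypothetical helper: split an opaque Blob into fixed-size chunks.
// Script never inspects the contents; it only deals in sizes and offsets.
function sliceIntoChunks(blob, chunkSize) {
  if (!(chunkSize > 0)) {
    throw new RangeError("chunkSize must be positive");
  }
  const chunks = [];
  for (let offset = 0; offset < blob.size; offset += chunkSize) {
    // slice() returns another opaque Blob; the end offset is clamped to blob.size.
    chunks.push(blob.slice(offset, offset + chunkSize));
  }
  return chunks;
}

const file = new Blob(["0123456789"], { type: "application/octet-stream" });
const chunks = sliceIntoChunks(file, 4);
console.log(chunks.map((c) => c.size)); // [ 4, 4, 2 ]
```

Each chunk could then be handed to whatever upload or storage call is available, still without introspection in script.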

> 4) Even some forms of text manipulation are at some level binary-like, for
> example, character set transcoding. There's not even a good way to support
> transcoding without some way to handle the input or output as binary.

Also an interesting point, though perhaps this leads to a different
conclusion. Clearly the web browser contains support for transcoding, so
this should cost almost nothing to expose to script. Script lives and
breathes Unicode. If text is transcoded into another character set, is there
any use other than as an opaque object to be sent elsewhere? Again, this
seems best suited for an opaque object (with an encoding property).
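
A minimal sketch of what "an opaque object with an encoding property" might look like. The wrapper name toOpaqueText and the shape of the returned object are assumptions for illustration. Only UTF-8 is shown, because the standard TextEncoder produces UTF-8 only; other target character sets would need browser support that this sketch does not assume.

```javascript
// Hypothetical wrapper: encoded text as an opaque body plus an encoding label.
function toOpaqueText(text) {
  const bytes = new TextEncoder().encode(text); // TextEncoder always emits UTF-8
  return Object.freeze({
    encoding: "utf-8",
    // The body is a Blob: script can send or store it, but does not read it here.
    body: new Blob([bytes], { type: "text/plain;charset=utf-8" }),
  });
}

const message = toOpaqueText("h\u00e9llo");
console.log(message.encoding, message.body.size); // utf-8 6
```

The encoded form is carried around and sent elsewhere; script keeps working with the original Unicode string.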

> 1) There's binary file formats that are not natively understood by the
> browser. I think it would be short-sighted to say that a Web client just
> can't work with those. For example, bz2 compression is pretty neat, and it
> seems like it would be cool to be able to do that on the client.

I am not sure there is desirable usage here. To do BZip2 compression in
Javascript strikes me as an enormously bad idea. CPU use in native code is
intense, and script would be much worse. The same applies to encryption,
with the bonus risk of bad implementation. Is there remaining practical and
desirable usage?

On the flip side, exposing the browser's ability to perform compression (and
*perhaps* encryption) in some form may be useful and efficient.

> 2) Sometimes the browser's built-in processing capabilities don't match
> what you want to do in client-side code. For example, Safari has native PDF
> viewing capability, but what if you want to find the text strings in a PDF
> file and make a searchable index? Then the browser's native processing
> doesn't help. Similarly, we'll happily un-gzip content that's sent using
> HTTP gzip compression, but that does not help you if you have a .gz file.

If the web browser can introspect into an object, then the browser could
choose to expose methods allowing access to the object. Seems that if Safari
receives an "application/pdf" object, then Safari could (perhaps) easily
expose to script PDF-related attributes and operations. Given native browser
support you could write elegant and efficient script to do interesting (and
potentially very cool) things with the contents.

On the flip side, writing script to introspect the raw bytes of a PDF file
would be pretty horrendous (fat and inefficient). Another for the non-goal
column. :)

> In the browser, mime-types generally define object-classes. A "text/html"
> object might be a DOM-tree, or a string. A "text/json" object might be a
> Javascript object, or a string. An "image/jpeg" object is a JPEG image,
> hopefully with methods and attributes reflective of the web browser's
> understanding of a JPEG image.
> I am somewhat ambivalent about supporting arbitrary serialization and
> deserialization in client-side Javascript. I am not fond of the notion of
> byte-oriented serialization/deserialization in Javascript code. Maybe there
> is a compelling use-case for this sort of usage, but I am inclined to be
> dubious.
> The only use-cases that come to mind are:
>    1. Raw disk read from server disk, and shipped without interpretation
>    to the client.
>    2. Web connections to legacy services that chat using only
>    non-HTML binary data.
> There's also handling of binary data in a format not understood by the
> client. Perhaps the client-side JavaScript is patching over missing browser
> capabilities, or the Web app has invented a new binary format itself.

Given the rich potential in the web browser - if exposed - I tend to doubt
there are desirable remaining use-cases. Is there a good use-case for a
new binary format in a web application (between client and server)? Sounds

> Personally, in both cases I would (for a number of reasons) choose to do
> the interpretation on the server, and only ship web-browser-friendly objects
> to the client.
> Even to transfer binary data to the server requires a form to represent it.
> But beyond that, it seems to me that not all Web developers share your
> preference, and there is no obvious harm from meeting their use cases.

True, the base cost in supporting raw access to binary data is minor (unless
we get overly elaborate in attributes, operations, and semantics).  But ...
I rather suspect the obvious use-cases are in fact not useful or
appropriate, in the context of client script.

> I am not sure there should be a use-case for binary data in client-side
> Javascript.
> On the flip side, I seem to have missed the discussion of mime-type object
> support, and the methods and attributes mapped to such objects.
> I'm not sure what you mean by "mime-type object". Would you like to
> clarify?

Seems to me that most (all?) of the use-cases for binary data (those above,
and others that come to mind) are better answered with script access to what
should be a common native object within a web browser. Web browsers know
quite a lot about the
HTTP request and HTTP response. The attributes of an HTTP message (a
natural object for a web browser to support, and to expose for script)
expose a rich set of operations. The HTTP headers specify character set,
compression, and - with "Content-Type" - the interpretation of the contents.

The mime-type of a message controls the sort of introspection a web browser
applies to the contents. Would be (amazingly) useful - and quite natural - to
expose this to script.
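
To make that concrete, a sketch (not a proposal) of script reading the interpretation of a message from its headers. It uses the Headers interface from the later Fetch API, which postdates this message; the function name describeMessage and the shape of its result are assumptions for illustration.

```javascript
// Hypothetical helper: summarize how a browser would interpret an HTTP message,
// using only its headers (character set, compression, Content-Type).
function describeMessage(headers) {
  const contentType = headers.get("Content-Type") || "";
  const [type, ...params] = contentType.split(";").map((s) => s.trim());
  let charset = null;
  for (const param of params) {
    const [name, value] = param.split("=");
    if (name && name.trim().toLowerCase() === "charset" && value) {
      // Strip optional surrounding quotes and normalize case.
      charset = value.trim().replace(/^"|"$/g, "").toLowerCase();
    }
  }
  return {
    mimeType: type.toLowerCase(),
    charset,
    compression: headers.get("Content-Encoding") || "identity",
  };
}

const headers = new Headers({
  "Content-Type": "text/html; charset=UTF-8",
  "Content-Encoding": "gzip",
});
console.log(describeMessage(headers));
```

A browser could go further and hand script a typed object chosen by mimeType; this sketch stops at describing the message.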

Has this been discussed before?
Received on Thursday, 12 November 2009 08:50:39 UTC
