Re: Binary Data - possible topic for joint session

On Nov 6, 2009, at 8:26 AM, Brendan Eich wrote:

> On Nov 6, 2009, at 1:34 AM, Maciej Stachowiak wrote:
>
>> = Issues for the binary data API:
>>
>>   Name (potential bikeshed):
>>       ByteArray
>>       ByteVector
>>       BinaryData
>>       Data
>
> This isn't just rank bikeshedding:
>
> 1. Data is so common a name that we can't confidently inject it into  
> the global object without fear of breaking something. JSON, in spite  
> of json2.js precedent, was implemented incompatibly and object- 
> detected insufficiently, although this was corrected by the  
> implementors (Facebook folks, much appreciated). Google codesearch  
> results:
>
> http://www.google.com/codesearch?as_q=%22function+Data%28%22&btnG=Search+Code&hl=en&as_lang=javascript&as_case=y
> http://www.google.com/codesearch?as_q=%22var+Data;%22&btnG=Search+Code&hl=en&as_lang=javascript&as_case=y
> http://www.google.com/codesearch?as_q=%22var+Data%20=%22&btnG=Search+Code&hl=en&as_lang=javascript&as_case=y
>
> 2. Data is annoyingly close to Date.
>
> 3. Data is technically plural, and usage sometimes treats it as  
> plural (ok, this is almost bikeshedding, I admit). For a String  
> analogue this is awkward.

You're right that there are some objective factors which may rule out  
certain names, in addition to subjective taste concerns. I tried not  
to think too hard about the name in making the original proposal,  
since I figured there would be a range of opinion. Your stated reasons  
against Data seem decent.
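
To make the collision concern concrete, here is a minimal sketch of
checking candidate names against a scope that already defines some of
them. The helper name takenNames and the pageGlobal stand-in object are
hypothetical, made up for illustration; a real survey would look at
actual page globals, as the codesearch queries above do.

```javascript
// Hypothetical helper: report which candidate names a scope already defines.
function takenNames(scope, candidates) {
  return candidates.filter((name) => name in scope);
}

// Stand-in for a page's global object where a script did `var Data = {...}`.
const pageGlobal = { Data: { rows: [] }, Date: Date };

const taken = takenNames(pageGlobal, [
  "ByteArray",
  "ByteVector",
  "BinaryData",
  "Data",
]);
console.log(taken); // ["Data"]
```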

>
>
>> I like "Data" and similar names. Objective-C has NSData as a  
>> distinct type for chunks of binary data - it's not treated as a  
>> type of array. I think this makes sense. Often the fact that a  
>> chunk of binary data can be treated as an octet sequence is  
>> incidental.
>
> It's not incidental unless you provide wider-than-byte element  
> access and address byte order. Let's not, in the interest of serving  
> API simplicity and common octet-sequence use-cases first and only  
> (if we can hold this line).

Indeed, I'd rather not propose APIs like that in the initial version
(though I think eventually we may want a way to copy sequences of 16-,
32-, or 64-bit values, swapping from network byte order to host byte
order or vice versa, to make it practical to interpret popular binary
formats).
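
As a rough sketch of what such a later facility might do, the following
reads 32-bit values stored in network (big-endian) byte order out of an
octet sequence. It uses Uint8Array and DataView, which are not part of
this proposal and are only an assumption for illustration; the helper
name readNetworkUint32s is likewise hypothetical.

```javascript
// Hypothetical helper: decode big-endian (network order) 32-bit unsigned
// values from an octet sequence, ignoring any trailing partial value.
function readNetworkUint32s(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const out = [];
  for (let off = 0; off + 4 <= bytes.byteLength; off += 4) {
    out.push(view.getUint32(off, false)); // false selects big-endian
  }
  return out;
}

const packet = new Uint8Array([0x00, 0x00, 0x01, 0x00, 0xde, 0xad, 0xbe, 0xef]);
const values = readNetworkUint32s(packet);
console.log(values); // [256, 3735928559]
```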

However, I think a common use case for binary data is to pass it  
around from point A to point B, without unpacking the internals at all,  
just as for strings. For example, you may read a file in binary form,  
pass the binary data off to a Worker, and then have the Worker upload  
it to a server. This is part of why I leaned towards a name that does  
not overly emphasize the byte sequence nature.

>
> Therefore I think a concrete name such as ByteVector or ByteArray is  
> better, all else equal.
>
> Moreover a name such as ByteVector is much easier to inject as a  
> global property. No hits for the obvious function and var forms of  
> it, one hit for ByteArray:
>
> http://www.google.com/codesearch?hl=en&lr=&q=%22function+ByteArray%28%22+lang%3Ajavascript&sbtn=Search

Some other possible names (based in part on some other binary data  
proposals that I've seen):

BinaryData
BinData
ByteString
Binary
Blob

Good topic for in-person discussion maybe?

Regards,
Maciej

Received on Friday, 6 November 2009 17:19:30 UTC