- From: Glenn Maynard <glenn@zewt.org>
- Date: Mon, 26 Mar 2012 20:24:41 -0500
On Mon, Mar 26, 2012 at 7:33 PM, Jonas Sicking <jonas at sicking.cc> wrote:

> Requiring callers to find the null character first, and then use that
> will require one additional pass over the encoded binary data though.

That's extremely fast (memchr), and it's probably the fastest thing to do anyway, compared to embedding null-termination logic in the inner loop of the decode functions. Unless there's a concrete benchmark showing that it's slower, and slower enough to actually matter, this shouldn't be a consideration. It's a premature optimization.

> Also, if we put the API for finding the null character on the Decoder
> object it doesn't seem like we're creating an API which is easier to
> use, just one that has moved some of the logic from the API to every
> caller.

It doesn't seem materially harder (a little more code, yes, but that's not the same thing), and it's more general-purpose.

The API for finding the character doesn't belong on Decoder. It should probably go on each View type, analogous to String.indexOf. Multi-byte views should search on the view's element size; e.g. Int16Array.indexOf(i) maps to wmemchr.

> Though I guess the best solution would be to add methods to DataView
> which allows consuming an ArrayBuffer up to a null terminated point
> and returns the decoded string. Potentially such a method could take a
> Decoder object as argument.

I guess. It doesn't seem that important, since it's just a few lines of code.

If this is done, I'd suggest that this helper API *not* have any special support for streaming (not to disallow it, but not to have any special handling for it, either). I think streaming has little overlap with null-terminated fields, since null termination is typically used with fixed-size buffers. It would also complicate things; for example, you'd need some way to signal to the caller that a null terminator was encountered.
That is, it'd basically look like:

```javascript
// Sketch of the proposed DataView helper; `this` is the DataView.
function decodeNullTerminated(decoder, options) {
    // Create the correct array type, so indexOf and subarray work in
    // 16-bit units for UTF-16.
    var encoding = decoder.encoding.toLowerCase();
    var arrayType = (encoding == 'utf-16le' || encoding == 'utf-16be') ?
        Int16Array : Int8Array;
    // Note the length argument is in elements, not bytes.
    var array = new arrayType(this.buffer, this.byteOffset,
        this.byteLength / arrayType.BYTES_PER_ELEMENT);
    var terminator = array.indexOf(0);
    if (terminator != -1)
        array = array.subarray(0, terminator);
    return decoder.decode(array, options);
}
```

which doesn't specifically prohibit options including {stream: true}, but doesn't attempt to make it useful.

(Side note: if you have null-terminated strings, you're almost always dealing only with multibyte encodings like UTF-8, or only with wide encodings like UTF-16, so you'd just use the appropriate type. That is, the minor complication of the first few lines above isn't something that users would normally actually need to do.)

On Mon, Mar 26, 2012 at 8:11 PM, Kenneth Russell <kbr at google.com> wrote:

> The rationale for specifying the string encoding and decoding
> functionality outside the typed array specification is to keep the
> typed array spec small and easily implementable. The indexed property
> getters and setters on the typed array views, and methods on DataView,
> are designed to be implementable with a small amount of assembly code
> in JavaScript engines. I'd strongly prefer to continue to design the
> encoding/decoding functionality separately from the typed array views.

It doesn't need to go into the Typed Array spec. It can just be an addition to the interface provided by an external specification, which doesn't need to be implemented to implement typed arrays itself. I don't think it's an important thing to have, but this in particular doesn't seem like a problem.

-- 
Glenn Maynard
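[Editor's note: a runnable, standalone sketch of the helper discussed above, using the TextDecoder API as it later shipped in the WHATWG Encoding Standard. The thread predates that API, so the function name `decodeCString` and the free-function form (taking the bytes as an argument rather than living on DataView) are illustrative, not what was being proposed.]

```javascript
// Standalone variant of the helper above. TextDecoder is the
// Encoding Standard API; the name decodeCString is hypothetical.
function decodeCString(decoder, bytes) {
  // For UTF-16, search in 16-bit units (the wmemchr case);
  // otherwise search byte-wise (the memchr case).
  const enc = decoder.encoding.toLowerCase();
  const wide = enc === 'utf-16le' || enc === 'utf-16be';
  const ArrayType = wide ? Int16Array : Int8Array;
  let array = new ArrayType(bytes.buffer, bytes.byteOffset,
                            bytes.byteLength / ArrayType.BYTES_PER_ELEMENT);
  const terminator = array.indexOf(0);
  if (terminator !== -1)
    array = array.subarray(0, terminator);
  return decoder.decode(array);
}

// A fixed-size, NUL-padded field, as found in many binary formats.
const field = new Uint8Array(8);
field.set([0x61, 0x62, 0x63]); // "abc" followed by five NUL bytes
console.log(decodeCString(new TextDecoder('utf-8'), field)); // "abc"
```

This illustrates the point made above: the terminator scan is a single indexOf over the view (memchr or wmemchr in practice), and the decode itself needs no null-termination logic at all.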
Received on Monday, 26 March 2012 18:24:41 UTC