[streams] New proposal for byte stream use cases (#289) from Domenic Denicola on 2015-02-26 (public-webapps-github@w3.org from February 2015)

From: Domenic Denicola <notifications@github.com>
Date: Thu, 26 Feb 2015 11:58:39 -0800
To: whatwg/streams <streams@noreply.github.com>
Message-ID: <whatwg/streams/issues/289@github.com>
As recently discussed in #253 and #288 there are several open issues with our current design for readable byte streams. The biggest ones are:

- the async ready + sync readInto model does not work very well for read(2)-in-a-threadpool scenarios, like file handles, which require the buffer to be provided when the async operation starts rather than at the later sync read (#253)
- async readInto models have the potential to cause observable data races, as another thread fills up the buffer (https://github.com/whatwg/streams/issues/253#issuecomment-75197506 and elsewhere).

At the same time, I think it is important that, as a guiding principle, we try to make readable byte streams deviate from other readable streams as little as possible in their author-facing interface. The entire point of streams as a primitive is to have a shared abstraction useful across lots of code, one that many people build libraries on top of. Those libraries should ideally work just as well with readable byte streams as they do with readable streams, without any extra work needed. That is why I have continually insisted that ReadableByteStream support the ReadableStream interface. This is also what motivated #288: at the time I was convinced that async readInto was a good solution to our problems for ReadableByteStream, which made me want to align ReadableStream with its own async read.

I think I have a new idea that satisfies our constraints. I go into more detail in https://gist.github.com/domenic/e251e37a300e51c5321f where you can see some evolution going on. The idea is that **we revert from .ready to .wait(), and add an optional buffer parameter to .wait()**. Here is some sample code from the gist:

```js
const ONE_MIB = 1024 * 1024;

async function chunkwise(rbs, processChunk) {
  let current = new ArrayBuffer(ONE_MIB);

  for (let i = 0; i < 10; ++i) {
    // Detaches current, transferring its memory to a new ArrayBuffer, newAB.
    // Begins the fread(newAB, 0, newAB.byteLength /* === ONE_MIB */) call.
    await rbs.wait(current); // fulfills when that fread(newAB, 0, ONE_MIB) call finishes
    current = rbs.read();    // return value is newAB, i.e. not === the buffer passed to wait()

    // Right now nothing is happening: even though the stream is empty,
    // there is no buffer it can use to read.

    await processChunk(current);
  }
}
```

I think this idea is pretty nice, for a few reasons:

- If I omit the optional parameter to wait(), then wait() can auto-allocate a buffer for me. (The underlying source passed to the ReadableByteStream constructor could determine how, perhaps with an allocator function. Or we could let authors do it directly?) Thus, generic stream consumers that don't know they are dealing with a byte stream are in good shape.
- Given that we have to have transfer semantics of some sort to avoid observable data races, I think this is the most intuitive possibility. Unlike alternatives such as `read(sourceAB, offset, bytesDesired) -> Promise<{ result, bytesRead }>`, this clearly separates supplying a buffer to the stream from getting a chunk from the stream. This kind of cognitive distance between the two operations helps make the transfer process feel less unnatural.
- In general, wait() being a function is a bit more future-proof, so we can add semantics like this to other streams or stream-alike APIs. For example, I wonder what will happen when we start designing WritableByteStream.
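
To make the first two points concrete, here is a minimal sketch of how `wait()` with an optional buffer might behave. Everything in it is an assumption for illustration: `MockByteStream`, its constructor arguments, and the choice to have `read()` return a `Uint8Array` view over the filled bytes are hypothetical, and `structuredClone` with a transfer list stands in for whatever detachment mechanism an implementation would actually use.

```javascript
// Hypothetical mock, not a real ReadableByteStream implementation.
class MockByteStream {
  constructor(sourceBytes, defaultChunkSize = 4) {
    this._source = sourceBytes;            // Uint8Array standing in for a file
    this._position = 0;
    this._defaultChunkSize = defaultChunkSize;
    this._filled = null;                   // chunk that read() will hand out
  }

  // wait(buffer?): takes ownership of the caller's buffer, or allocates one
  // when the argument is omitted, then fills it asynchronously.
  async wait(buffer) {
    const target = buffer === undefined
      ? new ArrayBuffer(this._defaultChunkSize)             // auto-allocate
      : structuredClone(buffer, { transfer: [buffer] });    // detaches `buffer`

    // Stand-in for the read(2)/fread call running off the main thread.
    await new Promise((resolve) => setTimeout(resolve, 0));

    const view = new Uint8Array(target);
    const n = Math.min(view.length, this._source.length - this._position);
    view.set(this._source.subarray(this._position, this._position + n));
    this._position += n;

    // Assumption: expose only the bytes that were actually filled.
    this._filled = new Uint8Array(target, 0, n);
  }

  // read(): synchronously returns the chunk prepared by the last wait().
  read() {
    const chunk = this._filled;
    this._filled = null;
    return chunk;
  }
}
```

A generic consumer that only ever calls `wait()` with no argument followed by `read()` works unchanged against this mock, while a byte-aware consumer can pass its own buffer. In the latter case the caller's reference is detached as soon as `wait(buffer)` is called, so it has no way to observe the buffer being filled.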

It has one downside I want to highlight:

- Unlike readInto designs, this does not let you do multiple reads into the same buffer. Until/unless we get [segment-wise detachment for array buffers](https://esdiscuss.org/topic/improving-detachment-for-array-buffers), I don't think this is possible without observable data races. However, there is an API that would allow multiple reads into the same pre-allocated backing memory: the `read(sourceAB, offset, bytesDesired) -> Promise<{ result, bytesRead }>` API. This API is very awkward, though, and I can't see a clear way to redeem it. So I think we'd want a very compelling use case before adopting it.
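
For comparison, the alternative `read(sourceAB, offset, bytesDesired)` shape could be sketched roughly as follows. This is only an illustration under assumptions: `MockTransferReader`, its `maxPerRead` cap, and the `fillBuffer` helper are hypothetical names, and `structuredClone` with a transfer list again stands in for the real transfer step.

```javascript
// Hypothetical mock of the read(sourceAB, offset, bytesDesired) alternative.
class MockTransferReader {
  constructor(sourceBytes, maxPerRead = 2) {
    this._source = sourceBytes;   // Uint8Array standing in for a file
    this._position = 0;
    this._maxPerRead = maxPerRead; // forces short reads, for illustration
  }

  async read(sourceAB, offset, bytesDesired) {
    // Transfer: `sourceAB` is detached, `result` now owns the memory.
    const result = structuredClone(sourceAB, { transfer: [sourceAB] });

    // Stand-in for the asynchronous I/O.
    await new Promise((resolve) => setTimeout(resolve, 0));

    const available = this._source.length - this._position;
    const bytesRead = Math.max(
      0,
      Math.min(bytesDesired, this._maxPerRead, available, result.byteLength - offset)
    );
    new Uint8Array(result, offset, bytesRead).set(
      this._source.subarray(this._position, this._position + bytesRead)
    );
    this._position += bytesRead;
    return { result, bytesRead };
  }
}

// Fills one pre-allocated buffer across several reads by threading the
// returned `result` back in as the next `sourceAB`.
async function fillBuffer(reader, size) {
  let buffer = new ArrayBuffer(size);
  let offset = 0;
  let calls = 0;
  while (offset < size) {
    const { result, bytesRead } = await reader.read(buffer, offset, size - offset);
    buffer = result;              // the old reference is detached; keep the new one
    calls += 1;
    if (bytesRead === 0) break;   // source exhausted
    offset += bytesRead;
  }
  return { buffer, filled: offset, calls };
}
```

The awkwardness shows up in `fillBuffer`: the caller has to rebind `buffer` after every call and track `offset` by hand, because supplying a buffer and receiving data are fused into one operation.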

What do you guys think? I'd like to get this decided soon since I know there are implementations of ReadableStream under way and, per my long paragraph above, our decisions on ReadableByteStream will impact ReadableStream.

@tyoshino @yutakahirano @wanderview @calvaris 

Received on Thursday, 26 February 2015 19:59:08 UTC