Re: [streams] Support reading bytes into buffers allocated by user code on platforms where only async read is available (#253)

OK. Here is my writeup against async read. Let me know what you think. I guess the points end up not being that strong after all.

## Arguments against async read()

As mentioned above, I will attempt to lay out all the arguments against async read(), in a very one-sided way. Then we can step back and see how good they really are.

### EOS is not very general

Example code:

```js
let chunk;
while ((chunk = await rs.read()) !== ReadableStream.EOS) {
  doStuffWith(chunk);
}
```

If we want streams to be able to act as generic data transports for any value in the language, the EOS solution fails. If `read()` fulfilling with EOS means the end of the stream, then it is impossible for streams to transport the EOS value itself.

An only-slightly-contrived situation where this might be important is if you are transforming a stream of JavaScript source text into a stream of the values encountered in it; if the source text contains stream-manipulating code, then it will probably contain EOS, and this will break the stream.

Similarly, you can't write a generic function that translates an array into a stream, because what happens when someone uses it on `arrayOfConstantsImportantInWebAPIs`, which presumably contains EOS?
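
To make that concrete, here is a sketch of such a generic helper. `arrayToStream` is hypothetical, and the enqueue/close-style underlying source API is assumed just for illustration; the exact constructor shape doesn't matter for the argument:

```js
// Hypothetical generic helper: turn any array into a readable stream.
function arrayToStream(array) {
  return new ReadableStream({
    start(c) {
      for (const value of array) {
        c.enqueue(value); // what happens if value === ReadableStream.EOS?
      }
      c.close();
    }
  });
}

// Under the EOS design, a consumer would see the stream "end" after 2,
// instead of receiving all four values that were enqueued:
const rs = arrayToStream([1, 2, ReadableStream.EOS, 4]);
```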

Also, what happens when the underlying source actually tries to enqueue EOS? Does that serve as a cryptic close signal? Do we error the stream? Do we pass it through the stream, but somehow not actually close the stream?

### EOS does not work for byte streams

For byte streams, calling `read(view)` transfers `view`'s backing buffer to a new array buffer. We need to always give a reference to that buffer back to the consumer, otherwise they can't reuse it. So, that means we can't fulfill with EOS: we have to instead fulfill with a zero-byte view onto the transferred buffer, or something similar.
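
For illustration, a consumer loop under that zero-byte-view convention might look something like this sketch (`consumeAll`, `rbs`, and `consume` are placeholder names):

```js
async function consumeAll(rbs, consume) {
  let buffer = new ArrayBuffer(1024);
  while (true) {
    // read(view) detaches the view's buffer; the returned view wraps the
    // transferred buffer, so the consumer gets it back either way.
    const view = await rbs.read(new Uint8Array(buffer));
    buffer = view.buffer;
    if (view.byteLength === 0) {
      break; // a zero-byte view, rather than EOS, signals end-of-stream
    }
    consume(view);
  }
}
```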

### { value, done } is ugly

The alternative design to EOS is having every call to `read()` return a promise for a `{ value, done }` object. This is the solution used by the ES6 iterator protocol. It is completely general: anything can go in `value`, since only `done` is used to signal the end.

However, the iterator protocol has the benefit of abstracting this syntax away from the author, via the `for-of` loop. We have no such opportunity. Thus, if we choose `{ value, done }` for readable streams, authors are forced to actually look at the `value` and `done` properties:

```js
let pair;
while (!(pair = await rs.read()).done) {
  doStuffWith(pair.value);
}
```
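
Destructuring doesn't help much, either: you can't declare the bindings inside the `while` condition, so the best you can do is a sketch like this:

```js
let value, done;
// The parenthesized assignment evaluates to the { value, done } object,
// which is always truthy, so the loop exits only once done is true.
while (({ value, done } = await rs.read()) && !done) {
  doStuffWith(value);
}
```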

### Multiple calls to read() are confusing

What happens when you do the following?

```js
rs.read().then(...);
rs.read().then(...);
```


Does the second call reject, since there's already a read in progress? If so, do we need to give some signal that this is happening, so that consumers can avoid running into that?

Or does it queue up some sort of "read request", so that it gets the next chunk after the one returned by the first read()? If this is the case, isn't it deceptive that these reads appear to all be running "in parallel"?
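
To illustrate why it's deceptive: under the queued-read interpretation, these two apparently parallel reads are in fact strictly ordered:

```js
// Looks like two reads racing "in parallel"...
const [first, second] = await Promise.all([rs.read(), rs.read()]);
// ...but `second` is necessarily the chunk after `first`: the second
// read cannot settle until the first has taken its chunk.
```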

What about for byte streams?


```js
rbs.read(view1).then(...);
rbs.read(view2).then(...);
```

This seems to imply that we need the "read request" queue, since otherwise there is no way to feed multiple buffers in at once.
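
That is, a read-request queue would let the consumer hand several buffers to the stream up front, to be filled and handed back in order as data arrives; a sketch, assuming those queued-read semantics:

```js
const buffers = [
  new ArrayBuffer(1024),
  new ArrayBuffer(1024),
  new ArrayBuffer(1024)
];

// All three reads are pending at once; each fulfills, in order, with a
// view onto the corresponding (transferred) buffer once it's been filled.
const views = await Promise.all(
  buffers.map(buffer => rbs.read(new Uint8Array(buffer)))
);
```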

### Ping-pong is awkward (weak argument, I probably just coded it weird)

Consider a two-buffer pool scenario, with an async `pull(view)` + sync `read()` design:

```js
async function pingPong(rbs, processChunk) {
  const pool = [new ArrayBuffer(1024), new ArrayBuffer(1024)];

  await rbs.pull(new Uint8Array(pool[0]));
  let i = 0;
  while (rbs.state !== "closed") {
    await Promise.all([
      process(i % 2),
      rbs.pull(new Uint8Array(pool[(i + 1) % 2]))
    ]);
    ++i;
  }

  // Process the already-read chunk, then return its buffer to the pool.
  async function process(i) {
    const view = rbs.read();
    await processChunk(view);
    pool[i] = view.buffer;
  }
}
```

The flow here is fairly clear. We are, in parallel, feeding a buffer to the stream with `rbs.pull`, and reading + processing the already-pulled chunk. Once the chunk is processed, we return its buffer to the pool. Once the fed buffer has been read into and the already-read buffer has been processed, we repeat.

Contrast this with an async read() version:

```js
async function pingPong(rbs, processChunk) {
  const pool = [new ArrayBuffer(1024), new ArrayBuffer(1024)];
  const views = [undefined, undefined];
  let currentView;

  await read(0);
  let i = 0;
  while (currentView.byteLength !== 0) {
    await Promise.all([
      process(i % 2),
      read((i + 1) % 2)
    ]);
    ++i;
  }

  // Process the already-read chunk, then return its buffer to the pool.
  async function process(i) {
    await processChunk(views[i]);
    pool[i] = views[i].buffer;
  }

  async function read(i) {
    currentView = views[i] = await rbs.read(new Uint8Array(pool[i]));
  }
}
```

Here we end up needing noticeably more state, since `processChunk` is no longer right next to the `read()` that produced its chunk.

---
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/streams/issues/253#issuecomment-77462296
