Re: [whatwg/streams] TransformStream probably doesn't need two strategies (#190) from Adam Rice on 2016-10-26 (public-webapps-github@w3.org from October 2016)

From: Adam Rice <notifications@github.com>
Date: Tue, 25 Oct 2016 20:10:57 -0700
To: whatwg/streams <streams@noreply.github.com>
Message-ID: <whatwg/streams/issues/190/256238505@github.com>

Suppose we want the normal Reader and Writer interfaces to work with TransformStream. `pipeTo()` is defined in terms of those interfaces, so I think we do. Also, it would break the principle of least surprise if you couldn't do `getReader()` on the `readable` and `getWriter()` on the `writable`.

In that case, TransformStream needs two queues. The `writable` queue permits calling `write()` while an async `transform()` is in progress, and the `readable` queue handles the mismatch between the _push_ `enqueue()` interface inside `transform()` and the _pull_ `read()` interface exposed by the Reader.
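To make the two queues concrete, here is a minimal sketch. It assumes a runtime that exposes `TransformStream` as a global in the shape of the current Streams Standard (a modern browser or Node.js 18+), which may differ from the draft under discussion here. The function name `writeWhileTransforming` and the 10 ms delay are made up for illustration.

```js
// Sketch: write() is called twice before the async transform() has finished
// with the first chunk, so the second chunk has to sit in the writable queue.
// On the other end, read() pulls results that transform() pushed with enqueue().
async function writeWhileTransforming() {
  const ts = new TransformStream({
    async transform(chunk, controller) {
      // Simulate asynchronous work (the delay is arbitrary).
      await new Promise((resolve) => setTimeout(resolve, 10));
      controller.enqueue(chunk.toUpperCase());
    },
  });

  const writer = ts.writable.getWriter();
  const reader = ts.readable.getReader();

  // Neither write is awaited before the next call is made.
  const pendingWrites = [writer.write('a'), writer.write('b')];
  const closed = writer.close();

  const results = [];
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    results.push(value);
  }

  await Promise.all([...pendingWrites, closed]);
  return results;
}

writeWhileTransforming().then((results) => console.log(results)); // [ 'A', 'B' ]
```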

In general, when a TransformStream is connected to a pipe on both ends, those queues can go away. An important exception is when HWM is infinity on the TransformStream. In this case, the pipe absorbs any amount of backpressure.

In general, if you set HWM to some number of megabytes or chunks that is larger than that of the strategy on either side, then you get that much caching introduced into your pipe.

I think this is useful behaviour and should be maintained even when using `pipeTo()` (while still short-circuiting the queues as much as possible).

So, I've made an argument for having one strategy. Now, can I argue for two?

I don't think so.

As far as I can tell, you can get this useful behaviour by setting the strategy on either the `readable` or `writable` side. It doesn't matter which, just don't set it on both.
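As a sketch of setting the strategy on only one side: the example below assumes the constructor shape from the current Streams Standard, `new TransformStream(transformer, writableStrategy, readableStrategy)`, which may not match the draft API discussed in this thread. The helper names `makeBufferedTransform` and `desiredSizeDemo` and the HWM of 16 are hypothetical.

```js
// Sketch: configure only the writable side and leave the readable side
// on its default strategy.
function makeBufferedTransform(highWaterMark) {
  return new TransformStream(
    {
      // Identity transform, just to have something to configure.
      transform(chunk, controller) {
        controller.enqueue(chunk);
      },
    },
    new CountQueuingStrategy({ highWaterMark }) // writable strategy only
  );
}

function desiredSizeDemo() {
  const ts = makeBufferedTransform(16);
  const writer = ts.writable.getWriter();

  const before = writer.desiredSize; // room for 16 chunks
  writer.write('x');
  writer.write('y');
  writer.write('z');
  const after = writer.desiredSize; // three chunks are now queued

  return { before, after };
}

console.log(desiredSizeDemo()); // { before: 16, after: 13 }
```

Nothing reads from `ts.readable` here, so the three chunks stay in the `writable` queue; that queue is the extra caching being described.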

Now I want to consider what happens when `transform()` is slow. I'm going to call a configuration where the `writable` side has a small HWM and the `readable` side has a large HWM a small/large configuration, and the opposite a large/small configuration.

In a small/large configuration, the Writer has to wait until `transform()` is done before writing more. The Reader also has to wait, just because the data isn't available yet. When `transform()` is the bottleneck, it's no different from small/small.

In a large/small configuration, the Writer doesn't have to wait unless the queue fills. If the data is small enough to fit in the queue, the Writer never has to wait at all. The Reader still has to wait for `transform()`.

Even in a large/large configuration, the Reader can read no faster than `transform()` can run. In other words, large/large is identical to large/small.

So, for sufficiently fast `transform()`, small/large and large/small are equivalent. When `transform()` is the bottleneck, then small/large is equivalent to small/small, but large/small provides extra buffering.
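The Writer side of this comparison can be sketched as follows. This is not a test of the whole argument: it only counts how many `write()` calls are accepted before `desiredSize` drops to zero, and it does not measure the Reader. It assumes the three-argument constructor from the current Streams Standard; the function name `writesBeforeBackpressure`, the HWM values 1 and 16, and the safety cap of 1000 are made up for illustration.

```js
// Sketch: count how many chunks a Writer can hand over before the writable
// side signals backpressure (desiredSize <= 0), for a transform() that
// never finishes a chunk.
function writesBeforeBackpressure(writableHWM, readableHWM) {
  const ts = new TransformStream(
    {
      // Stand-in for a very slow transform(): the promise never settles.
      transform() {
        return new Promise(() => {});
      },
    },
    new CountQueuingStrategy({ highWaterMark: writableHWM }),
    new CountQueuingStrategy({ highWaterMark: readableHWM })
  );

  const writer = ts.writable.getWriter();
  let accepted = 0;
  // A well-behaved producer would await writer.ready once desiredSize <= 0.
  while (writer.desiredSize > 0 && accepted < 1000) {
    writer.write(accepted);
    accepted++;
  }
  return accepted;
}

console.log('small/large:', writesBeforeBackpressure(1, 16)); // 1
console.log('large/small:', writesBeforeBackpressure(16, 1)); // 16
console.log('large/large:', writesBeforeBackpressure(16, 16)); // 16
```

In this sketch the `readable` HWM has no effect on the count, which is consistent with the claim that, when `transform()` is the bottleneck, only the `writable` strategy gives the Writer extra room.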

I therefore conclude that large/small is the most useful configuration, so it is the `writable` side that should have the configurable strategy.

As a way forward, I suggest we write tests for the above cases and verify that they work as I have described, and then remove the ability to configure the `readable` strategy.

--
https://github.com/whatwg/streams/issues/190#issuecomment-256238505

Received on Wednesday, 26 October 2016 03:11:28 UTC