Re: [streams] First draft at tee algorithms, for critique (#302) from Domenic Denicola on 2015-03-18 (public-webapps-github@w3.org from March 2015)

From: Domenic Denicola <notifications@github.com>
Date: Wed, 18 Mar 2015 07:08:59 -0700
To: whatwg/streams <streams@noreply.github.com>
Message-ID: <whatwg/streams/pull/302/c82989584@github.com>
> Do you mean "ReadableByteStream.prototype.tee() then returns SpeculativeTeeReadableByteStream(this)"? So SpeculativeTeeReadableByteStream() conforms to the same semantics at TeeReadableStream()?

Yes, sorry. My "speculative" adjective is indeed confusing things; it's only meant as "if we actually had ReadableByteStream in the spec/reference implementation, I think this is what it would look like." It's not about the semantics of the tee. So, implicitly, if we actually had a ReadableByteStream to add a `ReadableByteStream.prototype.tee` to, I would probably have removed the "Speculative" prefix at that point.

And yes, the semantics should be the same.

> Defining the contract that all tee() functions must conform to separately might help me. I like to see interface separate from implementation, etc.

Definitely. Roughly, I think it would be:

- `X.prototype.tee()` returns `[branch1, branch2]` where both are `X` instances.
- `X.prototype.tee()` will lock `this`, making it no longer usable.
- Both returned branches will be unlocked, and you can separately call `getReader()` on them, with any arguments it might support (like `{ feedBuffers: true }` for RBS). You can then read from them independently.
- If the tee was done with cloning (perhaps `stream.tee({ clone: true })`?), then reading from the branches should give independent objects (and, in the case of byte streams, objects not backed by the same buffer). Certainly `res.clone()` will need to do structured cloning, since e.g. `cache.put(res.clone())` will start manipulating the `res` stream in another thread.
- Otherwise, reading from each branch should give the same corresponding objects.
- If neither branch is read from, backpressure should be applied to the original stream.
- If both branches are canceled, the original stream should be canceled.
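
To make the contract above concrete, here is a toy, synchronous sketch of it. Everything in it is hypothetical and for illustration only: `ToyStream`, its `read()`/`cancel()` methods, and the `{ clone }` option are made-up names, `Uint8Array.prototype.slice()` stands in for structured cloning, and backpressure is not modeled at all (the toy hands both branches the whole queue up front). It is not the spec algorithm.

```javascript
// Toy model of the tee contract; ToyStream and { clone } are hypothetical.
class ToyStream {
  constructor(chunks, onCancel = () => {}) {
    this._chunks = chunks.slice(); // pending chunks, oldest first
    this._onCancel = onCancel;
    this.locked = false;
    this.canceled = false;
  }

  read() {
    if (this.locked) throw new TypeError("stream is locked");
    return this._chunks.length > 0
      ? { value: this._chunks.shift(), done: false }
      : { value: undefined, done: true };
  }

  cancel() {
    if (this.canceled) return;
    this.canceled = true;
    this._chunks.length = 0;
    this._onCancel();
  }

  tee({ clone = false } = {}) {
    if (this.locked) throw new TypeError("stream is already locked");
    const source = this._chunks.splice(0); // move the queue out of the original
    this.locked = true; // the original is no longer directly usable

    let canceledBranches = 0;
    const onBranchCancel = () => {
      canceledBranches += 1;
      // Cancel the original only once both branches have been canceled.
      if (canceledBranches === 2) this.cancel();
    };

    // With clone: true, branch2 gets copies backed by fresh buffers;
    // otherwise both branches hand out the very same objects.
    const copy = (chunk) => (clone ? chunk.slice() : chunk);
    const branch1 = new ToyStream(source, onBranchCancel);
    const branch2 = new ToyStream(source.map(copy), onBranchCancel);
    return [branch1, branch2];
  }
}

const [shared1, shared2] = new ToyStream([new Uint8Array([1, 2])]).tee();
console.log(shared1.read().value === shared2.read().value); // true

const [copy1, copy2] = new ToyStream([new Uint8Array([1, 2])]).tee({ clone: true });
console.log(copy1.read().value === copy2.read().value); // false
```

In the non-cloning case a mutation made through one branch's chunk is visible through the other's, which is the situation the `clone` option is meant to avoid.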

>  It seems the structure clone must be embedded in the postMessage() transfer to me. Doing it here is duplicative and kind of wasteful.

I think I see. So, I think postMessaging a stream actually *doesn't* use the tee functionality. Instead it just grabs a reader and sends the bytes over the wire to the counterpart stream. So, just as when you `cache.put(res)`, when you `postMessage(res)` or `postMessage(res.body)` the stream is used up. If you want to keep using it, you need to clone/tee it first.

So in particular,

>  Even if you tee() first and then do postMessage(branch2), the transfer will still have to structure clone again since branch2 is still accessible in the original JS context. 

If you do this, `branch2` will be locked for the rest of its life, until the postMessage algorithm has drained all its contents.
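
A minimal sketch of that locking, assuming a runtime with a global `ReadableStream` whose `tee()`, `getReader()`, and `locked` behave as in the API that eventually shipped (recent browsers, Node.js 18+). `beginTransfer` is a hypothetical stand-in for the first step a postMessage-style transfer would take; no real transfer happens here.

```javascript
// Assumes a global ReadableStream (recent browsers, Node.js 18+).
const original = new ReadableStream({
  start(controller) {
    controller.enqueue("chunk");
    controller.close();
  },
});

const [branch1, branch2] = original.tee();

// Hypothetical stand-in for the start of a transfer: acquire a reader,
// which locks the stream that is about to be drained.
function beginTransfer(stream) {
  return stream.getReader();
}

const transferReader = beginTransfer(branch2);

// The sending context can no longer get at branch2's contents.
let secondReaderError = null;
try {
  branch2.getReader();
} catch (err) {
  secondReaderError = err;
}

console.log(original.locked, branch1.locked, branch2.locked);
```

Only `branch1` remains readable from the sending side, so there is nothing left for the transfer to clone defensively.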

> The one case I see where the clone argument makes sense for single JS context like this is if the ReadableStream chunks are mutable.

Right, that's pretty much all objects in JS, including `ArrayBuffer`s :).

>  From the consumers point of view they are exactly the same as the original. The consumer just sees a stream interface with a read() method.

Yeah, I meant in terms of them being different implementations of the same interface (and thus in JS, different classes). This might manifest as creating a new stream class of some sort, TeeBranchReadableByteStream or something, which reaches into the innards of its parent as arranged by the tee algorithm. I am unsure whether this makes much sense, given that we need to clone anyway for pretty much all cases, so it's not like we gain efficiency by having multiple pointers into the same buffer. But maybe it can help me avoid that extra allocation I noted in the example...

---
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/streams/pull/302#issuecomment-82989584
Received on Wednesday, 18 March 2015 14:09:28 UTC