Re: [whatwg/streams] Should I expect this to support "zero-copy" data loading in some way? (#1109) from Mattias Buelens on 2021-03-08 (public-webapps-github@w3.org from March 2021)

From: Mattias Buelens <notifications@github.com>
Date: Mon, 08 Mar 2021 14:35:52 -0800
To: whatwg/streams <streams@noreply.github.com>
Cc: Subscribed <subscribed@noreply.github.com>
Message-ID: <whatwg/streams/issues/1109/793134184@github.com>

Your question is absolutely valid, and the Streams API can and should play an important role in this.

> Does the Streams API facilitate this -- loading data into views, to save on copy operations? I am not sure, having learned about "BYOB" readers, it would seem these were the solution here, but why can't I do this then:
> 
> ```js
> new Blob([ "foobar" ]).stream().getReader({ mode: "byob" });
> ```

The short answer: we're not there *yet*. Although the specification for readable byte streams has existed for a while, the first implementation has only [started shipping very recently with Chrome 89](https://chromestatus.com/feature/4535319661641728). And right now, they aren't yet integrated into the rest of the Web platform:
* Fetch API's `Response.body` (and `Request.body`) are currently still "regular" readable streams without BYOB support. There's an open issue for making them readable byte streams; I suggest you watch it if you're interested: whatwg/fetch#267.
* File API's `Blob.stream()` is also still a "regular" readable stream. I don't see a tracking issue for that though, so you may want to [open one](https://github.com/w3c/FileAPI/issues/).
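
Until that changes, a script can feature-detect BYOB support per stream, since `getReader({ mode: "byob" })` throws a `TypeError` when the stream is not a readable byte stream. A minimal sketch (`getBestReader` is a hypothetical helper name, not part of any API):

```js
// Hypothetical helper: prefer a BYOB reader, fall back to a default reader.
function getBestReader(stream) {
  try {
    return { reader: stream.getReader({ mode: "byob" }), byob: true };
  } catch (e) {
    // Only swallow the "not a byte stream" TypeError; rethrow anything else.
    if (!(e instanceof TypeError)) throw e;
    // If the stream was already locked, this call throws as well.
    return { reader: stream.getReader(), byob: false };
  }
}
```

With this, the same code path keeps working if `Blob.stream()` or `Response.body` later become byte streams.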

> Forgive my ignorance, and the spec may have penetrated too deep into practical application here -- but shouldn't above be a perfect use-case for zero-copy loading of file data into memory available to _both_ the script and any WASM module it may run (which could use `Memory.prototype.buffer` to make a view on the memory and hand it to a BYOB reader's `read` call)?

I agree that this *should* work. You should be able to "reserve" a portion of your WASM memory to hold the received data, create a `Uint8Array` view on that portion and let the readable byte stream write data directly into that view, without needing to pass through the JavaScript heap.

Unfortunately, that doesn't work. `ReadableStreamBYOBReader.read(view)` **transfers** the view's backing `ArrayBuffer`, so that the stream has exclusive access to the buffer while it's being filled. (Eventually, you get access to the buffer back through the fulfillment value of the `read(view)` promise.) However, the `ArrayBuffer` of a `WebAssembly.Memory` object is **not transferable**, so you can't actually pass a `view` that is backed by such a buffer to `read(view)`. 😞
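
To see the restriction in isolation, here is a rough sketch that attempts a transfer through `structuredClone` rather than through a stream. `wasTransferred` is a hypothetical helper name. I'm assuming the failure mode (an exception versus a silent copy) can differ between engines, so the helper only checks whether the original buffer actually ended up detached:

```js
// Hypothetical helper: returns true only if `buffer` was really transferred.
// Only meaningful for non-empty buffers, since it relies on byteLength.
function wasTransferred(buffer) {
  try {
    structuredClone(buffer, { transfer: [buffer] });
  } catch {
    return false; // the engine refused to transfer this buffer
  }
  // A transferred ArrayBuffer is detached and reports a byteLength of 0.
  return buffer.byteLength === 0;
}

const plainBuffer = new ArrayBuffer(1024);
const wasmMemory = new WebAssembly.Memory({ initial: 1 }); // one 64 KiB page

const plainResult = wasTransferred(plainBuffer);
const wasmResult = wasTransferred(wasmMemory.buffer);
console.log({ plainResult, wasmResult });
```

I would expect `read(view)` to reject for the same reason when `view` is backed by `wasmMemory.buffer`.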

I don't know if there's any intention to make this work, or whether it's even possible to support. I suppose things could get complicated very quickly, for example if the WebAssembly memory needs to grow while a readable byte stream is still `read()`ing into it...
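
The growth problem already shows up without any streams involved: growing a (non-shared) `WebAssembly.Memory` detaches its previous `ArrayBuffer`, so existing views become empty. A small sketch, where the 1024-byte region is an arbitrary choice for illustration:

```js
const memory = new WebAssembly.Memory({ initial: 1 }); // one 64 KiB page

// A view over the first 1024 bytes of the current buffer.
const staleView = new Uint8Array(memory.buffer, 0, 1024);
const lengthBeforeGrow = staleView.byteLength;

// Growing by one page detaches the old buffer; memory.buffer is now a new object.
memory.grow(1);
const lengthAfterGrow = staleView.byteLength;

// Views have to be re-created from memory.buffer after every grow().
const freshView = new Uint8Array(memory.buffer, 0, 1024);
console.log(lengthBeforeGrow, lengthAfterGrow, freshView.byteLength);
```

A stream that is still filling `staleView` at the moment of the `grow()` call would be writing into a buffer that no longer backs the memory.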

Right now, the best you can do is allocate a separate `ArrayBuffer` in JavaScript, `read(view)` into that buffer and then copy the data to the desired location in your `WebAssembly.Memory`. This is still better than using a "regular" readable stream (with `ReadableStreamDefaultReader.read()`), since you only **allocate** once and then re-use the buffer indefinitely for all future calls. With regular readable streams, every `read()` returns a new `Uint8Array` with a newly allocated `ArrayBuffer`. But yes, this is still *one* copy, not *zero* copy...
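
A minimal sketch of that workaround, assuming `stream` is a readable byte stream. `readIntoWasmMemory`, `offset` and `chunkSize` are hypothetical names, and the caller is assumed to have reserved enough room in the memory:

```js
// Reads a byte stream to the end, copying each chunk into `memory` at `offset`.
// Returns the total number of bytes written.
async function readIntoWasmMemory(stream, memory, offset, chunkSize = 64 * 1024) {
  const reader = stream.getReader({ mode: "byob" });
  let scratch = new ArrayBuffer(chunkSize); // the only allocation
  let written = 0;
  try {
    while (true) {
      // read() transfers `scratch`; the filled bytes come back as `value`.
      const { value, done } = await reader.read(new Uint8Array(scratch));
      if (done) break;
      // The one copy: re-create the target view on every iteration, because
      // memory.buffer changes if the memory grows. Throws a RangeError if the
      // data does not fit.
      new Uint8Array(memory.buffer, offset + written, value.byteLength).set(value);
      written += value.byteLength;
      // Recycle the returned buffer; the original `scratch` is now detached.
      scratch = value.buffer;
    }
  } finally {
    reader.releaseLock();
  }
  return written;
}
```

The important detail is `scratch = value.buffer`: the buffer you passed in is detached after each `read(view)`, so re-use means passing the returned buffer back in.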

(By the way: if you happen to be using Rust for your "streaming to WebAssembly" use case, you may be interested in [wasm-streams](https://github.com/MattiasBuelens/wasm-streams). 😉 No support for readable byte streams just yet, but I may have a go at it now that they're available in Chrome. 😄)

-- 
https://github.com/whatwg/streams/issues/1109#issuecomment-793134184

Received on Monday, 8 March 2021 22:36:05 UTC