Re: Experimenting with video processing pipelines on the web from Francois Daoust on 2023-03-29 (public-media-wg@w3.org from March 2023)

From: Francois Daoust <fd@w3.org>
Date: Wed, 29 Mar 2023 09:14:25 +0000
To: "Dale Curtis" <dalecurtis@chromium.org>
Cc: "public-media-wg@w3.org" <public-media-wg@w3.org>, Dominique Hazaël-Massieux <dom@w3.org>, "Bernard Aboba" <Bernard.Aboba@microsoft.com>, "Peter Thatcher" <pthatcher@microsoft.com>
Message-Id: <emf8b52f86-c6b2-4a1c-985b-59dac3103a51@18ae9f3c.com>
Thanks, Dale!

I cannot think of possible changes to WebCodecs itself that are not 
already captured in open issues. The main ones that come to mind are:

1. The ability to "detach" or otherwise transfer the frame buffer, 
tracked in:
  https://github.com/w3c/webcodecs/issues/104
  https://github.com/w3c/webcodecs/issues/287

2. The possibility to create a VideoFrame out of a GPUBuffer directly, 
instead of having to create a canvas which is not really needed in 
theory, and/or perhaps to know when a VideoFrame is GPU backed, tracked 
in:
  https://github.com/w3c/webcodecs/issues/83

3. It could perhaps be useful to expose an API for conversion between 
pixel formats to ease developers' lives, tracked in:
  https://github.com/w3c/webcodecs/issues/92
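
To illustrate item 3, here is a minimal sketch of the kind of conversion 
code developers have to write by hand today. It is not the demo's code. 
The function name `i420ToRgba` is hypothetical, and it assumes tightly 
packed I420 planes (Y, then U, then V), even dimensions and BT.601 
limited-range coefficients; a real frame may use another color space.

```javascript
// Convert one I420 (planar YUV 4:2:0) frame to interleaved RGBA.
// Assumptions: tightly packed planes, even width/height, and BT.601
// limited-range coefficients. Other color spaces need other constants.
function i420ToRgba(i420, width, height) {
  const ySize = width * height;
  const chromaWidth = width >> 1;
  const chromaSize = chromaWidth * (height >> 1);
  const uOffset = ySize;              // U plane follows the Y plane
  const vOffset = ySize + chromaSize; // V plane follows the U plane
  // Uint8ClampedArray rounds and clamps each value to [0, 255] on write.
  const rgba = new Uint8ClampedArray(ySize * 4);

  for (let row = 0; row < height; row++) {
    for (let col = 0; col < width; col++) {
      // Each 2x2 block of luma samples shares one U and one V sample.
      const chromaIndex = (row >> 1) * chromaWidth + (col >> 1);
      const y = 1.164 * (i420[row * width + col] - 16);
      const u = i420[uOffset + chromaIndex] - 128;
      const v = i420[vOffset + chromaIndex] - 128;
      const out = (row * width + col) * 4;
      rgba[out] = y + 1.596 * v;                 // R
      rgba[out + 1] = y - 0.392 * u - 0.813 * v; // G
      rgba[out + 2] = y + 2.017 * u;             // B
      rgba[out + 3] = 255;                       // A (opaque)
    }
  }
  return rgba;
}

// A 2x2 frame: four luma samples, then one U and one V sample.
const rgba = i420ToRgba(new Uint8Array([16, 235, 16, 235, 128, 128]), 2, 2);
// rgba now holds black, white, black, white pixels.
```

In a real pipeline, the input bytes could come from `VideoFrame.copyTo()` 
when `frame.format` is `"I420"`. That copy is part of what a built-in 
conversion API might help avoid.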

Other problems we bumped into sit more at the intersection with other 
specs:

4. If you send a VideoFrame over a WHATWG Stream across workers, you 
need to call `close()` on the sender side as well, but that cannot be 
done immediately in the generic case (since sending is async) and there 
is no direct way to tell when sending is over. Some ability to specify 
that this is meant to be a real transfer would be nice. I believe that 
this would be covered by the "Transferring Ownership Streams" proposal:
  https://github.com/whatwg/streams/blob/main/streams-for-raw-video-explainer.md#transferring-ownership-streams-explained

5. `importExternalTexture()` in WebGPU accepts a `VideoFrame` in Chrome, 
but that is not yet specified, tracked in:
  https://github.com/gpuweb/gpuweb/issues/1380

6. It would be nice if there were a way to process (CPU-backed) 
VideoFrames with WebAssembly without incurring copies. I don't have a 
pointer at hand but I think this is still an open question in the 
WebAssembly group.
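
To make item 4 more concrete, here is a toy model of the ownership 
problem. It deliberately uses no real `VideoFrame`, stream or worker: 
`FakeFrame` and `FakeChannel` are hypothetical stand-ins, and the 
`onSent` callback is an assumption about what a "sending is over" signal 
could look like. Nothing like it exists in the Streams API today.

```javascript
// Toy stand-in for VideoFrame: several handles may share one resource,
// and the resource is only released once every handle is closed.
class FakeFrame {
  constructor(resource = { openHandles: 0 }) {
    this.resource = resource;
    this.closed = false;
    resource.openHandles++;
  }
  clone() {
    if (this.closed) throw new Error("frame already closed");
    return new FakeFrame(this.resource);
  }
  close() {
    if (this.closed) return;
    this.closed = true;
    this.resource.openHandles--;
  }
}

// Toy stand-in for an asynchronous cross-worker send: the frame is only
// cloned for the receiver when deliver() runs, some time after send().
class FakeChannel {
  constructor() {
    this.pending = [];
    this.received = [];
  }
  send(frame, onSent = () => {}) {
    this.pending.push({ frame, onSent });
  }
  deliver() {
    for (const { frame, onSent } of this.pending.splice(0)) {
      this.received.push(frame.clone()); // throws if the sender closed too early
      onSent(frame);                     // hypothetical "sending is over" signal
    }
  }
}

// With such a signal, the sender can release its handle at the right time.
const channel = new FakeChannel();
const frame = new FakeFrame();
channel.send(frame, (sent) => sent.close());
channel.deliver();
// Only the receiver's handle remains open at this point.
```

Calling `frame.close()` right after `send()` would make `deliver()` throw 
in this model. Never closing the frame on the sender side would keep the 
underlying resource alive even after the receiver is done with it.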

The most problematic one for us is item 4, although it won't bite 
applications that stick to a one-worker-only approach for the processing 
logic. The other issues probably fall into the enhancement category.

I note that we stuck to actual processing of video frames, and cannot 
comment on audio or on the actual encoding/decoding features of 
WebCodecs.

Francois.


------ Original message ------
From: "Dale Curtis" <dalecurtis@chromium.org>
To: "Francois Daoust" <fd@w3.org>
Cc: "public-media-wg@w3.org" <public-media-wg@w3.org>; "Dominique 
Hazaël-Massieux" <dom@w3.org>; "Bernard Aboba" 
<Bernard.Aboba@microsoft.com>; "Peter Thatcher" 
<pthatcher@microsoft.com>
Date: 28/03/2023 19:16:17

>These are great articles. Thanks for writing and sharing!
>
>`copyTo` performance should get better over time in the GPU resource -> 
>GPU encoder case. I agree it remains hard to reason about now though. 
>The design goal was always to minimize copies as much as possible when 
>using the built-in drawing primitives like drawImage, texImage, and 
>importExternalTexture -- this works better on some GPUs and OSs than 
>others. E.g., In Chrome, macOS and ChromeOS have comprehensive GPU 
>memory buffer support available to both the renderer sandbox and gpu 
>process, making copies and transfers more efficient than other 
>platforms.
>
>Are there any changes you'd propose to WebCodecs based on your 
>experiences?
>
>- dale
>
>On Tue, Mar 28, 2023 at 10:03 AM Francois Daoust <fd@w3.org> wrote:
>>Hi all,
>>
>>Following my sharing this experimentation with the Media Working Group
>>back in December, Chad Hart and Philipp Hancke invited us to develop our
>>explorations in a webrtcHacks article. We ended up writing an article in
>>two parts, the latter of which got published today:
>>
>>Part 1 - Real-Time Video Processing with WebCodecs and Streams:
>>Processing Pipelines
>>https://webrtchacks.com/real-time-video-processing-with-webcodecs-and-streams-processing-pipelines-part-1/
>>
>>Part 2 - Video Frame Processing on the Web – WebAssembly, WebGPU, WebGL,
>>WebCodecs, WebNN, and WebTransport
>>https://webrtchacks.com/video-frame-processing-on-the-web-webassembly-webgpu-webgl-webcodecs-webnn-and-webtransport/
>>
>>I'm taking the liberty to shamelessly plug these articles here as
>>writing them forced us to dig more into details and pushed us to play
>>with additional technologies along the way, and I thought some of it
>>might be of interest to the group, or at least relevant to discussions
>>on possible evolutions of media pipeline architectures. On top of
>>learning more about WebGPU (and eventually understanding that there is
>>no need to wait before creating a new VideoFrame out of the processed
>>one as I thought initially), the demo now also features basic processing
>>with pure JavaScript code, with WebAssembly, and a couple of identity
>>transforms meant to force CPU-to-GPU and GPU-to-CPU copies, to help
>>measure and reflect on performance and memory copies. The second article
>>summarizes some of our takeaways.
>>
>>Thanks,
>>Francois.
>>
>>
>>
>>------ Original message ------
>>From: "Francois Daoust" <fd@w3.org>
>>To: "Bernard Aboba" <bernard.aboba@microsoft.com>; "Peter Thatcher"
>><pthatcher@microsoft.com>; "public-media-wg@w3.org"
>><public-media-wg@w3.org>
>>Cc: "Dominique Hazaël-Massieux" <dom@w3.org>
>>Date: 20/12/2022 12:10:03
>>
>> >Hi Bernard, Peter,
>> >Hi Media WG,
>> >
>> >A couple of months ago, prompted by discussions in this group on the
>> >evolutions of the media pipeline architecture and sample code and
>> >issues that you shared with the group, Dom and I thought we'd get more
>> >hands-on as well to better understand how different web technologies
>> >can be mixed to create video processing guidelines. Resulting code is
>> >at:
>> >https://github.com/tidoust/media-tests/
>> >
>> >... with demo (currently requires Chrome with WebGPU enabled) at:
>> >https://tidoust.github.io/media-tests/
>> >
>> >Following last Media Working Group call, I have now detailed our
>> >approach and the issues we bumped into in the README. I also created a
>> >couple of issues on the w3c/media-pipeline-arch repository:
>> >https://github.com/w3c/media-pipeline-arch/issues/7
>> >https://github.com/w3c/media-pipeline-arch/issues/8
>> >
>> >Our exploration is more a neophyte attempt at processing media than
>> >an expert take on technologies, and resulting considerations sit at a
>> >higher level than the issues you described during last Media working
>> >group call. We may also have missed obvious things that would make
>> >these considerations moot :)
>> >
>> >It's probably not worth taking working group meeting time for this
>> >but, if you think that's useful and if people are interested, we could
>> >perhaps organize some sort of TPAC-except-it's-not-TPAC informal
>> >breakout session to discuss this early 2023, opening up the session to
>> >people from other groups.
>> >
>> >Thanks,
>> >Francois.
>> >
>>
Received on Wednesday, 29 March 2023 09:14:30 UTC