Re: Experimenting with video processing pipelines on the web

Thanks for this list! I've tagged a couple of issues for prioritization.

- dale

On Wed, Mar 29, 2023 at 2:14 AM Francois Daoust <fd@w3.org> wrote:

> Thanks, Dale!
>
> I cannot think of possible changes to WebCodecs itself that are not
> already captured in open issues. The main ones that come to mind are:
>
> 1. The ability to "detach" or otherwise transfer the frame buffer,
> tracked in:
>   https://github.com/w3c/webcodecs/issues/104
>   https://github.com/w3c/webcodecs/issues/287
>
> 2. The possibility to create a VideoFrame out of a GPUBuffer directly,
> instead of having to go through a canvas that is not really needed in
> theory, and/or perhaps a way to know when a VideoFrame is GPU-backed,
> tracked in:
>   https://github.com/w3c/webcodecs/issues/83


Can you elaborate on the use case for this? I can see it being useful for
readback or drawing, but I thought WebGPU already has its own mechanisms
for that.
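
For my own understanding, I assume the round-trip you'd like to avoid
looks roughly like this (a sketch only; `device` is an already-acquired
GPUDevice, renderTo() is a hypothetical stand-in for the actual render
pass, and importExternalTexture() with a VideoFrame is the Chrome-only
behavior from your point 5):

  // Render the processed frame into a canvas, because there is no direct
  // GPUBuffer/GPUTexture -> VideoFrame path today.
  const canvas = new OffscreenCanvas(frame.displayWidth,
                                     frame.displayHeight);
  const context = canvas.getContext('webgpu');
  context.configure({
    device,
    format: navigator.gpu.getPreferredCanvasFormat(),
  });

  const external = device.importExternalTexture({ source: frame });
  renderTo(device, context.getCurrentTexture(), external);

  // Wrap the rendered canvas in a new VideoFrame for the next stage.
  const processed = new VideoFrame(canvas, { timestamp: frame.timestamp });
  frame.close();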


>
>
> 3. It could perhaps be useful to expose an API for conversion between
> pixel formats to ease developers' lives, tracked in:
>   https://github.com/w3c/webcodecs/issues/92
>
> Other problems we bumped into sit more at the intersection with other
> specs:
>
> 4. If you send a VideoFrame over a WHATWG Stream across workers, you
> need to call `close()` on the sender side as well, but that cannot be
> done immediately in the generic case (since sending is async) and there
> is no direct way to tell when sending is over. Some ability to specify
> that this is meant to be a real transfer would be nice. I believe that
> this would be covered by the "Transferring Ownership Streams" proposal:
>
>
> https://github.com/whatwg/streams/blob/main/streams-for-raw-video-explainer.md#transferring-ownership-streams-explained


It looks like this is also tracked by
https://github.com/whatwg/streams/issues/1185. I've bumped that issue.
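
For anyone following along, a simplified sketch of the pattern in
question (the worker setup and sendFrame() are illustrative):

  // Main thread: transfer one end of a TransformStream to a worker that
  // runs the processing logic.
  const { readable, writable } = new TransformStream();
  worker.postMessage({ readable }, [readable]);

  const writer = writable.getWriter();
  async function sendFrame(frame) {
    await writer.write(frame);
    // The frame is serialized rather than transferred, so this side still
    // owns a reference that must be closed -- but there is no direct
    // signal that sending to the worker is over, which is the gap a
    // "transferring ownership" stream would close.
    frame.close();
  }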


>
>
> 5. `importExternalTexture()` in WebGPU accepts a `VideoFrame` in Chrome,
> but that is not yet specified, tracked in:
>   https://github.com/gpuweb/gpuweb/issues/1380
>
> 6. It would be nice if there were a way to process (CPU-backed)
> VideoFrames with WebAssembly without incurring copies. I don't have a
> pointer at hand but I think this is still an open question in the
> WebAssembly group.
>

We designed copyTo so that there's only one copy in this case. Until
JS/WASM have a read-only buffer concept, I don't think we can do better
than that (since the decoder may still be referencing the buffer for
future frames).
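
To make that concrete, the single-copy path looks roughly like this (a
sketch; wasmMemory, malloc(), free() and processFrame() are illustrative
names for whatever the WASM module exports or wraps):

  async function processWithWasm(frame) {
    const byteLength = frame.allocationSize();
    const ptr = malloc(byteLength);
    const dest = new Uint8Array(wasmMemory.buffer, ptr, byteLength);

    // copyTo() writes the pixel data straight into WASM linear memory,
    // which is the one unavoidable copy.
    const layout = await frame.copyTo(dest);

    // The returned layout (plane offsets and strides) tells the WASM side
    // how to interpret the buffer.
    processFrame(ptr, byteLength, frame.codedWidth,
                 frame.codedHeight, layout);

    free(ptr);
    frame.close();
  }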


>
> The main problematic one for us is 4, although it won't bite
> applications that stick to a one-worker-only approach for the processing
> logic. The other issues probably fall into the enhancement category.
>
> I note that we stuck to actual processing of video frames, and cannot
> comment on audio or on the actual encoding/decoding features of WebCodecs.
>

Thanks for the excellent feedback!


>
> Francois.
>
>
> ------ Original message ------
> From: "Dale Curtis" <dalecurtis@chromium.org>
> To: "Francois Daoust" <fd@w3.org>
> Cc: "public-media-wg@w3.org" <public-media-wg@w3.org>; "Dominique
> Hazaël-Massieux" <dom@w3.org>; "Bernard Aboba"
> <Bernard.Aboba@microsoft.com>; "Peter Thatcher"
> <pthatcher@microsoft.com>
> Date: 28/03/2023 19:16:17
>
> >These are great articles. Thanks for writing and sharing!
> >
> >`copyTo` performance should get better over time in the GPU resource ->
> >GPU encoder case. I agree it remains hard to reason about now though.
> >The design goal was always to minimize copies as much as possible when
> >using the built-in drawing primitives like drawImage, texImage, and
> >importExternalTexture -- this works better on some GPUs and OSs than
> >others. E.g., in Chrome, macOS and ChromeOS have comprehensive GPU
> >memory buffer support available to both the renderer sandbox and gpu
> >process, making copies and transfers more efficient than other
> >platforms.
> >
> >Are there any changes you'd propose to WebCodecs based on your
> >experiences?
> >
> >- dale
> >
> >On Tue, Mar 28, 2023 at 10:03 AM Francois Daoust <fd@w3.org> wrote:
> >>Hi all,
> >>
> >>Following my sharing of this experimentation with the Media Working
> >>Group back in December, Chad Hart and Philipp Hancke invited us to
> >>develop our explorations in a webrtcHacks article. We ended up writing
> >>an article in two parts, the latter of which got published today:
> >>
> >>Part 1 - Real-Time Video Processing with WebCodecs and Streams:
> >>Processing Pipelines
> >>
> https://webrtchacks.com/real-time-video-processing-with-webcodecs-and-streams-processing-pipelines-part-1/
> >>
> >>Part 2 - Video Frame Processing on the Web – WebAssembly, WebGPU,
> >>WebGL, WebCodecs, WebNN, and WebTransport
> >>
> https://webrtchacks.com/video-frame-processing-on-the-web-webassembly-webgpu-webgl-webcodecs-webnn-and-webtransport/
> >>
> >>I'm taking the liberty of shamelessly plugging these articles here, as
> >>writing them forced us to dig further into details and pushed us to
> >>play with additional technologies along the way, and I thought some of
> >>it might be of interest to the group, or at least relevant to
> >>discussions on possible evolutions of media pipeline architectures. On
> >>top of learning more about WebGPU (and eventually understanding that
> >>there is no need to wait before creating a new VideoFrame out of the
> >>processed one, as I initially thought), the demo now also features
> >>basic processing in pure JavaScript and in WebAssembly, plus a couple
> >>of identity transforms meant to force CPU-to-GPU and GPU-to-CPU copies,
> >>to help measure and reflect on performance and memory copies. The
> >>second article summarizes some of our takeaways.
> >>
> >>Thanks,
> >>Francois.
> >>
> >>
> >>
> >>------ Original message ------
> >>From: "Francois Daoust" <fd@w3.org>
> >>To: "Bernard Aboba" <bernard.aboba@microsoft.com
> >><mailto:bernard.aboba@microsoft..com>>; "Peter Thatcher"
> >><pthatcher@microsoft.com>; "public-media-wg@w3.org"
> >><public-media-wg@w3.org>
> >>Cc: "Dominique Hazaël-Massieux" <dom@w3.org <mailto:dom@w3..org>>
> >>Date: 20/12/2022 12:10:03
> >>
> >> >Hi Bernard, Peter,
> >> >Hi Media WG,
> >> >
> >> >A couple of months ago, prompted by discussions in this group on the
> >>evolutions of the media pipeline architecture and by sample code and
> >>issues that you shared with the group, Dom and I thought we'd get more
> >>hands-on as well, to better understand how different web technologies
> >>can be mixed to create video processing pipelines. The resulting code
> >>is at:
> >> >https://github.com/tidoust/media-tests/
> >> >
> >> >... with demo (currently requires Chrome with WebGPU enabled) at:
> >> >https://tidoust.github.io/media-tests/
> >> >
> >> >Following the last Media Working Group call, I have now detailed our
> >>approach and the issues we bumped into in the README. I also created a
> >>couple of issues on the w3c/media-pipeline-arch repository:
> >> >https://github.com/w3c/media-pipeline-arch/issues/7
> >> >https://github.com/w3c/media-pipeline-arch/issues/8
> >> >
> >> >Our exploration is more a neophyte's attempt at processing media than
> >>an expert take on the technologies, and the resulting considerations
> >>sit at a higher level than the issues you described during the last
> >>Media Working Group call. We may also have missed obvious things that
> >>would make these considerations moot :)
> >> >
> >> >It's probably not worth taking working group meeting time for this
> >>but, if you think that's useful and if people are interested, we could
> >>perhaps organize some sort of TPAC-except-it's-not-TPAC informal
> >>breakout session to discuss this in early 2023, opening up the session to
> >>people from other groups.
> >> >
> >> >Thanks,
> >> >Francois.
> >> >
> >>
>

Received on Tuesday, 4 April 2023 18:02:54 UTC