Re: [webrtc-encoded-transform] Is frame-level fanout in scope of WebRTC encoded transform? (#211) from guidou via GitHub on 2023-11-29 (public-webrtc-logs@w3.org from November 2023)

From: guidou via GitHub <sysbot+gh@w3.org>
Date: Wed, 29 Nov 2023 13:52:45 +0000
To: public-webrtc-logs@w3.org
Message-ID: <issue_comment.created-1831937358-1701265963-sysbot+gh@w3.org>

> > enqueued on the RTCRtpSenderEncodedSource of the output PCs after invoking a `setMetadata()` method to adjust frame IDs to the output peer connection
> 
> Why would we need to call `setMetadata()`? This API basically replaces the encoder which does not have the notion of a frameID. Shouldn't it be the backend that computes it?

A use case we want to support[1] is to get frames from two or more incoming PCs that are sending the same encoded data, and forward the frames to one or more output PCs as if they were coming from a single, more reliable peer connection. This forwarding occurs at multiple points along multiple network paths between the original servers sending the data and the final recipients, and it involves dropping duplicate frames. Failure of individual input peer connections should be tolerated without the next node in the path noticing.
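
To make the duplicate-dropping step concrete, here is a minimal sketch. It assumes that redundant copies of a frame carry the same RTP timestamp on every input PC; that is an assumption of this example, not something the proposal specifies. `ForwardedFrame` and `DuplicateFilter` are hypothetical names, and `ForwardedFrame` is a local stand-in rather than the real `RTCRtpEncodedVideoFrame` interface.

```typescript
// Hypothetical minimal frame shape for illustration only.
interface ForwardedFrame {
  rtpTimestamp: number; // assumed identical across inputs for the same payload
  data: Uint8Array;
}

// Remembers recently forwarded RTP timestamps in a bounded window so that
// the second and later copies of a frame can be dropped.
class DuplicateFilter {
  private readonly capacity: number;
  private readonly seen = new Set<number>();
  private readonly order: number[] = [];

  constructor(capacity: number = 256) {
    this.capacity = capacity;
  }

  // Returns true the first time a timestamp is seen, false for duplicates.
  accept(frame: ForwardedFrame): boolean {
    if (this.seen.has(frame.rtpTimestamp)) {
      return false;
    }
    this.seen.add(frame.rtpTimestamp);
    this.order.push(frame.rtpTimestamp);
    // Evict the oldest entry once the window is full.
    if (this.order.length > this.capacity) {
      const oldest = this.order.shift();
      if (oldest !== undefined) {
        this.seen.delete(oldest);
      }
    }
    return true;
  }
}
```

A forwarder would call `accept()` on every frame read from any input PC and only enqueue the frames for which it returns `true`. If one input PC fails, frames from the remaining inputs keep passing through the same filter, so the output is unaffected.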

One of the requirements for this forwarding mechanism is to preserve the existing metadata of the source frames. In particular:
* CSRCs. It is necessary to preserve the CSRCs of the incoming streams since they are used by the destination node for various application features.
* Dependency descriptors and frame IDs, so that forwarding nodes can make better decisions involving scalable streams (e.g., drop temporal layers to save bandwidth). They may also be needed for correct decoding.
* RTP timestamps, so that the destination node can play back frames at the right time.

The only way to satisfy these requirements is for the `RTCRtpSenderEncodedSource` to support taking `RTCRtpEncodedVideoFrame`s so that this metadata can be forwarded to the next node in the path. WebCodecs encoded chunks do not have a mechanism to set any of this metadata (except for the timestamp).

The reason we need a `setMetadata()` method is that, since the incoming frames come from multiple input PCs, it may be necessary to adjust some of the metadata of the output frame so that it properly reflects the decisions made by the forwarder. For example, frames with the same payload may have different frame IDs if they come from different servers. Thus, it must be possible to rewrite this ID for the output frame so that the next hop sees a consistent frame ID scheme.
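
As an illustration of the kind of rewriting meant here, the sketch below maps per-input frame IDs onto a single output numbering. It assumes that copies of the same frame can be matched across inputs by RTP timestamp, which is an assumption of this example. `FrameMetadata` is a hypothetical local type whose field names loosely mirror the frame metadata dictionary, and `FrameIdRewriter` is a hypothetical helper. The result is the metadata one would pass to the proposed `setMetadata()` before enqueuing the frame.

```typescript
// Hypothetical stand-in for the subset of frame metadata used here.
interface FrameMetadata {
  frameId: number;
  dependencies: number[];
  rtpTimestamp: number;
}

// Assigns one output frame ID per distinct RTP timestamp and translates
// each input's dependency list into the output ID space.
// Note: the maps grow without bound; a real forwarder would prune them.
class FrameIdRewriter {
  private nextOutputId = 0;
  // rtpTimestamp -> output frame ID (shared by all inputs)
  private readonly byTimestamp = new Map<number, number>();
  // input PC label -> (input frame ID -> output frame ID)
  private readonly perInput = new Map<string, Map<number, number>>();

  rewrite(inputLabel: string, meta: FrameMetadata): FrameMetadata {
    let outputId = this.byTimestamp.get(meta.rtpTimestamp);
    if (outputId === undefined) {
      outputId = this.nextOutputId++;
      this.byTimestamp.set(meta.rtpTimestamp, outputId);
    }

    let table = this.perInput.get(inputLabel);
    if (table === undefined) {
      table = new Map<number, number>();
      this.perInput.set(inputLabel, table);
    }
    table.set(meta.frameId, outputId);

    // Simplification: dependencies this input never delivered are dropped.
    const known = table;
    const dependencies = meta.dependencies
      .map((id) => known.get(id))
      .filter((id): id is number => id !== undefined);

    return { ...meta, frameId: outputId, dependencies };
  }
}
```

With this scheme, two servers that label the same frame with IDs 100 and 7 both map to the same output ID, so the next hop sees one consistent numbering regardless of which input actually delivered the frame.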

[1] First proposed in the [July meeting](https://lists.w3.org/Archives/Public/www-archive/2023Jul/att-0004/WEBRTCWG-2023-07-18.pdf#page=29)

-- 
GitHub Notification of comment by guidou
Please view or discuss this issue at https://github.com/w3c/webrtc-encoded-transform/issues/211#issuecomment-1831937358 using your GitHub account


-- 
Sent via github-notify-ml as configured in https://github.com/w3c/github-notify-ml-config

Received on Wednesday, 29 November 2023 13:52:47 UTC