Re: [webrtc-extensions] Add RTCRtpEncodedSource and explainer (#198) from guidou via GitHub on 2024-03-18 (public-webrtc-logs@w3.org from March 2024)

From: guidou via GitHub <sysbot+gh@w3.org>
Date: Mon, 18 Mar 2024 13:48:20 +0000
To: public-webrtc-logs@w3.org
Message-ID: <issue_comment.created-2003963757-1710769697-sysbot+gh@w3.org>
> My main thought is that this API requires to wait for all packets of a frame, which creates some latency. This solution does not allow to do what an SFU is doing with the same level of performance. It might be a good enough compromise. If we already consider supporting packet forwarding, maybe that is what we should do instead.

This is a good compromise for the use case of glitch-free forwarding with multiple input peer connections with failover. In this case, frames provide a convenient abstraction for doing failover quickly (no need for timeouts, just forward a frame from the first peer connection that provides it). It is also ergonomic for some other SFU-like operations where the outcome is frame-based (e.g., dropping frames that don't satisfy a certain property).
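
To illustrate the "forward a frame from the first peer connection that provides it" idea, here is a rough sketch. It is not part of the proposal: `EncodedFrameLike`, `FirstArrivalForwarder`, and the `sink` callback are hypothetical names, the frame shape is a stand-in for whatever the encoded-frame API exposes, and it assumes that all inputs carry the same frames with matching, increasing RTP timestamps (wraparound is ignored).

```typescript
// Hypothetical minimal frame shape, used only for this sketch.
interface EncodedFrameLike {
  rtpTimestamp: number; // assumed to identify a frame across all inputs
  data: Uint8Array;
}

// Forwards each frame once, taking it from whichever input delivers it first.
class FirstArrivalForwarder {
  private lastForwarded: number | null = null;

  constructor(private readonly sink: (frame: EncodedFrameLike) => void) {}

  // Call for every frame read from any input peer connection.
  // Returns true if the frame was forwarded, false if it was dropped
  // because the same (or a newer) frame has already been forwarded.
  onFrame(frame: EncodedFrameLike): boolean {
    if (this.lastForwarded !== null && frame.rtpTimestamp <= this.lastForwarded) {
      return false;
    }
    this.lastForwarded = frame.rtpTimestamp;
    this.sink(frame);
    return true;
  }
}

// Usage: inputs A and B deliver the same stream; A stalls after the first frame.
const forwarded: number[] = [];
const forwarder = new FirstArrivalForwarder((f) => {
  forwarded.push(f.rtpTimestamp);
});
const frame = (ts: number): EncodedFrameLike => ({ rtpTimestamp: ts, data: new Uint8Array(0) });

forwarder.onFrame(frame(1000)); // from A: forwarded
forwarder.onFrame(frame(1000)); // from B: duplicate, dropped
forwarder.onFrame(frame(4000)); // from B while A is stalled: forwarded
forwarder.onFrame(frame(7000)); // from B: forwarded
forwarder.onFrame(frame(7000)); // from A after it recovers: duplicate, dropped
```

No timeout is involved: the output keeps flowing as long as at least one input delivers each frame. The same comparison also drops frames that arrive out of order, which may or may not be what a real application wants.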

There is also an ongoing effort on a packet-level API, although we have not yet heard from developers interested in using that API for glitch-free forwarding with multiple input peer connections. Other use cases are driving the design of that API.

> 
> API wise, transferring seems harder than using what we have done for RTCRtpScriptTransform. We should consider the pros and cons of both approaches. For instance, the current proposal is allowing to transfer to cross agent cluster dedicated workers, which is not allowed with RTCRtpScriptTransform.  I see that more as an issue than a feature. I also think that an API where you create the source where you use it (instead of creating in worker and then transferring) is slightly easier to use.

We actually only need to transfer to workers within the same agent cluster, so maybe a clarification is needed in the text of the proposed spec. However, I do agree with you that it is better to be able to create the source where you use it.

The main reasons the proposal exposes RTCRtpSenderEncodedSource only on DedicatedWorker are:
* [Your proposal](https://github.com/w3c/webrtc-encoded-transform/issues/211#issuecomment-1777458119), on which this text is based, exposes it only on workers.
* In other similar APIs we have always had agreement about exposing on DedicatedWorker, and disagreement about exposing on Window. So, I thought exposing only on DedicatedWorker would lead to faster consensus, while the larger discussion about exposing things on Window could continue separately.

> From an implementation point of view, it would also be simpler (at least in Safari) compared to supporting transferability. From a spec point of view, it would also be simpler since we would not have to deal with when you can transfer and when you cannot.

I agree with you on this point. The idea was to deviate as little as possible from https://github.com/w3c/webrtc-encoded-transform/issues/211#issuecomment-1777458119 in order to make it easier to achieve consensus, but if you agree about exposing RTCRtpSenderEncodedSource on Window, then we can simplify this by removing transferability.

> 
> It is not clear why RTCRtpSenderEncodedSource is not providing a WritableStream like done for RTCRtpScriptTransform.

I agree that a WritableStream would provide more flexibility here. Again, the reason was to minimize deviations from https://github.com/w3c/webrtc-encoded-transform/issues/211#issuecomment-1777458119

If I understand correctly, your concerns can be summarized as follows:
* The extra latency a frame-based API has compared with a potential packet-based API.
* RTCRtpSenderEncodedSource is exposed only on DedicatedWorker.
* RTCRtpSenderEncodedSourceHandle is transferable.
* RTCRtpSenderEncodedSource does not expose a WritableStream to write frames.

We have already presented arguments in favor of the frame-based API, and in previous discussions we concluded that, while latency may be a small disadvantage in some cases, a frame-based API has some clear advantages, in particular for the scenario of forwarding with glitch-free failover over multiple input peer connections.

I think the concerns about the spec text can be addressed as follows:
* Expose RTCRtpSenderEncodedSource on Window.
* Make RTCRtpSenderEncodedSourceHandle not transferable, or maybe just remove it and use the source directly.
* Expose a WritableStream instead of an `enqueue()` method.

WDYT?


-- 
GitHub Notification of comment by guidou
Please view or discuss this issue at https://github.com/w3c/webrtc-extensions/pull/198#issuecomment-2003963757 using your GitHub account


-- 
Sent via github-notify-ml as configured in https://github.com/w3c/github-notify-ml-config
Received on Monday, 18 March 2024 13:48:21 UTC