[webrtc-pc] API for codec performance (#2241)

chcunningham has just created a new issue for https://github.com/w3c/webrtc-pc:

==  API for codec performance ==
Hi Group,

I work on the MediaCapabilities (MC) [spec](https://w3c.github.io/media-capabilities/) ([explainer](https://github.com/w3c/media-capabilities/blob/master/explainer.md)). I've had a few requests from WebRTC apps to expand this API to describe WebRTC encode/decode performance. The use cases make sense, but I have some ergonomics concerns and I'd like to collaborate with the RTC experts here to explore the options. 

**MediaCapabilities today**
- The primary interface is [decodingInfo()](https://w3c.github.io/media-capabilities/). This was designed to replace \<video\>.canPlayType(...). It describes "file" (foo.mp4) and "media-source"  (YouTube, Netflix, ...) [decoding types](https://w3c.github.io/media-capabilities/#mediadecodingtype). It does not include WebRTC. This is implemented in Chrome, Firefox, and Safari. 
- The spec also defines [encodingInfo()](https://w3c.github.io/media-capabilities/#dom-mediacapabilities-encodinginfo), which covers "recording" (MediaRecorder) and "transmission" (WebRTC) [encoding types](https://w3c.github.io/media-capabilities/#mediaencodingtype). This part of the spec is less mature and not shipped by any browser. The "recording" part is conceptually a simple reversal of file-decoding. The "transmission" (WebRTC) type seemed like a natural next step (MediaRecorder is, after all, ~part of the WebRTC family). This interface is very roughly spec'ed; it needs some love (or maybe removal). 
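
For context, here is a minimal sketch of how today's decodingInfo() query looks. The configuration shape follows the MC spec; all of the concrete values below are illustrative:

```javascript
// Sketch: a MediaCapabilities.decodingInfo() query for a "media-source"
// stream, as shipped today. All values below are illustrative.
const decodeConfig = {
  type: 'media-source',
  video: {
    contentType: 'video/mp4; codecs="avc1.4d001e"',
    width: 1280,
    height: 720,
    bitrate: 2500000, // bits per second
    framerate: 30,
  },
};

// Only meaningful in a browser that implements the API:
if (typeof navigator !== 'undefined' && navigator.mediaCapabilities) {
  navigator.mediaCapabilities.decodingInfo(decodeConfig).then((info) => {
    // Three booleans: supported, smooth, powerEfficient.
    console.log(info.supported, info.smooth, info.powerEfficient);
  });
}
```

The answer comes back as three booleans (supported / smooth / powerEfficient), which is the same vocabulary the WebRTC use cases below would want.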

**WebRTC use cases are very similar to those from the media playback world.** Apps would like to know beforehand what limits to set for resolution/framerate/bitrate such that the machine can maintain a buttery smooth (timely) encoding/decoding experience for the user (ignoring the small matter of network issues). WebRTC's reference implementation helps out by automatically adapting encode resolution when the CPU is overused, but this requires the user to first have a bad experience before the adaptation kicks in (meanwhile, the camera may be left open at HD resolution, potentially wasting resources on an already starving machine). 

Apps may also ask which codecs can be hardware-accelerated, to minimize battery drain.  
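
As a sketch of the first use case: if some WebRTC-aware "can you encode this smoothly?" query existed, an app could probe candidate capture resolutions before opening the camera. The predicate below is a stand-in for that hypothetical query, not a real API:

```javascript
// Hypothetical sketch: choose the largest capture resolution the machine
// can encode smoothly, BEFORE opening the camera at HD. The predicate
// stands in for a WebRTC-aware encodingInfo()-style query, which does
// not exist today.
const candidates = [
  { width: 1920, height: 1080 },
  { width: 1280, height: 720 },
  { width: 640, height: 360 },
];

function pickResolution(canEncodeSmoothly) {
  // Fall back to the smallest candidate if nothing qualifies.
  return candidates.find(canEncodeSmoothly) || candidates[candidates.length - 1];
}

// Example with a stubbed predicate: this machine handles up to 720p.
const chosen = pickResolution(({ width }) => width <= 1280);
console.log(chosen); // { width: 1280, height: 720 }
```

The point is the ordering: the query happens up front, so getUserMedia() constraints can be set to something the machine can actually sustain, rather than adapting down after the user has already had a bad experience.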

**Ergonomics: What shape should the API take? Where should it live?**
We took a closer look at implementing MC.encodingInfo() for "transmission" (WebRTC) encoding in Chrome, and a few things gave me pause. 

Most notably, WebRTC has existing capability APIs that seem like a natural place to describe codec performance. 
- The RTCRtpReceiver and RTCRtpSender interfaces both define a getCapabilities() method that returns a sequence of [RTCRtpCodecCapabilities](https://www.w3.org/TR/webrtc/#dom-rtcrtpcapabilities). Right away we see that RTC users can ask about capabilities on both ends of the wire, not just the local machine. This is something MediaCapabilities can't do. 
- RTCRtpCodecCapability does not include performance info (e.g. max framerate, max resolution). But [the ORTC spec](http://draft.ortc.org/#codec-capability-parameters*) proposes new "codec capability parameters" for this object, including codec-specific fields like "max-fs" (frame size) and "max-fr" (frame rate). Should this serve as a model for defining something similar in WebRTC?
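
To make the strawman concrete, here is roughly what one entry of getCapabilities('video').codecs looks like today, with ORTC-style performance fields bolted on. The "max-fs"/"max-fr" fields are NOT part of webrtc-pc; they are sketched here purely as an assumption about what the ORTC-modeled extension might look like:

```javascript
// Roughly the shape of one RTCRtpCodecCapability returned by
// RTCRtpSender.getCapabilities('video') today, with hypothetical
// ORTC-style performance parameters added as a strawman.
const codecCapability = {
  mimeType: 'video/H264',
  clockRate: 90000,
  sdpFmtpLine: 'packetization-mode=1;profile-level-id=42e01f',
  // Hypothetical additions, modeled on ORTC codec capability parameters:
  'max-fs': 8160, // max frame size in macroblocks (8160 ≈ 1080p)
  'max-fr': 30,   // max frame rate in fps
};

// In a browser, the real (current) shape can be inspected with:
if (typeof RTCRtpSender !== 'undefined') {
  console.log(RTCRtpSender.getCapabilities('video').codecs.map((c) => c.mimeType));
}
```

Note that this keeps the answer in WebRTC's own vocabulary (RTP mimeTypes, fmtp-style parameter names), which sidesteps the codec-string mismatch described below.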

Another issue we face is that WebRTC doesn't use the same mime-type codec strings as the \<video\> playback world (e.g. 'video/mp4; codecs="avc1.4d001e"' for \<video\> vs 'video/h264' in RTC). It's a little ugly to consider mixing these into MediaCapabilities, and it smells like a hint that we're merging two worlds that might better be kept separate. 
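
Concretely, the same underlying codec gets named with two different string vocabularies (values illustrative):

```javascript
// The same codec family, named two different ways:
const playbackContentType = 'video/mp4; codecs="avc1.4d001e"'; // <video> / MSE / MC world
const rtcMimeType = 'video/h264';                              // RTP payload mimeType

// A MediaCapabilities query for "transmission" would have to accept one,
// the other, or both -- none of which is obviously clean.
console.log(playbackContentType, rtcMimeType);
```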

Interested to get your thoughts! Thanks for reading. 


Please view or discuss this issue at https://github.com/w3c/webrtc-pc/issues/2241 using your GitHub account

Received on Tuesday, 30 July 2019 00:39:12 UTC