Re: Some thoughts on RTCRtpParameters, RTCRtpEncodingParameters, RTCRtpCodecParameters, RTCRtpCodec, and RTCRtpCapabilities

This is definitely the right direction to me. It leaves plenty of room 
for expansion, and the metadata attached to each property gives clients 
a generic way to learn how a property may be used.

The receiverId is vital: it gives a stable identifier where the SSRC 
can change, and we do want to avoid the unhandled-SSRC case as much as 
possible. Since the SSRC can be per layer, it makes sense that the 
receiver ID might be per layer as well, so incoming packets can be 
demuxed properly without firing unhandled-SSRC events on layers which 
aren't even full tracks.
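
To make that concrete, here is a rough sketch (in TypeScript; the names 
are made up and not part of the proposal) of the matching order an 
engine might apply:

   // Hypothetical sketch of packet-to-receiver matching. Matching on the
   // stable receiverId (recv-appId) is tried before the SSRC, so an SSRC
   // change does not force an unhandled-RTP event.
   interface EncodingMatch {
     receiverId?: string;   // recv-appId from RTCRtpEncodingParameters
     ssrc?: number;         // nullable SSRC from RTCRtpEncodingParameters
     payloadType: number;
   }

   interface Packet { appId?: string; ssrc: number; payloadType: number; }

   function matchPacket(pkt: Packet,
                        encodings: EncodingMatch[]): EncodingMatch | undefined {
     return encodings.find(e => e.receiverId !== undefined && e.receiverId === pkt.appId)
         || encodings.find(e => e.ssrc === pkt.ssrc)
         || encodings.find(e => e.ssrc === undefined && e.payloadType === pkt.payloadType);
     // no match here -> the engine would fire its unhandled-RTP event instead
   }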

As for the "encodingId" (aka layerId) being a string, that makes sense 
to me. It also implies there could be "partial" RTP receivers (and 
senders) which do not emit a full track but carry layering information 
only, since some of the layered packets are sent over an entirely 
different transport. Of course, it's entirely optional whether any 
implementation / engine supports such crazy scenarios, but the API 
becomes flexible enough to allow it.

An example of this is RFC 6190, Section 7.3.3 (http://tools.ietf.org/html/rfc6190):

       a=group:DDP L1 L2 L3
       m=video 20000 RTP/AVP 96 97 98
       a=rtpmap:96 H264/90000
       a=fmtp:96 profile-level-id=4de00a; packetization-mode=0;
        mst-mode=NI-T; sprop-parameter-sets={sps0},{pps0};
       a=rtpmap:97 H264/90000
       a=fmtp:97 profile-level-id=4de00a; packetization-mode=1;
        mst-mode=NI-TC; sprop-parameter-sets={sps0},{pps0};
       a=rtpmap:98 H264/90000
       a=fmtp:98 profile-level-id=4de00a; packetization-mode=2;
        mst-mode=I-C; init-buf-time=156320;
        sprop-parameter-sets={sps0},{pps0};
       a=mid:L1
       m=video 20002 RTP/AVP 99 100
       a=rtpmap:99 H264-SVC/90000
       a=fmtp:99 profile-level-id=53000c; packetization-mode=1;
        mst-mode=NI-T; sprop-parameter-sets={sps1},{pps1};
       a=rtpmap:100 H264-SVC/90000
       a=fmtp:100 profile-level-id=53000c; packetization-mode=2;
        mst-mode=I-C; sprop-parameter-sets={sps1},{pps1};
       a=mid:L2
       a=depend:99 lay L1:96,97; 100 lay L1:98
       m=video 20004 RTP/AVP 101
       a=rtpmap:101 H264-SVC/90000
       a=fmtp:101 profile-level-id=53001F; packetization-mode=1;
        mst-mode=NI-T; sprop-parameter-sets={sps2},{pps2};
       a=mid:L3
       a=depend:101 lay L1:96,97 L2:99


^^ this is a bit nutty for most situations, but entirely possible to define.
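
Just to sketch how that could land in the proposed dictionaries (purely 
illustrative; the grouping below is my own reading of the a=depend 
lines, expressed as TypeScript object literals):

   // Rough mapping of the MST example above onto RTCRtpEncodingParameters.
   // One receiver per m= section; only L3 would surface a full track, while
   // L1 and L2 act as "partial" receivers carrying layering information.
   const l1Encodings = [
     { encodingId: "L1", codecName: "H264" }
   ];
   const l2Encodings = [
     { encodingId: "L2", dependencyEncodingIds: ["L1"], codecName: "H264-SVC" }
   ];
   const l3Encodings = [
     { encodingId: "L3", dependencyEncodingIds: ["L1", "L2"], codecName: "H264-SVC" }
   ];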

-Robin


> Bernard Aboba <Bernard.Aboba@microsoft.com>
> February 25, 2014 at 7:15 PM
> Looking at the proposals made relating to capabilities and parameters, 
> some thoughts have come to mind on how they might be stitched together.
>
> This post refers to the following previous proposals:
>
> A Big Proposal: 
> http://lists.w3.org/Archives/Public/public-orca/2014Feb/0036.html
> Proposal: A more full RTCRtpParameters (RTCRtpCodecParams, RTX, FEC, 
> SSRCs, and the beginnings of simulcast):
> http://lists.w3.org/Archives/Public/public-orca/2014Feb/0021.html
> Proposal RtpListener for unsignalled ssrcs: 
> http://lists.w3.org/Archives/Public/public-orca/2014Feb/0022.html
>
>
> CAPABILITIES VERSUS CONFIGURATION
> --------------------------------------------------------
>
> One question relates to what capabilities are codec-specific versus 
> general capabilities of the underlying RTP stack.
>
> Looking at RFC 5104 Section 7.1, RTCP Feedback message support is 
> negotiated for a given payload type, suggesting that a given 
> codec implementation may have the capability to support a given set of 
> feedback messages, and also can be configured to send and/or receive them.
>
> Somewhat less clearly, looking at draft-even-mmusic-application-token 
> it would appear that an AppId might be configured to be used with a 
> given codec, or even a layer (simulcast or layered coding) within a 
> codec. Therefore it might be useful not only to discover if an 
> implementation supports appId, but also to be able to configure its 
> use within a codec or a layer of a codec.
>
> Another question relates to how to express layer dependencies.
>
> Looking at RFC 5583 Section 6.5 as well as RFC 6190 Section 7.3 
> examples, it appears that dependencies can exist not only within a 
> single RtpReceiver and RtpSender object, but between objects. As an 
> example, in Multi-Session-Transport as shown in RFC 6190 Section 7.3.3 
> a dependency could exist between two RtpReceiver objects.
>
> As a result, it would appear that the layerId might be better 
> described in the form of a DOMString than an int, so that it could be 
> referenced across objects in layerDependencies.
>
> In addition, it would appear to me that the recv-appId would be useful 
> to include within the RTCRtpEncodingParameters along with the 
> (nullable) SSRC. IMHO, it is desirable to be able to avoid triggering 
> an RTCRtpUnhandledRtpEvent if possible. One circumstance where this 
> should not be needed is when configuring an audio codec where only a 
> single stream is expected (e.g. from a mixer). In that circumstance, 
> it could be desirable to configure only the Payload Type, leaving out 
> the SSRC. That way, the RtpReceiver object would handle the incoming 
> RTP stream from the start, regardless of what SSRC was selected, 
> without the need for an RTCRtpUnhandledRtpEvent.
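>
> As a rough illustration (the variable name and the receive() call below 
> are placeholders, not part of this proposal), the single-stream audio 
> case might be configured as:
>
> // Sketch only: an RtpReceiver configured with a payload type but no SSRC,
> // so the first matching stream is handled without an unhandled-RTP event.
> // The codec choice (Opus) is just an example.
> const audioParams = {
>   codecs: [{ payloadType: 96, codec: { name: "opus", clockRate: 48000, numChannels: 2 } }],
>   encodings: [{ active: true, codecName: "opus" }],
>   rtpFeatures: []
> };
> // hypothetical usage: audioReceiver.receive(audioParams);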
>
> In a video scenario such as reception of simulcast, leaving the SSRC 
> out might also be useful, since the desire might be to allow the mixer 
> to switch between resolutions without signaling. In such a situation, 
> the recv-appId might be configured on the RtpReceiver object to 
> identify the potential simulcast streams that could be received (only 
> one at a time), but there would be no need to configure an SSRC (this 
> would be configured automatically in the RtpReceiver object via 
> dynamic binding of the recv-AppId to the SSRC).
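>
> As a sketch (the appId values are invented), the simulcast case might 
> carry one encoding per potential stream, with no SSRCs configured:
>
> // Sketch: two simulcast encodings identified by recv-appId (receiverId);
> // the engine binds appId -> SSRC dynamically as packets arrive.
> const simulcastEncodings = [
>   { encodingId: "hi", receiverId: "2", codecName: "H264" },
>   { encodingId: "lo", receiverId: "3", codecName: "H264" }
> ];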
>
> Also, in thinking about the format parameters that could be discovered 
> or configured on a codec or a layer, it seems like we might be able to 
> simplify things by thinking of them as properties with potential 
> meta-data. As an example, a given H.264/AVC implementation might 
> support the profile-level-id format parameter. However, beyond 
> discovering that the parameter is supported, it would be useful to 
> understand what values are supported within the implementation. So 
> there is some additional information (e.g. metadata)
> that it would be useful to have associated with a given format 
> parameter. In the proposal below, these are called "properties".
>
> Finally, in thinking of the various kinds of RTP capabilities which 
> might be discovered (e.g. header extensions, feedback messages, RTCP 
> report types, etc.), it seems like it would be useful to use a more generic 
> mechanism, rather than creating separate buckets for each type of RTP 
> capability.
>
> Below are some ideas of how things might stitch together.
>
> --------------------------
> DISCOVERY
> --------------------------
>
> //The RTCRtpCapabilities object enables discovery of the supported 
> audio and video codecs as well as codec-specific features, along with 
> generic features of the RTP stack, including features that can be 
> supported within layers (such as layer-specific FEC/RTX).
>
> dictionary RTCRtpCapabilities {
>     sequence<RTCRtpCodec> audioCodecs;
>     sequence<RTCRtpCodec> videoCodecs;
>     sequence<Property>    rtpFeatures;                 // header extensions, RTP features,
>                                                        // RTCP reporting + feedback
>                                                        // mechanisms supported by the engine
>     sequence<Property>    extendedEncodingParameters;  // properties supported in layers
> };
>
>
> --------------------------------
> Note: Because rtpFeatures encompasses header extensions, rtp features, 
> rtcp reporting, feedback mechanisms, etc. it doesn't look like we need 
> RTCRtpFeatures any longer, previously defined as:
>
> enum RTCRtpFeatures {
>     "nack"
> };
>
>
> --------------------------
> SENDER / RECEIVER
> --------------------------
>
> //The RTCRtpParameters object describes the codecs (and codec-specific 
> features), RTP features and encodings utilized by a given RtpReceiver 
> or RtpSender object.
>
> dictionary RTCRtpParameters {
>     sequence<RTCRtpCodecParameters>    codecs;
>     sequence<RTCRtpEncodingParameters> encodings;
>     sequence<Property>                 rtpFeatures;  // applied to the entire sender/receiver
>                                                      // object, e.g. header extensions,
>                                                      // RTCP reporting + feedback mechanisms
> };
>
> //The RTCRtpEncodingParameters object
>
> dictionary RTCRtpEncodingParameters {
>     bool active;
>
>     DOMString? encodingId;  // two objects could share the same encodingId if they are
>                             // referenced in an "or" dependency as defined in RFC 5583
>                             // Section 6.5
>
>     sequence<DOMString>? dependencyEncodingIds;  // just the IDs (resolve to encodingIds
>                                                  // within the same sequence first, then
>                                                  // search globally for matches)
>
>     DOMString? receiverId;  // the application ID configured for use. In an RtpReceiver
>                             // object this corresponds to recv-appId; in an RtpSender
>                             // object it corresponds to the appId from
>                             // draft-even-mmusic-application-token.
>
>     // Note that an SSRC can change due to an SSRC conflict, but the recv-appId will
>     // remain constant, avoiding a potential hiccup.
>     // Also note that the sender can stop sending the appId once it knows that the
>     // recv-appId/SSRC binding has been established (e.g. when an RTCP RR is received
>     // for the SSRC).
>
>     // Note that the SSRC is nullable to allow configuration of an RtpReceiver object
>     // that could receive all SSRCs for a given payload type, or would configure only
>     // the recv-appId.
>
>     unsigned int? ssrc;
>
>     DOMString? codecName;
>
>     double priority;
>     double maxBitrate;
>     double maxQuality;
>     double minQuality;
>
>     double scale;
>
>     double bias;  // 0.0 = strongly favor "resolution", 1.0 = strongly favor "framerate",
>                   // 0.5 = neither
>
>     sequence<Property> extendedParameters;  // properties applied to the layer (e.g. FEC,
>                                             // RTX, layer-specific header extensions,
>                                             // RTCP report + feedback mechanisms for the
>                                             // layer)
> };
>
> //The RTCRtpCodecParameters object binds a codec to a PayloadType.
>
> dictionary RTCRtpCodecParameters {
>     unsigned byte payloadType;
>     RTCRtpCodec   codec;
> };
>
> RTCRtpCodec {
>     string name;
>     unsigned int? clockRate;
>     unsigned int? numChannels;
>     sequence<Property>? formats;      // properties specific to the codec
>     sequence<Property>? rtpFeatures;  // codec-specific RTP features, e.g. header
>                                       // extensions for the codec, RTP/RTCP report +
>                                       // feedback mechanisms specific to the codec
> }
>
>
> //A property can be something that would be signaled, or just 
> something that is configured but not signaled.
> //In addition to the property name and a value (of any type), there 
> can be metaData associated with the property.
>
>
> Property {
>     bool isSignaled;
>     DOMString name;
>     any? value;
>     PropertyMetaData? metaData;
> }
>
> PropertyMetaData {
>     sequence<Attribute> attributes;
> }
>
> Attribute {
>     bool attributeSignaled;
>     string namespace;
>     any? value;
> }
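>
> Picking up the earlier H.264 example, the profile-level-id format 
> parameter and its supported values might be represented along these 
> lines (illustrative only; the "supported-values" namespace is invented):
>
> // Sketch: the H.264 profile-level-id format parameter expressed as a
> // Property, with the supported values surfaced as metadata attributes.
> const profileLevelId = {
>   isSignaled: true,
>   name: "profile-level-id",
>   value: "42e01f",
>   metaData: {
>     attributes: [
>       { attributeSignaled: false, namespace: "supported-values",
>         value: ["42e01f", "4d001f", "640028"] }
>     ]
>   }
> };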
>
>
> --------------------
> SDP Snippets
> --------------------
>
> From RFC 5104 Section 7.1:
>
> v=0
> o=alice 3203093520 3203093520 IN IP4 host.example.com
> s=Media with feedback
> t=0 0
> c=IN IP4 host.example.com
> m=audio 49170 RTP/AVPF 98
> a=rtpmap:98 H263-1998/90000
> a=rtcp-fb:98 nack pli
>
> Here nack and pli are configured for use with H.263-1998 (but would 
> not necessarily be used with another codec).
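>
> In terms of the dictionaries above, that could surface as codec-scoped 
> rtpFeatures, e.g. (sketch only; naming the "nack pli" property this way 
> is my assumption):
>
> // Sketch: nack and nack pli bound to the H263-1998 codec via its
> // rtpFeatures, rather than to the whole sender/receiver.
> const h263 = {
>   name: "H263-1998",
>   clockRate: 90000,
>   rtpFeatures: [
>     { isSignaled: true, name: "nack" },
>     { isSignaled: true, name: "nack pli" }
>   ]
> };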
>
> From draft-even-mmusic-application-token Section 3.1:
>
> a=group:BUNDLE m1 m2
> m=video 49200 RTP/AVP 97,98
> a=rtpmap:98 H264/90000
> a=mid:m1
> a=content:main
> a=rtpmap:97 rtx/90000
> a=fmtp:97 apt=98;rtx-time=3000
> a=appId:2
> a=appId:3
> m=video 49200 RTP/AVP 97,98
> a=rtpmap:98 H264/90000
> a=mid:m2
> a=content:alt
> a=rtpmap:97 rtx/90000
> a=fmtp:97 apt=98;rtx-time=3000
> a=appId:4
> a=appId:5
>
> In the above snippet, it would appear that appIds are being configured 
> for use with different payload types; however, it is not clear which 
> payload type each appId is bound to.
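>
> If the binding were known, the m1 section might map onto encodings 
> roughly as follows (sketch only; treating appId 2 as the H264 stream 
> and appId 3 as its RTX repair stream is an assumption the draft does 
> not confirm):
>
> const m1Encodings = [
>   { receiverId: "2", codecName: "H264" },
>   { receiverId: "3", codecName: "rtx" }
> ];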
>
>
