Streams, ICE, and signaling

The existing specification indicates that MediaStreams have a 1:1
relationship with ICE transports, as evidenced by the mechanism of signaling
stream removal by sending an ICE candidate with port=0. Given the desire to
multiplex multiple sources and content types over a single RTP session and
ICE transport, this association is limiting. While ICE exists as an
underlying transport protocol, and requires information that must be
exchanged in signaling, higher level functions such as addition or removal
of streams should not affect ICE directly. These occurrences may require ICE
to establish additional connections, if multiplexing is not enabled, but
that detail does not need to be exposed through the API. Below I propose a
different way of handling streams in PeerConnection.

As background for my proposal, the following associations are assumed
(paraphrasing a similar ontology from Harald):
- PeerConnection holds 1-N MediaStreams, each containing 1-N MediaTracks.
- Tracks within a MediaStream are synchronized, and therefore share a single
RTCP CNAME when sent as RTP.
- MediaStreams therefore have a 1:1 association with CNAMEs. This CNAME may
be exposed to the application, to let it identify streams.
- MediaTracks are identified by 1-N SSRCs; this number is typically 1, but
may be larger in the cases where SSRC grouping, as outlined in RFC 5576, is
employed.
- Depending on whether RTP muxing is enabled or not, PeerConnection will use
1-N RTP sessions. This is an internal detail, and does not need to be
exposed to the application.
- RTP sessions are always sendrecv; individual sources are, naturally,
sendonly.
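
The associations above can be sketched as a small data model. This is only
an illustration, not proposed API: the class names mirror the terms used in
the list, while the field names and the add_stream and cname_for_ssrc
helpers are hypothetical and exist purely to show the 1-N relationships.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class MediaTrack:
    kind: str         # e.g. "audio" or "video"
    ssrcs: List[int]  # typically one SSRC; more when RFC 5576 grouping is used


@dataclass
class MediaStream:
    cname: str  # one CNAME per stream, shared by all of its tracks
    tracks: List[MediaTrack] = field(default_factory=list)


@dataclass
class PeerConnection:
    # Streams keyed by CNAME (hypothetical layout, assumed for illustration).
    streams: Dict[str, MediaStream] = field(default_factory=dict)

    def add_stream(self, stream: MediaStream) -> None:
        # A CNAME identifies exactly one stream, so duplicates are rejected.
        if stream.cname in self.streams:
            raise ValueError("CNAME already in use: " + stream.cname)
        self.streams[stream.cname] = stream

    def cname_for_ssrc(self, ssrc: int) -> Optional[str]:
        # Walk stream -> track -> SSRC to find the stream that owns an SSRC.
        for stream in self.streams.values():
            for track in stream.tracks:
                if ssrc in track.ssrcs:
                    return stream.cname
        return None
```

Note that nothing in this model refers to ICE or to the number of RTP
sessions, which is the point of keeping those as internal details.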

From these associations, we can say the following:
- A MediaStream's label directly corresponds to the CNAME used for its
tracks. This CNAME should follow the short-term CNAME rules from RFC 6222.
- Addition of a MediaStream, or MediaTrack to a MediaStream, causes new SDP
to be generated according to RFC 5576, indicating the SSRC and CNAME of the
new track.
- Removal of a MediaStream likewise causes new SDP to be generated, one that
no longer contains that stream's CNAME.
- Demux of incoming RTP into the appropriate tracks is done by SSRC, and the
associations are known due to the above signaling.
- This mechanism works identically regardless of whether we have separate
RTP sessions for each content type, or a single multiplexed session.
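
To make the signaling and demux steps concrete, here is a minimal sketch. It
assumes the RFC 5576 source attribute form "a=ssrc:<ssrc-id> cname:<value>";
the function names are hypothetical, and the parser ignores every other SDP
line, so it is not a full SDP implementation.

```python
from typing import Dict, List, Optional


def ssrc_attribute_lines(cname: str, ssrcs: List[int]) -> List[str]:
    # One RFC 5576 source attribute line per SSRC of the added track.
    return ["a=ssrc:%d cname:%s" % (ssrc, cname) for ssrc in ssrcs]


def parse_ssrc_cnames(sdp: str) -> Dict[int, str]:
    # Build the SSRC -> CNAME table the receiver needs for demux.
    table: Dict[int, str] = {}
    for line in sdp.splitlines():
        line = line.strip()
        if not line.startswith("a=ssrc:"):
            continue
        ssrc_part, _, attribute = line[len("a=ssrc:"):].partition(" ")
        name, _, value = attribute.partition(":")
        if name == "cname" and value:
            table[int(ssrc_part)] = value
    return table


def demux(table: Dict[int, str], ssrc: int) -> Optional[str]:
    # Map an incoming packet's SSRC to the stream (CNAME) it belongs to;
    # None means the SSRC was never signaled.
    return table.get(ssrc)
```

The lookup depends only on the SSRC, so the same table serves both the
separate-session case and the multiplexed case.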

This mechanism not only simplifies multiplexing audio and video over a
single RTP session, but it also cleanly allows for multiplexing multiple
participants (e.g. in a centralized conferencing scenario).

I also think we should remove the notion of "ICE Agents" from the spec, at
least with regard to stream processing, because ICE is no longer involved in
these activities. It probably makes more sense to simply refer to the
PeerConnection itself.
e.g. changing
"When a PeerConnection ICE Agent finds that a stream from the remote peer
has been removed (its port has been set to zero in a media description sent
on the signaling channel), the user agent must follow these steps:"
to
"When a PeerConnection finds that a stream from the remote peer has been
removed (it has received a signaling message that no longer contains the
CNAME that identifies this stream), the user agent must follow these
steps:"
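
The removal check in the reworded text could look roughly like the sketch
below. The function names are hypothetical, and it assumes CNAMEs are
carried in RFC 5576 "a=ssrc:<ssrc-id> cname:<value>" lines; a stream counts
as removed when its CNAME appears in the previous signaling message but not
in the current one.

```python
from typing import Set


def cnames_in(sdp: str) -> Set[str]:
    # Collect every CNAME announced in a=ssrc lines of one SDP blob.
    cnames: Set[str] = set()
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("a=ssrc:") and " cname:" in line:
            cnames.add(line.split(" cname:", 1)[1])
    return cnames


def removed_streams(previous_sdp: str, current_sdp: str) -> Set[str]:
    # CNAMEs present before but absent now identify the removed streams.
    return cnames_in(previous_sdp) - cnames_in(current_sdp)
```

No port value is inspected here, so the check is independent of how many ICE
transports are in use.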

Thoughts?

Received on Friday, 2 September 2011 21:51:57 UTC