Re: Proposal: Media streams, media stream tracks and channels

On 11/10/2011 03:30 PM, Francois Daoust wrote:
> On 11/08/2011 01:15 AM, Harald Alvestrand wrote:
>> Discharging a task taken on at the TPAC meeting, some possible words 
>> on what a media stream, a media stream track or a channel is....
>>
>> This is based on the introduction section in section 3.1 of the 
>> current API editors' draft.
>>
>> The MediaStream
>> <http://dev.w3.org/2011/webrtc/editor/webrtc.html#mediastream> interface
>> is used to represent streams of media data, typically (but not
>> necessarily) of audio and/or video content, e.g. from a local camera
>> or a remote site. The data from a MediaStream
>> <http://dev.w3.org/2011/webrtc/editor/webrtc.html#mediastream> object
>> does not necessarily have a canonical binary form; for example, it
>> could just be "the video currently coming from the user's video
>> camera". This allows user agents to manipulate media streams in
>> whatever fashion is most suitable on the user's platform.
>>
>> Each MediaStream
>> <http://dev.w3.org/2011/webrtc/editor/webrtc.html#mediastream> object
>> can represent zero or more tracks, in particular audio and video
>> tracks. Tracks can contain multiple channels of parallel data; for
>> example, a single audio track could have nine channels of audio data
>> to represent a 7.2 surround sound signal.
>>
>> <new text below>
>>
>> All tracks in a MediaStream are presumed to be synchronized at some 
>> level. Different MediaStreams may or may not be synchronized.
>>
>> Each track represented by a MediaStream
>> <http://dev.w3.org/2011/webrtc/editor/webrtc.html#mediastream> object
>> has a corresponding MediaStreamTrack
>> <http://dev.w3.org/2011/webrtc/editor/webrtc.html#mediastreamtrack> object.
>>
>> A MediaStreamTrack represents content comprising one or more
>> channels, where the channels have a defined, well-known relationship
>> to each other (such as a stereo or 5.1 audio signal), and may be
>> encoded together for transmission as, for instance, an RTP payload type.
>
> To ensure I get things right...
>
> With this definition left and right channels of a stereo audio signal 
> can be equally represented as:
> 1) two channels within a MediaStreamTrack.
> 2) two MediaStreamTrack objects.

Well... those two have different semantics.
In the first case, the API itself tells you that they are the left and 
right channels of a stereo signal.
In the second case, you need metadata (which is not part of the spec) to 
establish that; the two tracks might just as well be the audio feeds 
from the lectern mike and the room mike in a lecture hall.
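
To make the difference concrete, here is a rough sketch of what a 
consumer of the stream would see in each case. It assumes the 
audioTracks list and the label attribute from the current editors' 
draft; the label values are invented for illustration:

    // Case 1: one stereo track. The pairing is intrinsic: both
    // channels live inside a single MediaStreamTrack, and the API
    // itself guarantees their relationship.
    var stereo = stream.audioTracks.item(0);

    // Case 2: two mono tracks. The API expresses no relationship
    // between them; whether they form a stereo pair or are the
    // lectern mike and the room mike must be decided from metadata
    // outside this specification, e.g. an application-level
    // labeling convention:
    var a = stream.audioTracks.item(0);
    var b = stream.audioTracks.item(1);
    var isStereoPair = (a.label === "left" && b.label === "right");
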
>
> The first case is the most likely representation given the well-known 
> relationship between both channels in a stereo audio signal. The two 
> channels may or may not be encoded together for transmission.
>
> The second case is not the most likely representation but is still 
> possible. In this case, can the two MediaStreamTrack objects still be 
> encoded together for transmission? I would assume that it is not 
> possible. Would it make sense to clarify that a MediaStreamTrack 
> object is always encoded on its own for transmission (or is it 
> self-evident)?
That part of the spec hasn't been written yet, but I think that if a 
MediaStreamTrack is presented to a PeerConnection, it should be encoded 
on its own. This is, however, a property of the attachment of the 
MediaStreamTrack to the PeerConnection, not of the track itself; one 
could imagine scenarios where the concept of "encoded for transmission" 
simply does not enter the picture.
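
As a sketch of where that property would live, assuming the 
PeerConnection constructor and addStream() call from the current draft 
(the configuration string and signaling callback here are just 
placeholders):

    // The stream is attached as a whole; whether its tracks are
    // encoded together or separately is decided at this attachment
    // point, not recorded on the tracks themselves.
    var pc = new PeerConnection(configuration, signalingCallback);
    pc.addStream(stream);

    // By contrast, a stream that is only rendered locally (e.g.
    // handed to a <video> element) is never "encoded for
    // transmission" at all, so the concept does not apply to the
    // track as such.
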
>
> Francois.
>
>
>>
>> A channel is the smallest unit considered in this API specification.
>>
>> <end new text>
>>
>> Would including this text help add any clarity after our discussions 
>> at TPAC?
>>
>> Query: What other examples of embedded channels would be useful to 
>> add? Are there good examples of audio or non-audio data embedded in 
>> video tracks?
>>
>> Harald

Received on Thursday, 10 November 2011 16:23:05 UTC