- From: Harald Alvestrand <harald@alvestrand.no>
- Date: Fri, 06 Sep 2013 11:17:28 +0200
- To: public-webrtc@w3.org
On 09/06/2013 10:33 AM, Stefan Håkansson LK wrote:
> On 2013-09-05 19:33, Martin Thomson wrote:
>> There was a question about how to do simulcast on the call. Here's
>> how it might be possible to do simulcast without additional API
>> surface.
>>
>> 1. Acquire original stream containing one video track.
>> 2. Clone the track and rescale it.
>> 3. Assemble a new stream containing the original and the rescaled track.
>> 4. Send the stream.
>> 5. At the receiver, play the video stream.
>>
>> That's the user part, now for the under-the-covers stuff:
>
> One use of simulcast I had in mind was for the usual multiparty, with
> central node, conferencing case where the active speaker is shown in a
> large video window (with thumbnail videos for others).
>
> For this case I think we already have the basic things needed:
>
> * each participant sends high and low resolution video of the same scene
>   (as you outline above)
> * the central node forwards the low, or high, resolution version based
>   on the active speaker decision
>
> What is missing is the possibility to stop sending the high (or low)
> resolution video from the end-point to the central node if it is not
> forwarded to anyone. This would basically be to save transmission, and
> we would need pause/resume, but as others have pointed out a video track
> can be disabled (which would lead to encoding blackness), which also
> saves a lot of bits.
>
> To handle the case you describe below we would need to add some kind of
> metadata to the track, but it does not seem that hard to do.

If we want to allow simulcast to be implemented at the application level,
it seems to me that signalling which tracks should be disabled at the
relay is also an application-level issue, and doesn't need
standardization. As long as the communicating participants have the
identifiers they need to identify the tracks and streams involved
(<msid...>), they can send metadata outside of any standardized
interfaces.
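For what it's worth, the five steps Martin lists might look something like this in script. This is only a sketch: it assumes a track clone() method and an applyConstraints() call for rescaling the clone, and the names may not match whatever the spec ends up with.

```javascript
// Sketch of the five steps; clone() and applyConstraints() as the
// rescaling mechanism are assumptions, not settled API.
async function buildSimulcastStream(pc) {
  // 1. Acquire original stream containing one video track.
  const original = await navigator.mediaDevices.getUserMedia({ video: true });
  const hiTrack = original.getVideoTracks()[0];

  // 2. Clone the track and rescale it.
  const loTrack = hiTrack.clone();
  await loTrack.applyConstraints({ width: 320, height: 180 });

  // 3. Assemble a new stream containing the original and the rescaled track.
  const simulcast = new MediaStream([hiTrack, loTrack]);

  // 4. Send the stream (the receiver just plays it, step 5).
  simulcast.getTracks().forEach((t) => pc.addTrack(t, simulcast));
  return simulcast;
}
```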
>> I know we discussed the rendering of multiple video tracks in the
>> past, but it's not possible to read the following documents and reach
>> any sensible conclusions:
>> http://dev.w3.org/2011/webrtc/editor/getusermedia.html
>> http://www.w3.org/TR/html5/embedded-content-0.html#concept-media-load-resource
>>
>> What needs to happen in this case is to ensure that the two video
>> tracks are folded together, with the higher "quality" version being
>> displayed and the lower "quality" version being used to fill in any
>> gaps that might appear in the higher "quality" one.
>>
>> That depends on the <video> element being able to identify the tracks
>> as being equivalent, and possibly being able to identify which is the
>> higher quality. This is where something like the srcname proposal
>> could be useful
>> (http://tools.ietf.org/html/draft-westerlund-avtext-rtcp-sdes-srcname-02).

I'm not sure srcname helps at all. We already know that the video tracks
are in the same stream, and that both are enabled. If simulcast is
implemented as an application-layer function, the user agent can treat
these as versions of the same stream.

The completely manual, JS-driven approach is to wait for the "error"
event described under "if the media data is corrupted", and use that to
disable the current video track and enable a backup track.

What we could do, hypothetically, to automate this is to suggest a
change to the HTML5 media load algorithm - add a step to video playback
that says something like: "If multiple video streams are present in the
resource, and the currently selected video stream does not provide data
that allows a picture to be rendered at that time, the user agent may
switch to the next enabled video stream in the resource for the duration
of the lack of data".

The unsolved problem here is making sure the user agent picks the
streams in the right order.
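The completely manual approach above could be sketched roughly like this (assuming the application holds references to both tracks and simply flips their enabled flags when the media element reports an error):

```javascript
// Manual fallback sketch: start on the high-quality track; on a media
// error, disable it and enable the backup (low-quality) track instead.
function installManualFallback(video, hiTrack, loTrack) {
  hiTrack.enabled = true;
  loTrack.enabled = false;
  video.addEventListener('error', () => {
    // "If the media data is corrupted" -> switch to the backup track.
    hiTrack.enabled = false;
    loTrack.enabled = true;
  });
}
```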
>> The only missing piece is exposing metadata on tracks such that this
>> behaviour is discoverable. Adding an attribute on tracks (srcname,
>> perhaps) could provide a hook for triggering the folding behaviour
>> I'm talking about.
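As noted above, the application-level alternative to a standardized attribute could be as simple as sending a small description over the application's own signalling channel. A sketch (the field names here are made up for illustration, and the stream/track identifiers are whatever <msid...> gives the participants):

```javascript
// App-level alternative to a standardized srcname attribute: describe
// which tracks are versions of the same source, in quality order, as
// plain metadata the application exchanges itself.
function describeSimulcastTracks(streamId, trackIdsHighToLow) {
  return {
    msid: streamId,
    // Listed from highest to lowest quality; the receiver can use this
    // order to pick the track to display and the backup(s).
    alternatives: trackIdsHighToLow.map((id, rank) => ({ trackId: id, rank })),
  };
}
```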
Received on Friday, 6 September 2013 09:17:58 UTC