Re: Tech Discussions on the Multitrack Media (issue-152) from Mark Watson on 2011-02-20 (public-html@w3.org from February 2011)

From: Mark Watson <watsonm@netflix.com>
Date: Sun, 20 Feb 2011 10:11:49 -0800
To: Philip Jägenstedt <philipj@opera.com>
CC: "public-html@w3.org" <public-html@w3.org>
Message-ID: <C1C4D2D3-655E-4150-8ABB-15FE713C5596@netflix.com>



On Feb 20, 2011, at 8:51 AM, "Philip Jägenstedt" <philipj@opera.com> wrote:

> On Sat, 19 Feb 2011 23:35:13 +0100, John Foliot <jfoliot@stanford.edu>  
> wrote:
> 
>> Mark Watson wrote:
>>> 
>>> On Feb 18, 2011, at 2:08 AM, Philip Jägenstedt wrote:
>>>> 
>>>> I don't think we should spend much time making extra in-band video
>>>> tracks work more than barely, if at all, since the extra bandwidth
>>>> needed to have multiple in-band video tracks makes it quite unlikely
>>>> the feature would be used to any greater extent.
>>> 
>>> A track declared within an adaptive streaming manifest (e.g. a DASH
>>> manifest or take-your-pick of various proprietary adaptive streaming
>>> solutions) would be an in-band track but would only be fetched when
>>> actually needed.
>> 
>> This has been an interesting conversation.
>> 
>> Philip, I think we need to be careful about the assumption you made, as
>> from an accessibility best-practices perspective, all supporting
>> media (be it textual or binary) is best included as in-band content, for
>> the very same reason why providing textual (captioning) data in-band is
>> preferable: portability and re-use. Isn't this why we worked on getting
>> the JavaScript API ready early on?
>> 
>> While I concede that the inclusion of sign language interpretation and
>> descriptive audio may seem edge-case compared to the larger body of
>> content envisioned to be on the web, it is important that we ensure we can
>> do this, and do it both well and properly. Thus I think we need to spend
>> as much time as required to ensure we *have* met this requirement, and I
>> am a tad concerned that we suggest that content such as this *not* be
>> treated in the same way as textual supporting content.
> 
> Certainly, accessibility improvements in the form of extra audio and video
> tracks are the main use case here, so failing to achieve that is not an
> option. In those environments where extra video tracks for accessibility
> have already been rolled out, is it really the case that all video tracks
> are muxed in a single file (and sent over a slowish network)? Concrete
> examples of the current state of the art would be much appreciated. It
> seems to me that delivering all video tracks in a single file would
> waste bandwidth to the point that people just won't do it, except for
> situations where a large majority of users are expected to use all of the
> tracks.
> 
> My gut feeling is that most of the time people will want to use separate  
> video tracks to save bandwidth and have some control over how the video  
> tracks are laid out on the page. For this, using multiple <video> elements  
> is a perfect fit. Since we'll most certainly need a solution in that  
> general direction, it would be great if we could get away with *only* that  
> single solution. If that turns out to be impossible then so be it, we'll  
> see where we land after another 1000 mails or so.
> 
>> The idea that a DASH manifest would only fetch this type of content
>> 'on-demand' is intriguing; however does it not presume an active
>> connection to the network? Or would the DASH manifest also be used to
>> 'activate' or expose supporting in-band content such as sign language
>> content, etc. to the user-agent?
> 
> I assume this part was not directed at me, but the answer would interest  
> me.

I am not sure I understand the question. A DASH manifest describes all the media available for a presentation, which can include multiple tracks for language and accessibility and other needs as well as multiple encodings of each track for adaptation and compatibility.

The media player needs to be told which tracks to play and it will download actual media data only for those tracks.
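
To make that concrete, here is a minimal sketch of the idea, not real DASH syntax and not any existing player API. The manifest shape, the field names, the file paths and the planDownloads function are all assumptions made up for illustration. It only shows that a track listed in the manifest costs no bandwidth until it is enabled, and that one of several encodings is chosen per enabled track.

```typescript
// Hypothetical manifest model. Field names are illustrative, not DASH MPD syntax.
interface Encoding {
  bitrateKbps: number;
  url: string;
}

interface TrackDescription {
  id: string;
  kind: "main" | "sign" | "description" | "captions";
  language: string;
  encodings: Encoding[];
}

interface Manifest {
  tracks: TrackDescription[];
}

// For each enabled track, pick the highest-bitrate encoding that fits the
// per-track budget (a simplification), falling back to the lowest one.
// Tracks that are not enabled are skipped entirely, so nothing is fetched for them.
function planDownloads(
  manifest: Manifest,
  enabledIds: Set<string>,
  maxKbpsPerTrack: number
): string[] {
  const urls: string[] = [];
  for (const track of manifest.tracks) {
    if (!enabledIds.has(track.id)) continue;
    const sorted = [...track.encodings].sort(
      (a, b) => a.bitrateKbps - b.bitrateKbps
    );
    if (sorted.length === 0) continue;
    let choice = sorted[0];
    for (const encoding of sorted) {
      if (encoding.bitrateKbps <= maxKbpsPerTrack) choice = encoding;
    }
    urls.push(choice.url);
  }
  return urls;
}

// Example presentation: a main video plus a sign-language video.
const exampleManifest: Manifest = {
  tracks: [
    {
      id: "main-video",
      kind: "main",
      language: "en",
      encodings: [
        { bitrateKbps: 500, url: "main/500k.mp4" },
        { bitrateKbps: 1500, url: "main/1500k.mp4" },
        { bitrateKbps: 3000, url: "main/3000k.mp4" },
      ],
    },
    {
      id: "sign-video",
      kind: "sign",
      language: "ase",
      encodings: [
        { bitrateKbps: 300, url: "sign/300k.mp4" },
        { bitrateKbps: 800, url: "sign/800k.mp4" },
      ],
    },
  ],
};

const mainOnly = planDownloads(exampleManifest, new Set(["main-video"]), 1500);
const withSign = planDownloads(
  exampleManifest,
  new Set(["main-video", "sign-video"]),
  1500
);
console.log(mainOnly, withSign);
```

With only the main video enabled, the sign-language track appears in the manifest but contributes nothing to the download plan; enabling it adds exactly one encoding of that track.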

We should not assume that the web page author and manifest author are the same or that the manifest has been constructed especially for use in an HTML environment.

The web page ought to be able to find out what tracks are available - it cannot know in advance - and enable/disable them and control how they are rendered.

Discovering these in-band tracks and enabling/disabling them is easily covered by the MultiTrack API (whatever version). Controlling how they are rendered seems to be the issue. Perhaps you need to create another video element whose src is somehow one of the in-band tracks of the original one? Or perhaps track elements could be used.

My concern is just that you need to be able to attach the rendering instructions (CSS) *after* discovering the track as an in-band track of the original media resource.
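
A rough sketch of the ordering I have in mind, written as a standalone model rather than against any proposed API. TrackRegistry, DiscoveredTrack, RenderingRule and every method name below are hypothetical; the rendering rule merely stands in for whatever CSS hook ends up being available. The point is only that the page learns of the track first and attaches rendering instructions second.

```typescript
// Hypothetical stand-ins for a media element's track list. This is NOT the
// MultiTrack API; it only models "discover first, style afterwards".
interface DiscoveredTrack {
  id: string;
  kind: string;
  enabled: boolean;
}

// Stand-in for CSS-like rendering instructions.
interface RenderingRule {
  width: string;
  position: "overlay" | "beside";
}

class TrackRegistry {
  private tracks = new Map<string, DiscoveredTrack>();
  private rules = new Map<string, RenderingRule>();
  private listeners: Array<(track: DiscoveredTrack) => void> = [];

  // The page registers interest before it knows which tracks exist.
  onTrackDiscovered(listener: (track: DiscoveredTrack) => void): void {
    this.listeners.push(listener);
  }

  // Called by the (hypothetical) player once the manifest has been parsed.
  announce(track: DiscoveredTrack): void {
    this.tracks.set(track.id, track);
    for (const listener of this.listeners) listener(track);
  }

  // Rendering can only be attached to a track that has already been discovered.
  setRendering(id: string, rule: RenderingRule): void {
    if (!this.tracks.has(id)) throw new Error(`unknown track: ${id}`);
    this.rules.set(id, rule);
  }

  renderingFor(id: string): RenderingRule | undefined {
    return this.rules.get(id);
  }

  isEnabled(id: string): boolean {
    return this.tracks.get(id)?.enabled ?? false;
  }
}

const registry = new TrackRegistry();

// Page script: it cannot know the track ids in advance, so it reacts to discovery.
registry.onTrackDiscovered((track) => {
  if (track.kind === "sign") {
    track.enabled = true;
    registry.setRendering(track.id, { width: "25%", position: "beside" });
  }
});

// Player side: tracks become known only after the manifest is loaded.
registry.announce({ id: "main-1", kind: "main", enabled: true });
registry.announce({ id: "sign-1", kind: "sign", enabled: false });

console.log(registry.renderingFor("sign-1"), registry.isEnabled("sign-1"));
```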

...Mark
> 
> -- 
> Philip Jägenstedt
> Core Developer
> Opera Software
> 
Received on Sunday, 20 February 2011 18:16:31 UTC