Re: Tech Discussions on the Multitrack Media (issue-152)

On Feb 10, 2011, at 12:20 PM, Maciej Stachowiak wrote:

> 
> On Feb 10, 2011, at 11:26 AM, Eric Carlson wrote:
> 
>> 
>>   I agree with Mark: we need to make it possible for script to discover and configure non-text tracks internal to the media resource.
>> 
>>   The current <track> API allows this for in-band data that "the user agent recognises and supports as being equivalent to a text track" [1], so I think we should extend <track> to support other media types instead of creating a new mechanism or element type. This can be done with a combination of options 2 and 3 - generalizing <track> to allow the inclusion of external audio and video, and accommodating multiple media formats and configurations with <source> elements, as we do for <audio> and <video>.
> 
> It seems like option 7 could also work with this model, since auxiliary media resources are still referenced via a <track> element.
> 
  Option 7 will work for including alternate audio and video encodings, but it won't help with alternate caption file formats. We could pull in external audio and video from other media elements and use <source> elements inside <track> to support multiple caption formats, but that is inconsistent and I think it will be confusing for authors.

eric


> 
>> 
>>   Here is the example from the multi-track wiki page with multiple formats for the audio description and sign language tracks:
>> 
>>    <video id="v1" poster="video.png" controls>
>>        <!-- primary content -->
>>        <source src="video.webm" type="video/webm">
>>        <source src="video.mp4" type="video/mp4">
>> 
>>        <!-- pre-recorded audio descriptions -->
>>        <track kind="descriptions" type="audio/ogg" srclang="en" label="English Audio Description">
>>            <source src="audesc.ogg" type="audio/ogg">
>>            <source src="audesc.mp3" type="audio/mpeg">
>>        </track>
>> 
>>        <!-- sign language overlay -->
>>        <track kind="signings" type="video/webm" srclang="asl" label="American Sign Language">
>>            <source src="signlang.webm" type="video/webm">
>>            <source src="signlang.mp4" type="video/mp4">
>>        </track>
>>    </video>
>> 
>>   Allowing <source> inside <track> also makes it possible to include alternate caption formats, e.g.:
>> 
>>    <video id="v1" poster="video.png" controls>
>>        <!-- primary content -->
>>        <source src="video.webm" type="video/webm">
>>        <source src="video.mp4" type="video/mp4">
>> 
>>        <!-- captions -->
>>        <track kind="captions" type="text/vtt" srclang="en" label="Captions">
>>            <source src="captions.vtt" type="text/vtt">
>>            <source src="captions.xml" type="application/ttml+xml">
>>        </track>
>>    </video>
>> 
>>   Unlike option 3, this does not require new interfaces, but it will probably require a new attribute on <track> so it is possible to determine the media type. It will also require a "currentSrc" attribute so it is possible to determine which source was chosen.
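
To illustrate the selection logic this implies (a sketch only: the `currentSrc` result stands for the attribute proposed above, which does not exist yet, and `canPlayType` is stubbed here so the example is self-contained):

```javascript
// Sketch of the <source> selection a UA would perform inside <track>,
// mirroring how <audio>/<video> pick among their <source> children.
// The "currentSrc" exposure is the proposal above, not an existing API.

// Stand-in for HTMLMediaElement.canPlayType(): pretend only WebVTT is supported.
function canPlayType(mimeType) {
  return mimeType === "text/vtt" ? "probably" : "";
}

// Return the src of the first <source> whose MIME type is supported.
function selectTrackSource(sources) {
  for (const source of sources) {
    if (canPlayType(source.type) !== "") {
      return source.src; // would be exposed to script as track.currentSrc
    }
  }
  return null; // no supported format found
}

const captionSources = [
  { src: "captions.vtt", type: "text/vtt" },
  { src: "captions.xml", type: "application/ttml+xml" },
];

console.log(selectTrackSource(captionSources)); // "captions.vtt"
```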
>> 
>>   I will add this option to the wiki.
>> 
>>   I also think that it would be useful to be able to synchronize multiple media elements in a page, but I see this as an additional requirement. Option 6 allows separate media elements to be synchronized, but it does not allow the discovery and configuration of in-band audio and video tracks. It will, however, work with the option I have outlined above.
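
As a rough sketch of what option-6-style synchronization would involve in script today (the element wiring and drift threshold are hypothetical; a real implementation would also have to handle seeking, rate changes, and buffering stalls):

```javascript
// Minimal sketch of keeping an auxiliary media element (e.g. a sign-language
// video) in step with a primary one using only existing media events.
const DRIFT_TOLERANCE = 0.25; // seconds of drift tolerated before reseeking

function syncTracks(primary, auxiliary) {
  primary.addEventListener("play", () => auxiliary.play());
  primary.addEventListener("pause", () => auxiliary.pause());
  primary.addEventListener("timeupdate", () => {
    if (Math.abs(auxiliary.currentTime - primary.currentTime) > DRIFT_TOLERANCE) {
      auxiliary.currentTime = primary.currentTime;
    }
  });
}

// Usage, in a page with two <video> elements:
//   syncTracks(document.getElementById("v1"), document.getElementById("sign"));
```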
>> 
>> eric
>> 
>> [1] http://www.w3.org/TR/html5/video.html#sourcing-in-band-text-tracks
>> 
>> 
>> On Feb 10, 2011, at 5:47 AM, Mark Watson wrote:
>> 
>>> Hi everyone,
>>> 
>>> I have a couple of comments on this proposal, but first, since this is my first post to this list I should introduce myself. I am representing Netflix - we joined W3C just this week. We are interested in ensuring that a streaming service like ours could in future be supported by HTML5.
>>> 
>>> One thing we are interested in is support for multiple languages, for both audio and text tracks and therefore a Javascript API to discover and select amongst those tracks.
>>> 
>>> We use a form of HTTP adaptive streaming, which, if translated to HTML5, would mean providing the URL of a manifest to the <video> element, as in Option (1) on the wiki page. But there is also another case where there are multiple tracks and no HTML markup: when a single ordinary media file contains multiple tracks, e.g. an mp4 file with multiple audio language tracks.
>>> 
>>> The distinction between TextTrack and MediaTrack in the API under option (1) seems strange to me. Text is just another kind of media, so shouldn't the kind for each track be ( Audio | Video | Text ) rather than ( Media | Text ), where Media = ( Audio | Video )? [This is how it is framed in option 3, albeit one level up.]
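
The two framings could be sketched as follows (none of these names are a real API; they are illustrative only):

```javascript
// Option (1) framing: text tracks are special-cased.
//   track.kind is one of { "media", "text" }, where media = audio | video
//
// The suggested flat framing: text is just another kind of media.
const KINDS = ["audio", "video", "text"];

function isKnownKind(kind) {
  return KINDS.includes(kind);
}

console.log(isKnownKind("text")); // true under the flat (audio|video|text) model
```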
>>> 
>>> I don't have a strong opinion on the markup aspect, but I think the first side condition is important: that the API be the same whether the tracks come from explicit markup, are present within a multiplexed file, or are described by a manifest. If I read it rightly, this condition is not met in (3), (4), and (6), right?
>>> 
>>> Best,
>>> 
>>> Mark Watson
>>> watsonm@netflix.com
>>> On Feb 9, 2011, at 11:56 PM, Silvia Pfeiffer wrote:
>>> 
>>>> Everyone,
>>>> 
>>>> Your input on this is requested.
>>>> 
>>>> Issue-152 is asking for change proposals for a solution for media
>>>> resources that have more than one audio and one video track
>>>> associated with them. The spec addresses this need only for text
>>>> tracks such as captions and subtitles [1]. But we haven't solved
>>>> this problem for additional audio and video tracks such as audio
>>>> descriptions, sign language video, and dubbed audio tracks.
>>>> 
>>>> In the accessibility task force we have discussed different options
>>>> over the last few months. However, the number of people providing
>>>> technical input on media-related issues in the TF is fairly limited,
>>>> so we have decided to use the time remaining until a change proposal
>>>> for issue-152 is due (21st February [2]) to open the discussion to
>>>> the larger HTML working group, in the hope of hearing more opinions.
>>>> 
>>>> Past accessibility task force discussions [3][4] have exposed a number
>>>> of possible markup/API solutions.
>>>> 
>>>> The different approaches are listed at
>>>> http://www.w3.org/WAI/PF/HTML/wiki/Media_Multitrack_Media_API . This
>>>> may be an incomplete list, but it's a start. If you have any better
>>>> ideas, do speak up.
>>>> 
>>>> Which approach do people favor and why?
>>>> 
>>>> Cheers,
>>>> Silvia.
>>>> 
>>>> [1] http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#the-track-element
>>>> [2] http://lists.w3.org/Archives/Public/public-html/2011Jan/0198.html
>>>> [3] http://lists.w3.org/Archives/Public/public-html-a11y/2010Oct/0520.html
>>>> [4] http://lists.w3.org/Archives/Public/public-html-a11y/2011Feb/0057.html
>>>> 
>>>> 
>>> 
>> 
> 

Received on Friday, 11 February 2011 16:00:12 UTC