Re: [media] Moving forward with captions / subtitles from Eric Carlson on 2010-02-16 (public-html-a11y@w3.org from February 2010)

From: Eric Carlson <eric.carlson@apple.com>
Date: Mon, 15 Feb 2010 21:29:48 -0800
To: Philip Jägenstedt <philipj@opera.com>
Cc: Silvia Pfeiffer <silviapfeiffer1@gmail.com>, HTML Accessibility Task Force <public-html-a11y@w3.org>
Message-id: <1F57E298-8023-437A-A26D-357360FEF09A@apple.com>

On Feb 15, 2010, at 9:03 PM, Philip Jägenstedt wrote:

> On Tue, 16 Feb 2010 10:22:34 +0800, Eric Carlson <eric.carlson@apple.com> wrote:
> 
>> 
>>  Yikes, teach me to ignore email for 12 hours :-(
>> 
>> 
>> On Feb 15, 2010, at 12:46 AM, Philip Jägenstedt wrote:
>> 
>>> On Mon, 15 Feb 2010 15:19:09 +0800, Eric Carlson <eric.carlson@apple.com> wrote:
>>> 
>>>> 
>>>> On Feb 14, 2010, at 11:06 PM, Philip Jägenstedt wrote:
>>>>> 
>>>>> I think calling the grouping element <track> is a bad idea when it in fact doesn't specify a track but a group of tracks (each track in <source>).
>>>>> 
>>>> But it does not represent a group of tracks!
>>>> 
>>>> The <track> element represents a single track in the presentation, which uses one of the <source> elements as its source of media data.
>>> 
>>> How would this tie into the MediaTrack API and the MediaTracks collection? It is my understanding that each individual stream in an Ogg or MPEG-4 file would be a MediaTrack.
>>> 
>> 
>>  Yes, exactly.
>> 
>>> Would a <track> or a <source> represent a MediaTrack? If it is <track>, how would one activate a single <source> via the MediaTracks collection? Or is the intention that source selection in <track> completely determine which <source> is used so that the only way of switching between e.g. languages is rearranging the order of <source>s and calling .load() (or similar)?
>>> 
>>  I am proposing that a <track> be represented by a MediaTrack. The UA would select one of the <source> elements, or the "src" attribute on the <track>, and that file would be used as the track's media data.
>> 
>>  As you note, this *is* different from "alternate tracks" in an MPEG-4 or QuickTime file, but it is different by design. If we represent each <source> by a MediaTrack object we will need to load every source, whether it is displayed or not, to answer questions about it. The MediaTrack object in the multi track API proposal has an ellipsis after "enabled" to represent the other track properties we will want to expose:
>> 
>> interface MediaTrack {
>>  readonly attribute DOMString name;
>>  readonly attribute DOMString role;
>>  readonly attribute DOMString type;
>>  readonly attribute DOMString lang;
>>           attribute boolean enabled;
>>  ...
>> };
>> 
>>  Some of these properties can't be answered without loading and parsing a file (e.g. duration), which we shouldn't require for a file that won't be used.
>> 
>>  MPEG-4 and QuickTime files don't have this problem because even if a track's media is external to the movie, the movie file always contains the track metadata, so it is possible to get it without loading/parsing the track data.
> 
> Good point. My thinking is that attributes of MediaTrack that require loading the track would simply be unavailable when the track is not enabled, like e.g. HTMLMediaElement.duration. At least role, type and lang are available from markup though and should be what the "track selection algorithm" operates on.
> 
  One problem with this is that tracks inside of a media file won't have this restriction. Actually, a track added in markup and disabled after its data is loaded won't have this restriction either. This is likely to be very confusing.
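
  To make the inconsistency concrete, here is a minimal sketch. Everything in it is hypothetical: the SketchTrack class, its enable()/disable() methods, and the placeholder duration of 120 are assumptions for illustration, not part of any proposal. It also assumes that enabling a track triggers the load and that disabling it keeps whatever was already parsed.

```javascript
// Hypothetical model, not a real browser API: a track's duration is only
// known once its media data has been loaded and parsed.
class SketchTrack {
  constructor(role, lang) {
    this.role = role;     // available from markup
    this.lang = lang;     // available from markup
    this.enabled = false;
    this.loaded = false;
  }
  enable() {
    this.enabled = true;
    this.loaded = true;   // assumption: enabling triggers the load
  }
  disable() {
    this.enabled = false; // assumption: already-parsed metadata is kept
  }
  get duration() {
    // 120 is a placeholder value; NaN mirrors HTMLMediaElement.duration
    // when no media data is available.
    return this.loaded ? 120 : NaN;
  }
}

const neverEnabled = new SketchTrack("sub", "en");
const disabledAfterLoad = new SketchTrack("sub", "sv");
disabledAfterLoad.enable();
disabledAfterLoad.disable();

// Both tracks are disabled, yet they answer the same question differently.
console.log(neverEnabled.enabled, neverEnabled.duration);
console.log(disabledAfterLoad.enabled, disabledAfterLoad.duration);
```

  A script that only looks at "enabled" cannot tell these two tracks apart, which is the confusing part.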

> With <track><source>, is it at all possible to use the MediaTracks collection to activate tracks or build scripted menus? While not a must-have feature, it would be nice if the same API can be used to operate on both resource-internal tracks and tracks added with markup.
> 
  Yes, I think it is very important that internal and external tracks are represented in exactly the same way. An object in the MediaTracks collection represents the <source> chosen by the resource selection algorithm. In your complex example from earlier, assuming "video.ogv" has one video and one audio track:

    <video src="video.ogv">
        <track role="SUB">
            <source src="subs.en.srt" srclang="en">
            <source src="subs.sv.srt" srclang="sv">
        </track>
        <track role="CC">
            <source src="cc.en.srt" srclang="en">
            <source src="cc.sv.srt" srclang="sv">
        </track>
    </video>

  Every user would have:

	video.tracks(0).role == 'video'
	video.tracks(1).role == 'audio'
	video.tracks(2).role == 'sub'
	video.tracks(3).role == 'cc'
	video.tracks(0).src == 'video.ogv'	// the media is in the movie file
	video.tracks(1).src == 'video.ogv'

  But only users on a Swedish system would have (assuming the first language is chosen if none match the user's system): 

	video.tracks(2).src == 'subs.sv.srt' 
	video.tracks(3).src == 'cc.sv.srt'

  Disabling *any* track is just "video.tracks(n).enabled = false".
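
  The example above can be sketched in script as follows. This is only an illustration under stated assumptions: the data model and the selectSource() and buildTracks() helpers are hypothetical names, the role is lowercased to match the values shown above, every track starts out enabled, and the "tracks" array stands in for the MediaTracks collection.

```javascript
// Hypothetical plain-data model of the markup above; not a real browser API.
const markupTracks = [
  { role: "SUB", sources: [{ src: "subs.en.srt", srclang: "en" },
                           { src: "subs.sv.srt", srclang: "sv" }] },
  { role: "CC",  sources: [{ src: "cc.en.srt", srclang: "en" },
                           { src: "cc.sv.srt", srclang: "sv" }] },
];

// Pick the <source> whose srclang matches the user's language,
// falling back to the first <source> when none match.
function selectSource(sources, userLang) {
  return sources.find((s) => s.srclang === userLang) || sources[0];
}

// Build one flat list: tracks inside the movie file first,
// then one entry per <track> element using its selected <source>.
function buildTracks(movieSrc, internalRoles, markup, userLang) {
  const internal = internalRoles.map((role) => (
    { role, src: movieSrc, enabled: true } // assumption: enabled by default
  ));
  const external = markup.map((t) => {
    const chosen = selectSource(t.sources, userLang);
    return {
      role: t.role.toLowerCase(), // assumption: roles compare in lowercase
      src: chosen.src,
      lang: chosen.srclang,
      enabled: true,              // assumption: enabled by default
    };
  });
  return internal.concat(external);
}

// A user on a Swedish system.
const tracks = buildTracks("video.ogv", ["video", "audio"], markupTracks, "sv");

// Disabling works the same way for internal and external tracks.
tracks[3].enabled = false;

console.log(tracks.map((t) => `${t.role} ${t.src} ${t.enabled}`).join("\n"));
```

  Only one <source> per <track> is ever resolved here, so nothing has to be loaded for the sources that were not selected.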

eric
Received on Tuesday, 16 February 2010 05:30:23 UTC