Re: [media] Moving forward with captions / subtitles

From: Eric Carlson <eric.carlson@apple.com>
Date: Sat, 13 Feb 2010 08:42:50 -0800
Cc: Silvia Pfeiffer <silviapfeiffer1@gmail.com>, HTML Accessibility Task Force <public-html-a11y@w3.org>
Message-id: <E9D0837C-5724-4E58-926B-92115BC6C678@apple.com>
To: Philip Jägenstedt <philipj@opera.com>

On Feb 13, 2010, at 7:01 AM, Philip Jägenstedt wrote:

> On Sat, 13 Feb 2010 21:04:36 +0800, Silvia Pfeiffer <silviapfeiffer1@gmail.com> wrote:
> 
>> 
>> On Sat, Feb 13, 2010 at 9:19 PM, Philip Jägenstedt <philipj@opera.com> wrote:
>>> 
>>> I think the main outstanding problem is still a good name for the grouping
>>> element. <textassoc> isn't great either because it's a bit difficult to
>>> spell. Perhaps <track>? Even though several browser vendors are skeptical of
>>> syncing audio/video from two different resources, it would make it spec-wise
>>> possible to allow it in the future. For now, it's text-only though:
>>> 
>>> <video src="video.ogg">
>>> <track src="captions.srt">
>>> </video>
>>> 
>>> or using <source> in the same way as for <video>:
>>> 
>>> <video src="video.ogg">
>>> <track>
>>>   <source type="text/srt" src="captions.srt" lang="en">
>>>   <source type="text/srt" src="zimu.srt" lang="zh">
>>> </track>
>>> </video>
>>> 
>>> Note that the resource selection algorithm is not a limitation here, because
>>> we can freely define how <track>s are activated and how to select between
>>> alternative <source>s in a <track>. Perhaps we need a new boolean attribute
>>> like enabled="" to enable a track by default.
>> 
>> In general I like the idea of calling it <track>. However, I have a
>> slight issue with it because they are only virtual tracks - normally
>> only the "tracks" that are multiplexed together inside an encapsulation
>> format are called tracks. This would make the content inside a source
>> element called tracks, but also the parallel external files. I predict
>> confusion.
> 
> I think it would be good to treat them as the same as far as possible, including in the DOM API and MediaTracks collection. That way the same user JavaScript could operate on the resource without caring if the tracks are resource-internal or added using <track>.
> 

  I agree that external track references should be treated the same as tracks in the media file. There is no distinction in the final presentation to the user, so I think they should also be treated the same by the UA when choosing among alternates and when presenting the DOM API. 
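
  As a rough illustration of that uniform treatment, here is a minimal JavaScript sketch. No MediaTracks API is defined yet, so the `mergeTracks` and `tracksByRole` functions, the `origin` field, and the shape of the track objects are all hypothetical. The sketch only shows how in-file and external tracks might be exposed to script as one list.

```javascript
// Hypothetical sketch: present in-file and external tracks as one list.
// The object shape ({ role, lang, src }) is an assumption, not a specified API.
function mergeTracks(internalTracks, externalTracks) {
  // Record where each track came from, but keep the same shape for both kinds.
  const tag = (origin) => (track) => ({ ...track, origin });
  return [
    ...internalTracks.map(tag("internal")),
    ...externalTracks.map(tag("external")),
  ];
}

// Script can filter by role without checking where a track came from.
function tracksByRole(tracks, role) {
  return tracks.filter((track) => track.role === role);
}

const allTracks = mergeTracks(
  [{ role: "caption", lang: "en" }],
  [{ role: "caption", lang: "zh", src: "zh-captions.srt" }]
);
const captions = tracksByRole(allTracks, "caption");
```

  Here `captions` holds both the in-file English track and the external Chinese one, and the calling code never has to branch on `origin`.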

  I like the idea of reusing <source> to list alternate tracks. We should also include the "media" attribute; it would help in defining the selection criteria. Can we use "media" instead of adding a "lang" attribute?


>> However, I must say I really like the idea of making it independent of
>> "text", i.e. leaving the possibility open to add "tracks" of audio or
>> video in future.
>> 
>> I'd be happy for something that essentially means "external parallel track".
> 
> Considering how many different names we have already come up with, I doubt <track> will be the last :) Brainstorm away!
> 
  To me, the singular "track" implies that only one <source> will be chosen, which will not be true in all cases. Maybe <tracks>?


>>> role="" is fine, but I'd like to see more ideas on what UAs should do with
>>> it.
>> 
>> The thought is to use it not just for captions, subtitles, and textual
>> audio descriptions, but also for karaoke, lyrics, chapters, timed
>> comments, timed metadata, and other such time-aligned text and
>> annotations. There are examples with lyrics
>> (http://svg-wow.org/audio/animated-lyrics.html, and
>> http://annodex.net/~silvia/itext/chocolate_rain.html), and chapters
>> (http://annodex.net/~silvia/itext/elephant_no_skin_v2.html). I'm sure
>> we will come up with more similar examples.
> 
> Yes, but is it expected that the UA should do something with the attribute, like make context menus based on it? Or should it be part of the track selection algorithm? (Where "track selection algorithm" does not exist yet, but is what will select which tracks are enabled by default based on... language and such?)
> 
  I think the selection of alternates is an important point. Some media container formats (e.g. QuickTime and MPEG-4) allow an author to mark tracks as being part of an "alternate group". This instructs the media engine to enable only one track in the group, based on a condition on the user's machine when the file is opened for playback. For example, a movie can have subtitle tracks and chapter tracks in multiple languages, but only one of each is rendered when the movie plays.

  We need to support this use case with external "tracks", and we need to define the selection algorithm when a file has both internal and external tracks.

  We also need to define a mechanism to mark tracks as being part of an alternate group. Is an attribute on <source> enough?

    <tracks>
        <source type="text/srt" src="en-captions.srt" lang="en" role="caption">
        <source type="text/srt" src="zh-captions.srt" lang="zh" role="caption">
    
        <source type="text/srt" src="en-chapters.srt" lang="en" role="chapters">
        <source type="text/srt" src="zh-chapters.srt" lang="zh" role="chapters">
    </tracks>
    
Or should we have a grouping element like Silvia had in her early proposal?

    <tracks>
        <track role="caption">
            <source type="text/srt" src="en-captions.srt" lang="en">
            <source type="text/srt" src="zh-captions.srt" lang="zh">
        </track>
    
        <track role="chapters">
            <source type="text/srt" src="en-chapters.srt" lang="en">
            <source type="text/srt" src="zh-chapters.srt" lang="zh">
        </track>
    </tracks>

  I hesitate to define yet another element, but I think the markup in a complex case like Silvia's Elephants Dream sample, http://annodex.net/~silvia/itext/elephant_no_skin_v2.html, is clearer because of it. On the other hand, will complex cases like this be common enough that we need it?
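
  To make the alternate-group idea concrete, here is a minimal JavaScript sketch of one possible selection step. It is not a proposed algorithm: the `selectAlternates` function and the track object shape are hypothetical, and it assumes that tracks sharing a role form one alternate group, that the user's languages are given in preference order, and that the first track in document order is the fallback.

```javascript
// Hypothetical sketch: treat each role as an alternate group and enable
// exactly one track per group, preferring the user's languages in order.
function selectAlternates(tracks, preferredLangs) {
  // Group tracks by role; a Map keeps groups in first-seen order.
  const groups = new Map();
  for (const track of tracks) {
    if (!groups.has(track.role)) groups.set(track.role, []);
    groups.get(track.role).push(track);
  }

  const enabled = [];
  for (const candidates of groups.values()) {
    // Fallback (an assumption): the first candidate in document order.
    let choice = candidates[0];
    for (const lang of preferredLangs) {
      const match = candidates.find((track) => track.lang === lang);
      if (match) {
        choice = match;
        break;
      }
    }
    enabled.push(choice);
  }
  return enabled;
}

const sources = [
  { role: "caption", lang: "en", src: "en-captions.srt" },
  { role: "caption", lang: "zh", src: "zh-captions.srt" },
  { role: "chapters", lang: "en", src: "en-chapters.srt" },
  { role: "chapters", lang: "zh", src: "zh-chapters.srt" },
];
const enabledTracks = selectAlternates(sources, ["zh", "en"]);
```

  With a user who prefers Chinese, this enables zh-captions.srt and zh-chapters.srt and leaves the English alternates disabled. Either markup form above could feed such a step; the difference is only whether the group comes from a role attribute on <source> or from the enclosing <track>.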

eric
Received on Saturday, 13 February 2010 16:43:25 GMT
