RE: [media] Moving forward with captions / subtitles

"I agree that external track references should be treated the same as tracks in the media file. There is no distinction in the final presentation to the user, so I think they should also be treated the same by the UA when choosing among alternates and when presenting the DOM API."

Agreed, couldn't have said it better myself.



From: public-html-a11y-request@w3.org [mailto:public-html-a11y-request@w3.org] On Behalf Of Eric Carlson
Sent: Saturday, February 13, 2010 8:43 AM
To: Philip Jägenstedt
Cc: Silvia Pfeiffer; HTML Accessibility Task Force
Subject: Re: [media] Moving forward with captions / subtitles


On Feb 13, 2010, at 7:01 AM, Philip Jägenstedt wrote:


On Sat, 13 Feb 2010 21:04:36 +0800, Silvia Pfeiffer <silviapfeiffer1@gmail.com> wrote:



On Sat, Feb 13, 2010 at 9:19 PM, Philip Jägenstedt <philipj@opera.com> wrote:

I think the main outstanding problem is still a good name for the grouping
element. <textassoc> isn't great either because it's a bit difficult to
spell. Perhaps <track>? Even though several browser vendors are skeptical of
syncing audio/video from two different resources, it would make it spec-wise
possible to allow it in the future. For now, it's text-only though:

<video src="video.ogg">
<track src="captions.srt">
</video>

or using <source> in the same way as for <video>:

<video src="video.ogg">
<track>
  <source type="text/srt" src="captions.srt" lang="en">
  <source type="text/srt" src="zimu.srt" lang="zh">
</track>
</video>

Note that the resource selection algorithm is not a limitation here, because
we can freely define how <track>s are activated and how to select between
alternative <source>s in a <track>. Perhaps we need a new boolean attribute
like enabled="" to enable a track by default.

In general I like the idea of calling it <track>. However, I have a
slight issue with it because they are only virtual tracks - normally
only the "tracks" that are multiplexed together inside an encapsulation
format are called tracks. This would make the content inside a source
element called tracks, but also the parallel external files. I predict
confusion.

I think it would be good to treat them as the same as far as possible, including in the DOM API and MediaTracks collection. That way the same user JavaScript could operate on the resource without caring if the tracks are resource-internal or added using <track>.

  I agree that external track references should be treated the same as tracks in the media file. There is no distinction in the final presentation to the user, so I think they should also be treated the same by the UA when choosing among alternates and when presenting the DOM API. 
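
  To make the "treated the same" point concrete, here is a minimal Python sketch, not a proposed API. It assumes internal and external tracks can be described by the same fields and exposed as one ordered collection, so that script code filters by role without checking where a track came from. The names Track, media_tracks and tracks_with_role, and the ordering (internal first, then external), are hypothetical.

```python
from dataclasses import dataclass
from itertools import chain


@dataclass(frozen=True)
class Track:
    """Hypothetical uniform description of a track."""
    role: str
    lang: str
    origin: str  # "internal" or "external"; informational only


def media_tracks(internal, external):
    """Return one list covering both kinds of track.

    Assumption: internal tracks come first, then external ones,
    each in their original order.
    """
    return list(chain(internal, external))


def tracks_with_role(tracks, role):
    """Filter by role without looking at the track's origin."""
    return [t for t in tracks if t.role == role]


internal = [Track("caption", "en", "internal")]
external = [
    Track("caption", "zh", "external"),
    Track("chapters", "en", "external"),
]
all_tracks = media_tracks(internal, external)
captions = tracks_with_role(all_tracks, "caption")
```

  The caller never inspects origin here; it is kept only to show that both kinds of track end up in the same list.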

  I like the idea of reusing <source> to list alternate tracks. We should also include the "media" attribute, since it would help in defining the selection criteria. Can we use "media" instead of adding a "lang" attribute?


However, I must say I really like the idea of making it independent of
"text", i.e. leaving the possibility open to add "tracks" of audio or
video in future.

I'd be happy for something that essentially means "external parallel track".

Considering how many different names we have already come up with, I doubt <track> will be the last :) Brainstorm away!

  To me the singular "track" implies that only one <source> will be chosen, which will not be true in all cases. Maybe <tracks>?


role="" is fine, but I'd like to see more ideas on what UAs should do with
it.

The thought is to use it not just for captions, subtitles, and textual
audio descriptions, but also for karaoke, lyrics, chapters, timed
comments, timed metadata, and other such time-aligned text and
annotations. There are examples with lyrics
(http://svg-wow.org/audio/animated-lyrics.html, and
http://annodex.net/~silvia/itext/chocolate_rain.html), and chapters
(http://annodex.net/~silvia/itext/elephant_no_skin_v2.html). I'm sure
we will come up with more similar examples.

Yes, but is it expected that the UA should do something with the attribute, like make context menus based on it? Or should it be part of the track selection algorithm? (Where "track selection algorithm" does not exist yet, but is what will select which tracks are enabled by default based on... language and such?)

  I think the selection of alternates is an important point. Some media container formats (e.g. QuickTime and MPEG-4) allow an author to mark tracks as being part of an "alternate group". This instructs the media engine to enable only one track in the group based on a condition on the user's machine when the file is opened for playback. For example, a movie can have subtitle tracks and chapter tracks in multiple languages, but only one of each is rendered when the movie plays.

  We need to support this use case with external "tracks", and we need to define the selection algorithm when a file has both internal and external tracks.

  We also need to define a mechanism to mark tracks as being part of an alternate group. Is an attribute on <source> enough?

    <tracks>
        <source type="text/srt" src="en-captions.srt" lang="en" role="caption">
        <source type="text/srt" src="zh-captions.srt" lang="zh" role="caption">
    
        <source type="text/srt" src="en-chapters.srt" lang="en" role="chapters">
        <source type="text/srt" src="zh-chapters.srt" lang="zh" role="chapters">
    </tracks>
    
Or should we have a grouping element like Silvia had in her early proposal?

    <tracks>
        <track role="caption">
            <source type="text/srt" src="en-captions.srt" lang="en">
            <source type="text/srt" src="zh-captions.srt" lang="zh">
        </track>
    
        <track role="chapters">
            <source type="text/srt" src="en-chapters.srt" lang="en">
            <source type="text/srt" src="zh-chapters.srt" lang="zh">
        </track>
    </tracks>
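
  As a rough illustration of the alternate-group behaviour described above, here is a minimal Python sketch, not a proposed algorithm. It assumes each <source> carries a role and a lang, treats every role as one alternate group, and enables at most one source per group by matching an ordered list of user language preferences. The names Source and select_alternates, and the fallback to the first listed source when no language matches, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """One <source> entry; fields mirror the attributes in the markup."""
    src: str
    lang: str
    role: str


def select_alternates(sources, preferred_langs):
    """Pick one source per role (one alternate group per role).

    Assumption: the first preferred language with a match wins; if
    none matches, the first source listed for that role is used.
    """
    groups = {}
    for source in sources:  # dicts keep insertion (document) order
        groups.setdefault(source.role, []).append(source)

    chosen = {}
    for role, candidates in groups.items():
        pick = candidates[0]  # fallback: first in document order
        for lang in preferred_langs:
            match = next((s for s in candidates if s.lang == lang), None)
            if match is not None:
                pick = match
                break
        chosen[role] = pick
    return chosen


sources = [
    Source("en-captions.srt", "en", "caption"),
    Source("zh-captions.srt", "zh", "caption"),
    Source("en-chapters.srt", "en", "chapters"),
    Source("zh-chapters.srt", "zh", "chapters"),
]
enabled = select_alternates(sources, ["zh", "en"])
```

  Either markup form could feed this: with the flat list the groups come from the role attribute on each <source>, and with a grouping element they come from the enclosing <track>.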

  I hesitate to define yet another element, but I think the markup in a complex case like Silvia's Elephants Dream sample, http://annodex.net/~silvia/itext/elephant_no_skin_v2.html, is clearer because of it. On the other hand, will complex cases like this be common enough that we need it?

eric

Received on Saturday, 13 February 2010 17:56:37 UTC