Re: [media] Moving forward with captions / subtitles

On 2/16/10 9:55 AM, "Philip Jägenstedt" <philipj@opera.com> wrote:

On Tue, 16 Feb 2010 20:11:50 +0800, Silvia Pfeiffer
<silviapfeiffer1@gmail.com> wrote:

> On Tue, Feb 16, 2010 at 7:19 PM, Philip Jägenstedt <philipj@opera.com>
> wrote:
>> On Tue, 16 Feb 2010 15:46:00 +0800, Silvia Pfeiffer
>> <silviapfeiffer1@gmail.com> wrote:
>>
>>> On Tue, Feb 16, 2010 at 6:37 PM, Philip Jägenstedt <philipj@opera.com>
>>> wrote:
>>>>
>>>> On Tue, 16 Feb 2010 04:36:09 +0800, Geoff Freed <geoff_freed@wgbh.org>
>>>> wrote:
>>>>
>>>>> GF:  I prefer <trackgroup><track> as well -- grouping tracks by role
>>>>> makes the most sense to me. But I'm still confused about one thing
>>>>> after reading today's thread.  From this markup, it looks to me like
>>>>> <trackgroup><track> also would permit multiple tracks of the same
>>>>> role to appear simultaneously.  True?  Playing simultaneous tracks of
>>>>> the same role is still what I'd prefer (in addition to playing
>>>>> simultaneous tracks of differing roles, of course).
>>>>
>>>> My idea is that <trackgroup> be used to group mutually exclusive
>>>> tracks, independently of their roles. I struggle to come up with an
>>>> example when you would want it, but if you wrap each <track> in its
>>>> own <trackgroup> then *all* tracks can be enabled simultaneously. It
>>>> is of course up to the author to make groups that make sense. Power
>>>> users could override this using user JavaScript or other browser
>>>> extensions if they really want to.
>>>
>>> I'd actually prefer the opposite functionality - and that would also
>>> be much more like what is in a media resource:
>>>
>>> <track>s in a list without <trackgroup> can be activated in parallel -
>>> they are like non-grouped MP4 tracks.
>>>
>>> <track>s inside a <trackgroup> are mutually exclusive - only one of
>>> them can be activated at any point in time.
>>>
>>> IIUC, that's how grouping works in MP4 and QuickTime and thus applying
>>> this same principle here seems to make sense to me. Thus, if you
>>> didn't want tracks to be active together, you'd pack them in a
>>> trackgroup. Much easier than having to package each single <track> in
>>> a <trackgroup> to enable them to be active in parallel.
>>
>> If I understand you, the only difference is the semantics when
>> <trackgroup> is omitted. We can make either behavior the default. What
>> it comes down to is what authors actually expect and which case is more
>> common. I don't know anything about MPEG-4, but I do know that for any
>> file with multiple text tracks I have opened in any media player
>> (software or hardware), tracks have been mutually exclusive.
>
> That's probably because the only text tracks that have been considered
> so far are captions and subtitles. It's easy to treat those as
> mutually exclusive. I think we have new opportunities here for the
> Web. We have, for example, the possibility to bring in textual audio
> descriptions that can hook up with screenreaders - something that has
> not been done before (or at least not on a large scale).

I agree entirely.

> I actually believe that in future we may see a lot more files that
> have both mutually exclusive tracks and additional tracks. Things
> such as:
>
> <video src="video.ogv">
>        <track src="cc.en.srt" srclang="en" role="CC" active>
>        <track src="tad.en.srt" srclang="en" role="TAD">
>        <track role="SUB">
>            <source src="subs.de.srt" srclang="de">
>            <source src="subs.sv.srt" srclang="sv">
>            <source src="subs.jp.srt" srclang="jp">
>        </track>
>  </video>
>
> In this case, the CC, the TAD and one of the SUBs can (but don't
> have to) be active in parallel.

Hopefully we will see more captions, textual audio descriptions and other
text tracks that make the video more useful to all users. However, it
seems extremely likely that this usage will be an order of magnitude less
common than the simple multi-language subtitles (mutually exclusive) case.

GF:  True, simultaneous displays from a single role *may* be the exception rather than the rule, but I'm still of the mind that the user should have that option.

>> (Also, I don't want to come up with a UI for enabling multiple tracks
>> when there is no grouping, as it would have to be something strange
>> like a list of checkboxes in a context menu.)
>
> The UI for this would be:
>
> CC -> en
> TAD -> en
> SUB
>       -> de
>       -> sv
>       -> jp
>
> This is a bit inspired by the menu of YouTube, but extended with all
> the other text tracks.

Would the following markup be equivalent?

<video src="video.ogv">
   <track src="cc.en.srt" srclang="en" role="CC" active>
   <track src="tad.en.srt" srclang="en" role="TAD">
   <track role="SUB" src="subs.de.srt" srclang="de">
   <track role="SUB" src="subs.sv.srt" srclang="sv">
   <track role="SUB" src="subs.jp.srt" srclang="jp">
</video>

In other words, is it role="" alone that determines grouping or does the
<track><source> nesting have some impact?
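One way to read the question above: if the nesting is only shorthand, then expanding each <source> into its own <track> carrying the parent's role should reproduce the flat list exactly. The following is a minimal Python sketch of that expansion. The dictionary layout and the `flatten` helper are hypothetical, chosen only to mirror the two markup examples (the `active` attribute is left out for brevity); they are not part of any proposal in this thread.

```python
# Hypothetical in-memory model of the two markup variants discussed above.
# A nested track carries a "sources" list instead of its own "src".
nested = [
    {"role": "CC", "src": "cc.en.srt", "srclang": "en"},
    {"role": "TAD", "src": "tad.en.srt", "srclang": "en"},
    {"role": "SUB", "sources": [
        {"src": "subs.de.srt", "srclang": "de"},
        {"src": "subs.sv.srt", "srclang": "sv"},
        {"src": "subs.jp.srt", "srclang": "jp"},
    ]},
]

flat = [
    {"role": "CC", "src": "cc.en.srt", "srclang": "en"},
    {"role": "TAD", "src": "tad.en.srt", "srclang": "en"},
    {"role": "SUB", "src": "subs.de.srt", "srclang": "de"},
    {"role": "SUB", "src": "subs.sv.srt", "srclang": "sv"},
    {"role": "SUB", "src": "subs.jp.srt", "srclang": "jp"},
]


def flatten(tracks):
    """Expand each nested track into one flat track per source."""
    result = []
    for track in tracks:
        if "sources" in track:
            for source in track["sources"]:
                # Each source inherits the role of its parent track.
                result.append({"role": track["role"], **source})
        else:
            result.append(dict(track))
    return result


# Under the "shorthand only" reading, both forms describe the same tracks.
print(flatten(nested) == flat)
```

If the nesting were meant to carry extra meaning (for example, exclusivity that role="" alone does not imply), this expansion would lose it, which is exactly what the question is probing.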

If <track><source> is *only* a shortcut to save some typing with no
semantic difference from a flat list, then it seems that the only thing of
substance we disagree on is whether tracks should be mutually exclusive by
default or not and what mechanism should be used to opt out of the default
behavior. To summarize:

<track><source>: role="" determines which "tracks" are mutually exclusive;
opt out by using another role.

<trackgroup><track>: <track>s are mutually exclusive by default. Opt out
by putting them in different <trackgroup>s.
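To make the two summaries concrete, here is a rough Python sketch that computes the sets of mutually exclusive tracks under each reading. It assumes a hypothetical "group" key standing in for the enclosing <trackgroup>, and it assumes that tracks without any <trackgroup> fall into a single implicit group, which is one possible interpretation of "mutually exclusive by default".

```python
from collections import defaultdict


def groups_by_role(tracks):
    """<track><source> reading: tracks sharing a role are mutually exclusive."""
    groups = defaultdict(list)
    for track in tracks:
        groups[track["role"]].append(track["src"])
    return list(groups.values())


def groups_by_trackgroup(tracks):
    """<trackgroup><track> reading: tracks sharing a group are mutually exclusive.

    Assumption: tracks with no explicit group share one implicit group (None).
    """
    groups = defaultdict(list)
    for track in tracks:
        groups[track.get("group")].append(track["src"])
    return list(groups.values())


# The example from this thread, with one hypothetical group per role.
tracks = [
    {"src": "cc.en.srt", "role": "CC", "group": "cc"},
    {"src": "tad.en.srt", "role": "TAD", "group": "tad"},
    {"src": "subs.de.srt", "role": "SUB", "group": "subs"},
    {"src": "subs.sv.srt", "role": "SUB", "group": "subs"},
    {"src": "subs.jp.srt", "role": "SUB", "group": "subs"},
]

# With groups that follow the roles, both readings agree.
print(groups_by_role(tracks) == groups_by_trackgroup(tracks))

# Without any grouping, the second reading makes all five tracks exclusive.
ungrouped = [{k: v for k, v in t.items() if k != "group"} for t in tracks]
print(len(groups_by_trackgroup(ungrouped)))
```

The two readings only diverge when the author wants same-role tracks shown together or different-role tracks kept apart; the first reading requires inventing roles for that, the second requires extra wrapper elements.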

On Tue, 16 Feb 2010 20:28:39 +0800, Geoff Freed <geoff_freed@wgbh.org>
wrote:

> GF:  I don't really think of this as punishing the author.  I do think
> you're correct in assuming that in most cases, users will want to
> activate a single track per role.  But I don't think we should rule out
> simultaneously visible tracks of the same role.  For example, in the
> case of role=CC, reading a verbatim caption track alongside another
> caption track that has been edited to a specific reading level would be
> a useful pedagogical tool.  My point is simply that *users* should be
> able to determine what tracks are active at any given time.  If I want
> to see three subtitle tracks at once, why should the author stop me?

<trackgroup> makes the grouping explicit and it is up to the author to
decide which tracks can be shown simultaneously and provide the CSS that
makes the visual rendering sane. I would expect the default browser UI to
respect the grouping that the author has made, but users can always
override this with user JavaScript and user CSS or browser extensions.

GF:  I don't think we should be relying at all on users knowing how to create their own scripts or CSS.  Browsers may offer capabilities for local customization via these mechanisms, but I'm going to guess that the average person has no idea that these things exist or how to create them.
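For illustration, the default behaviour described above (a UI that respects the author's grouping) could be sketched as follows in Python. The `activate` function and the list-of-lists representation of groups are hypothetical and only meant to show the intended semantics: enabling a track switches off the other tracks in its group and leaves all other groups alone.

```python
def activate(active, groups, src):
    """Return the set of active tracks after enabling src.

    Any other track in the same group as src is switched off;
    tracks in other groups are left untouched.
    """
    new_active = set(active)
    for group in groups:
        if src in group:
            new_active -= set(group)
    new_active.add(src)
    return new_active


# Author-defined grouping: captions, descriptions, and one subtitle group.
groups = [
    ["cc.en.srt"],
    ["tad.en.srt"],
    ["subs.de.srt", "subs.sv.srt", "subs.jp.srt"],
]

active = {"cc.en.srt"}
active = activate(active, groups, "subs.de.srt")  # captions stay on
active = activate(active, groups, "subs.sv.srt")  # replaces the German subs
print(sorted(active))
```

Under this model, the case Geoff describes (several tracks of the same role visible at once) corresponds to placing each of those tracks in its own group, so that `activate` never switches any of them off.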

--
Philip Jägenstedt
Core Developer
Opera Software

Received on Tuesday, 16 February 2010 21:38:25 UTC