Re: [media] Moving forward with captions / subtitles

On Wed, 17 Feb 2010 03:20:41 +0800, Silvia Pfeiffer  
<silviapfeiffer1@gmail.com> wrote:

> On Wed, Feb 17, 2010 at 1:55 AM, Philip Jägenstedt <philipj@opera.com>  
> wrote:
>> On Tue, 16 Feb 2010 20:11:50 +0800, Silvia Pfeiffer
>> <silviapfeiffer1@gmail.com> wrote:
>>
>>> On Tue, Feb 16, 2010 at 7:19 PM, Philip Jägenstedt <philipj@opera.com>
>>> wrote:
>>>>
>>>> On Tue, 16 Feb 2010 15:46:00 +0800, Silvia Pfeiffer
>>>> <silviapfeiffer1@gmail.com> wrote:
>>>>
>>>>> On Tue, Feb 16, 2010 at 6:37 PM, Philip Jägenstedt  
>>>>> <philipj@opera.com>
>>>>> wrote:
>>>>>>
>>>>>> On Tue, 16 Feb 2010 04:36:09 +0800, Geoff Freed  
>>>>>> <geoff_freed@wgbh.org>
>>>>>> wrote:
>>>>>>
>>>>>>> GF:  I prefer <trackgroup><track> as well -- grouping tracks by
>>>>>>> role makes the most sense to me. But I'm still confused about one
>>>>>>> thing after reading today's thread.  From this markup, it looks to
>>>>>>> me like <trackgroup><track> also would permit multiple tracks of
>>>>>>> the same role to appear simultaneously.  True?  Playing
>>>>>>> simultaneous tracks of the same role is still what I'd prefer (in
>>>>>>> addition to playing simultaneous tracks of differing roles, of
>>>>>>> course).
>>>>>>
>>>>>> My idea is that <trackgroup> be used to group mutually exclusive
>>>>>> tracks, independently of their roles. I struggle to come up with an
>>>>>> example when you would want it, but if you wrap each <track> in its
>>>>>> own <trackgroup> then *all* tracks can be enabled simultaneously. It
>>>>>> is of course up to the author to make groups that make sense. Power
>>>>>> users could override this using user JavaScript or other browser
>>>>>> extensions if they really want to.
>>>>>
>>>>> I'd actually prefer the opposite functionality - and that would also
>>>>> be much more like what is in a media resource:
>>>>>
>>>>> <track>s in a list without <trackgroup> can be activated in
>>>>> parallel - they are like non-grouped MP4 tracks.
>>>>>
>>>>> <track>s inside a <trackgroup> are mutually exclusive - only one of
>>>>> them can be activated at any point in time.
>>>>>
>>>>> IIUC, that's how grouping works in MP4 and QuickTime, so applying
>>>>> this same principle here seems to make sense to me. If you didn't
>>>>> want tracks to be active together, you'd pack them in a
>>>>> <trackgroup>. Much easier than having to package each single <track>
>>>>> in a <trackgroup> to enable them to be active in parallel.
>>>>
>>>> If I understand you, the only difference is the semantics when
>>>> <trackgroup> is omitted. We can make either behavior the default. What
>>>> it comes down to is what authors actually expect and which case is
>>>> more common. I don't know anything about MPEG-4, but I do know that
>>>> for any file with multiple text tracks I have opened in any media
>>>> player (software or hardware), tracks have been mutually exclusive.
>>>
>>> That's probably because the only text tracks that have been regarded
>>> so far were captions and subtitles. It's easy to only deal with such
>>> as mutually exclusive. I think we have new opportunities here for the
>>> Web. We have, for example, the possibility to bring in textual audio
>>> descriptions that can hook up with screenreaders - something that has
>>> not been done before (or at least not on a large scale).
>>
>> I agree entirely.
>>
>>> I actually believe that in future we may see a lot more files that
>>> have both mutually exclusive tracks and additional tracks. Things
>>> such as:
>>>
>>> <video src="video.ogv">
>>>       <track src="cc.en.srt" srclang="en" role="CC" active>
>>>       <track src="tad.en.srt" srclang="en" role="TAD">
>>>       <track role="SUB">
>>>           <source src="subs.de.srt" srclang="de">
>>>           <source src="subs.sv.srt" srclang="sv">
>>>           <source src="subs.jp.srt" srclang="jp">
>>>       </track>
>>>  </video>
>>>
>>> In this case, the CC, the TAD and one of the SUBs can (but don't
>>> have to) be active in parallel.
>>
>> Hopefully we will see more captions, textual audio descriptions and
>> other text tracks that make the video more useful to all users. However,
>> it seems extremely likely that this usage will be an order of magnitude
>> less common than the simple multi-language subtitles (mutually
>> exclusive) case.
>
> IIUC, in your markup the same would be:
>
> <video src="video.ogv">
>     <trackgroup>
>         <track src="cc.en.srt" srclang="en" role="CC" active>
>     </trackgroup>
>     <trackgroup>
>         <track src="tad.en.srt" srclang="en" role="TAD">
>     </trackgroup>
>     <trackgroup role="SUB">
>        <track src="subs.de.srt" srclang="de">
>         <track src="subs.sv.srt" srclang="sv">
>         <track src="subs.jp.srt" srclang="jp">
>      </trackgroup>
> </video>
>
> while for the default, mutually exclusive case the two compare as follows:
>
> <video src="video.ogv">
>      <track role="SUB">
>          <source src="subs.de.srt" srclang="de">
>          <source src="subs.sv.srt" srclang="sv">
>          <source src="subs.jp.srt" srclang="jp">
>      </track>
> </video>
>
> <video src="video.ogv">
>          <track src="subs.de.srt" srclang="de" role="SUB">
>          <track src="subs.sv.srt" srclang="sv" role="SUB">
>          <track src="subs.jp.srt" srclang="jp" role="SUB">
> </video>
>
> I think that the existing proposed markup is less verbose and more
> intuitive.
>
>>>> (Also, I don't want to come up with a UI for enabling multiple tracks
>>>> when there is no grouping, as it would have to be something strange
>>>> like a list of checkboxes in a context menu.)
>>>
>>> The UI for this would be:
>>>
>>> CC -> en
>>> TAD -> en
>>> SUB
>>>      -> de
>>>      -> sv
>>>      -> jp
>>>
>>> This is a bit inspired by the menu of YouTube, but extended with all
>>> the other text tracks.
>>
>> Would the following markup be equivalent?
>>
>> <video src="video.ogv">
>>  <track src="cc.en.srt" srclang="en" role="CC" active>
>>  <track src="tad.en.srt" srclang="en" role="TAD">
>>  <track role="SUB" src="subs.de.srt" srclang="de">
>>  <track role="SUB" src="subs.sv.srt" srclang="sv">
>>  <track role="SUB" src="subs.jp.srt" srclang="jp">
>> </video>
>>
>> In other words, is it role="" alone that determines grouping or does the
>> <track><source> nesting have some impact?
>
> It's not identical because in this case you can have all SUBs active
> at the same time, while in the existing markup only one of the
> subtitles would be active at any point in time.

So in other words role="" has *no* influence over which tracks are
mutually exclusive; it is only the <track><source> nesting that creates a
group of mutually exclusive tracks. That makes <trackgroup><track> and
<track><source> almost identical apart from naming and the minimized form.

In both cases:

The inner element references an external text track, possibly with a  
language and role.

The outer element makes the children mutually exclusive. Language and role
may be specified on this element as well, in which case they are inherited
by the children.
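
As a rough sketch of this shared model (Python is used only for
illustration; the class and field names are made up and belong to neither
proposal, and letting a child's own value win over the inherited one is an
assumption):

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Inner:
    """Inner element (<source> or <track>): one external text track."""
    src: str
    srclang: Optional[str] = None
    role: Optional[str] = None


@dataclass
class Outer:
    """Outer element (<track> or <trackgroup>): children are exclusive."""
    children: List[Inner] = field(default_factory=list)
    srclang: Optional[str] = None
    role: Optional[str] = None

    def resolved(self) -> List[Inner]:
        # Assumption: a child keeps its own language/role when set and
        # otherwise inherits the value from the outer element.
        return [
            Inner(c.src, c.srclang or self.srclang, c.role or self.role)
            for c in self.children
        ]


subs = Outer(
    role="SUB",
    children=[
        Inner("subs.de.srt", srclang="de"),
        Inner("subs.sv.srt", srclang="sv"),
        Inner("subs.jp.srt", srclang="jp"),
    ],
)
for t in subs.resolved():
    print(t.src, t.srclang, t.role)
```

Each resolved child carries role SUB from the outer element while keeping
its own language.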

For the inner element, I think both <source> and <track> are OK names.  
<source> has the upside of being an existing element, while the downside  
is that it requires the outer element to not conflict with the sources of  
<audio>/<video>.

For the outer element, I think <track> is a very bad name, as it doesn't  
represent a track at all. <trackgroup> is a bit more verbose, but is  
exactly what it claims to be.

The minimized form of <track><source> (omitting <source>) makes all tracks
parallel while the minimized form of <trackgroup><track> (omitting
<trackgroup>) makes all tracks mutually exclusive.
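
The difference between the two defaults can be sketched as follows. This is
only an illustration: the function names and the proposal labels are
hypothetical, and each inner list stands for one group of mutually
exclusive tracks.

```python
from typing import List


def default_groups(tracks: List[str], proposal: str) -> List[List[str]]:
    """Group un-nested tracks the way each minimized form would.

    At most one track per returned inner list can be active at a time.
    """
    if proposal == "track-source":
        # Omitting <source>: every <track> stands alone, so all are parallel.
        return [[t] for t in tracks]
    if proposal == "trackgroup-track":
        # Omitting <trackgroup>: all <track>s form one implicit group.
        return [list(tracks)] if tracks else []
    raise ValueError(f"unknown proposal: {proposal!r}")


def max_active(groups: List[List[str]]) -> int:
    # One active track per group at most.
    return len(groups)


flat = ["subs.de.srt", "subs.sv.srt", "subs.jp.srt"]
print(max_active(default_groups(flat, "track-source")))      # 3
print(max_active(default_groups(flat, "trackgroup-track")))  # 1
```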

When it comes to UI, I think <trackgroup><track> is better because it  
reflects exactly what a sensible menu nesting could look like, while the  
minimized form of <track><source> would have a less direct mapping (or a  
menu with checkboxes or similar).
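
To illustrate the direct mapping, here is a small sketch that turns groups
into menu lines: one submenu per group and one radio-style item per track.
The data shape and the menu_lines name are assumptions made for this
example, not anything a browser actually exposes.

```python
from typing import List, Tuple

# (role, languages): one entry per group of mutually exclusive tracks.
Group = Tuple[str, List[str]]


def menu_lines(groups: List[Group]) -> List[str]:
    """Render one submenu per group and one item per track in it."""
    lines: List[str] = []
    for role, langs in groups:
        lines.append(role)
        # Items within a group are exclusive, so a radio list suffices.
        lines.extend(f"    -> {lang}" for lang in langs)
    return lines


groups = [("CC", ["en"]), ("TAD", ["en"]), ("SUB", ["de", "sv", "jp"])]
print("\n".join(menu_lines(groups)))
```

With a flat list of parallel tracks there is no group to hang a submenu on,
which is where the checkbox-style menu would come in.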

We could also completely drop the nesting and introduce a grouping  
attribute on <track>, but I don't think that would be better than either  
existing proposal.
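
For comparison, a sketch of how such an attribute might be interpreted. The
attribute name "group" is hypothetical, and treating a track without the
attribute as a group of its own is an assumption; the opposite default
would be equally possible.

```python
from typing import Dict, List, Optional, Tuple

# (src, group): group is the value of a hypothetical group="" attribute,
# or None when the attribute is absent.
Track = Tuple[str, Optional[str]]


def groups_from_attribute(tracks: List[Track]) -> List[List[str]]:
    """Tracks sharing a group value are mutually exclusive.

    Assumption: a track without the attribute forms a group of its own.
    """
    named: Dict[str, List[str]] = {}
    result: List[List[str]] = []
    for src, group in tracks:
        if group is None:
            result.append([src])
        elif group in named:
            named[group].append(src)
        else:
            # First track of a new named group; later ones are appended.
            named[group] = [src]
            result.append(named[group])
    return result


tracks = [
    ("cc.en.srt", None),
    ("tad.en.srt", None),
    ("subs.de.srt", "subs"),
    ("subs.sv.srt", "subs"),
    ("subs.jp.srt", "subs"),
]
print(groups_from_attribute(tracks))
```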

-- 
Philip Jägenstedt
Core Developer
Opera Software

Received on Wednesday, 17 February 2010 05:32:26 UTC