Re: [media] Moving forward with captions / subtitles

On Sat, 06 Feb 2010 11:41:10 +0800, Silvia Pfeiffer  
<silviapfeiffer1@gmail.com> wrote:

> Hi all,
>
> In a separate thread, there was an extensive discussion about what
> declarative markup we should propose to add to HTML5 to introduce a
> standard means for associating captions and subtitles that are
> provided in separate files with an audio or video element.
>
> The discussion relates to bug 5758 and has currently resulted in a
> first draft specification at:
> http://www.w3.org/WAI/PF/HTML/wiki/Media_TextAssociations
>
> As you will notice, this proposal builds on several other previously
> suggested declarative syntax proposals.
>
> This is a request for discussion of this specification with a view
> towards getting an agreement both in the media subgroup and the larger
> task force on two issues:
>
> 1. The proposed declarative markup

I think the main outstanding problem is still finding a good name for the
grouping element. <textassoc> isn't great either, because it's a bit
difficult to spell. Perhaps <track>? Several browser vendors are skeptical
of syncing audio/video from two different resources, but a generic name
like <track> would leave room in the spec to allow that in the future. For
now, it's text-only though:

<video src="video.ogg">
   <track src="captions.srt">
</video>

or using <source> in the same way as for <video>:

<video src="video.ogg">
   <track>
     <source type="text/srt" src="captions.srt" lang="en">
     <source type="text/srt" src="zimu.srt" lang="zh">
   </track>
</video>

Note that the resource selection algorithm is not a limitation here,  
because we can freely define how <track>s are activated and how to select  
between alternative <source>s in a <track>. Perhaps we need a new boolean  
attribute like enabled="" to enable a track by default.
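
As a rough illustration of how free that choice is, here is a minimal
Python sketch of one possible selection rule: skip tracks that are not
enabled, ignore sources in unsupported formats, then match on the user's
preferred languages. Track, Source, SUPPORTED_TYPES, select_source and the
language-preference list are all hypothetical names made up for this
sketch, and the rule itself is an assumption, not something any draft
defines.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical: the text formats this imaginary UA knows how to parse.
SUPPORTED_TYPES = {"text/srt"}


@dataclass
class Source:
    src: str
    type: str
    lang: str


@dataclass
class Track:
    sources: List[Source] = field(default_factory=list)
    enabled: bool = False  # models the suggested enabled="" attribute


def select_source(track: Track, preferred_langs: List[str]) -> Optional[Source]:
    """Return the source to load for this track, or None if nothing applies."""
    if not track.enabled:
        return None
    usable = [s for s in track.sources if s.type in SUPPORTED_TYPES]
    # Walk the user's languages in order of preference.
    for lang in preferred_langs:
        for source in usable:
            if source.lang == lang:
                return source
    # No language matched: fall back to the first usable source in document order.
    return usable[0] if usable else None


track = Track(
    sources=[
        Source("captions.srt", "text/srt", "en"),
        Source("zimu.srt", "text/srt", "zh"),
    ],
    enabled=True,
)
chosen = select_source(track, ["zh", "en"])
```

With the markup above and a user who prefers Chinese, this picks zimu.srt.
Falling back to the first source when no language matches is only one
option; a UA could just as well show nothing.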

I'm not too keen on charset="", but if it turns out to be necessary in  
practice I'll just have to tolerate it. For now we have to define how the  
default encoding is determined (*not* using the containing document's  
encoding for sanity).
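
One conceivable rule, sketched in Python below, is to honor a byte order
mark, otherwise try UTF-8, and only then fall back to a single legacy
encoding. The order of these steps, the windows-1252 fallback and the
function name decode_text_track are all assumptions made for the sake of
the example; the only point taken from the discussion is that the
containing document's encoding is never consulted.

```python
import codecs

# BOMs checked in order. UTF-32 is deliberately left out of this sketch.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def decode_text_track(data: bytes, legacy_fallback: str = "windows-1252") -> str:
    """Decode a subtitle file without looking at the embedding document."""
    # 1. A byte order mark wins if present; strip it before decoding.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return data[len(bom):].decode(name, errors="replace")
    # 2. Otherwise accept the bytes as UTF-8 if they are valid UTF-8.
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # 3. Last resort: an assumed legacy encoding, replacing undefined bytes.
        return data.decode(legacy_fallback, errors="replace")
```

A charset="" attribute, if it proves necessary, would then only need to
override the last step.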

role="" is fine, but I'd like to see more ideas on what UAs should do with  
it.

> 2. The proposed default file formats to support for subtitles and
> captions: DFXP and SRT

I'm really quite skeptical of DFXP, as it tries to be an interchange  
format that can handle anything that any other subtitle format can handle.  
I think browser support would at best be partial. The overlap with what  
can already be done with script+CSS is also very large, and I'd rather that  
at least the rendering be defined entirely by CSS.

SRT is a good baseline, but who will take on the task of writing a parsing  
spec for it? This is probably not nearly as trivial as it may seem, and  
requires collecting lots of sample data to see how SRT is used in the  
wild. For example, what should be done with the ghastly "HTML" sometimes  
mixed into SRT files? (I would prefer this to be completely unsupported.)
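
To give a feel for the decisions such a spec would have to make, here is a
deliberately small Python sketch of a tolerant SRT parser. It is not a
proposed algorithm: the Cue type, the parse_srt name and every leniency
choice (accepting "." as well as "," before the milliseconds, treating the
counter line as optional, silently skipping malformed blocks) are
assumptions for illustration. Any "HTML" in the cue text is kept as literal
text rather than interpreted.

```python
import re
from typing import List, NamedTuple


class Cue(NamedTuple):
    start_ms: int
    end_ms: int
    text: str


# HH:MM:SS,mmm --> HH:MM:SS,mmm (a "." before the milliseconds is tolerated too).
_TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def _to_ms(h: str, m: str, s: str, ms: str) -> int:
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def parse_srt(text: str) -> List[Cue]:
    cues = []
    # Drop a leading BOM and normalize all newline styles to "\n".
    text = text.lstrip("\ufeff").replace("\r\n", "\n").replace("\r", "\n")
    # Cues are separated by one or more blank (or whitespace-only) lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.split("\n")
        # The timing line is normally second, after the counter; also accept
        # it on the first line in case the counter is missing.
        for i, line in enumerate(lines[:2]):
            match = _TIMING.search(line)
            if match:
                break
        else:
            continue  # no recognizable timing line: skip this block
        g = match.groups()
        payload = "\n".join(lines[i + 1:])  # markup, if any, stays literal
        cues.append(Cue(_to_ms(*g[:4]), _to_ms(*g[4:]), payload))
    return cues
```

Even this toy version has to take a position on newlines, missing counters
and garbage blocks, which is exactly the kind of behavior that sample data
from the wild would need to inform.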

-- 
Philip Jägenstedt
Core Developer
Opera Software

Received on Saturday, 13 February 2010 10:20:04 UTC