Re: Timed tracks

On Thu, May 6, 2010 at 2:11 PM, Frank Olivier <franko@microsoft.com> wrote:
>
> SRT is far from the closest-to-ideal format for anyone that’s actually involved in real captioning.
>
> The features one would need to add to SRT:
> 1) Make it amenable to CSS styling. Which means making it have a regular syntax that CSS selectors will work on (XML would spring to mind), or embedding styling information in the format itself.
> 2) Enable it to support internationalised text, including bidi rules horizontal and vertical line and block layout.
> 3) Enabling it to be positioned precisely with respect to elements in the video to avoid spilling over burned in text
> 4) Supporting Ruby display
> 5) Supporting multiple simultaneous captions for turn taking dialogue. Including positioning them differently in relation to video.
> 6) Specifying text backgrounds with/without opacity.
> 7) Supporting common caption idioms like roll-up, word at a time and line at a time pop-on. To take advantage of TV caption data.
> 8) Supporting inline styling for emphasized words.

#1-6 are present in WebSRT or in the bindings it exposes to CSS.  #8 is
supported, just not through inline styling: you can mark words as
emphasized and then style them with CSS.  #7 is the only thing not
supported right now, and it doesn't appear to be a necessity so far.
(Streaming captions may require something like that, but I don't
believe streaming captions are supported quite yet.)


> These features need to be supported by the HTML embedding:
> 7) Closing the content off from the host HTML page when it comes from a different domain (while preserving 1) to protect IP.
> 8) Enabling the video owner to supply the styling independent of the host page.
>
> Once you've done all that, you are going to be looking at something very similar to TTML.

Both of these are anti-features.  There may be security reasons to
restrict script access to the content of cross-origin tracks, but
restricting it to "protect IP" is not something we want to do.

The same goes for packaging formatting together with the video.  The
pattern used on the web is that the consumer is the one who gets to
decide how to style something.  If you need absolute control,
<iframe>s are the way to go.


> No browser needs to implement XSL:FO to support TTML, that is a complete red herring. Flash and Silverlight have been using it for 2 years or more and neither is using XSL:FO. There is no need to write a spec defining how layout "primitives should be interpreted", the TTML spec already does that.

Flash and Silverlight aren't concerned with being interoperable.  They
can rely on hacking something together that works well enough.  The
open web works somewhat differently.


> In fact there is simply no need to integrate the caption rendering into the HTML rendering at all, it is embedded content and should be handled as such.

Either captions are rendered using existing layout technology, or
browsers have to write new code to duplicate the layout functionality
in slightly different ways.  The latter is hard to accept when
there are alternatives.

~TJ

Received on Thursday, 6 May 2010 21:39:20 UTC