Re: [blink-dev] WebVTT vs TTML Features

On Wed, Dec 11, 2013 at 7:08 AM, Silvia Pfeiffer
<silviapfeiffer1@gmail.com> wrote:

> On Wed, Dec 11, 2013 at 8:09 AM, Glenn Adams <glenn@chromium.org> wrote:
> >
> > On Wed, Dec 11, 2013 at 2:50 AM, David Singer <singer@apple.com> wrote:
> >>
> >>
> >> On Dec 9, 2013, at 11:31 , Glenn Adams <glenn@chromium.org> wrote:
> >>
> >> > Another significant design difference between TTML and WebVTT comes into
> >> > play here as well: TTML was designed for smart authoring systems and dumb
> >> > clients, while WebVTT was designed for dumb authoring systems and smart
> >> > clients.
> >>
> >> I don’t think this can possibly be true.  The client-side implementation
> >> of VTT is vastly simpler than TTML, and indeed does not require profiles, or
> >> complicated specification.
> >
> >
> > This is speculation, since we don't have a reasonable open-source client
> > implementation of TTML to compare against. I do know that VTT requires a
> > number of things that TTML does not require, including:
> >
> > a parser
> >
> > TTML would reuse an existing XML or generic HTML markup parser, while VTT
> > requires a new parser
> >
> > logic to perform overlap avoidance and other automatic functions expected to
> > be performed by VTT
> >
> > TTML assigns this responsibility to the authoring system, not the client
>
> VTT assigns this responsibility to both the authoring system AND the
> client, since the authoring system cannot foresee everything. If,
> e.g., the client changes the fontsize or the size of the video element
> substantially over what the author expected, the client has to deal
> with overlap no matter what.
>
>
> > A TTML client does not have to process any profile information, e.g., it can
> > be built to support one or more specific, pre-defined profiles (of feature
> > sets).
> >
> > It's too early to say how complicated the VTT specification will be. In
> > fact, if you review the number of algorithmic steps specified in the current
> > VTT draft, it vastly exceeds the number of algorithmic steps specified in
> > TTML.
> >
> >>
> >> > That these design choices are very different will continue to stymy
> >> > efforts to unify the two intentionally different expressions of timed text
> >> > content.
> >>
> >> Since TTML’s initial “raison d’etre” was as a flexible authoring system,
> >> this would seem to be a problem.  I doubt that it’s true.
> >
> >
> > It is certainly true that one could extend TTML by defining new style
> > extension properties that semantically map to the VTT style semantics that
> > support automatic region overlap avoidance, etc., but the opposite isn't
> > true, i.e., VTT doesn't yet support an authoring defined region placement
> > model,
>
> It most certainly does! VTT regions are exactly that: explicitly
> placed regions by the author.


I should have been more specific. VTT does not presently support pixel
addressing for region placement.


> BTW: even normal cues can be explicitly
> placed by the author, but the rendering engine is allowed to move them
> somewhat in case of overlap. So, VTT has a dual model for region
> placement: one where the author is 100% in control of placement and
> one where the author makes an informed decision, but the final word
> stays with the browser when it sees problems with the author's
> decision.
>
>
> > and doesn't support a number of other stylistic,
>
> Anything major that is missing in this respect should indeed be fixed.
>
> > or timing functions.
>
> What do you mean by that? I am not aware of any missing features wrt
> "timing functions".
>

Here I was referring to support for:

   - SMPTE time code expressions
   - wall clock time expressions
   - frame addressing in time expressions
   - sub-frame addressing in time expressions

VTT supports only a media time base model with time expressions that map to
NPT.
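
To make the gap concrete, here is a rough sketch of what mapping a
frame-addressed time expression onto a media-time (NPT-style) timestamp
involves. It is only an illustration under stated assumptions, not taken from
either specification: the function names are hypothetical, the frame rate and
sub-frame rate are supplied by the caller, the frame rate is an integer, and
drop-frame counting is ignored. Wall clock expressions are left out entirely,
since they cannot be mapped without an external reference point.

```python
import re
from fractions import Fraction

# Hypothetical helper names; frame rate handling is an assumption of this
# sketch (integer rate, non-drop-frame counting only).
_FRAME_TIME = re.compile(r"^(\d{2,}):(\d{2}):(\d{2}):(\d{2,})(?:\.(\d+))?$")


def frame_time_to_seconds(expr, frame_rate, sub_frame_rate=1):
    """Convert 'HH:MM:SS:FF' or 'HH:MM:SS:FF.sub' to exact seconds."""
    match = _FRAME_TIME.match(expr)
    if match is None:
        raise ValueError(f"not a frame-addressed time expression: {expr!r}")
    hours, minutes, seconds, frames = (int(g) for g in match.groups()[:4])
    sub_frames = int(match.group(5) or 0)
    if minutes >= 60 or seconds >= 60:
        raise ValueError("minutes and seconds must be below 60")
    if frames >= frame_rate or sub_frames >= sub_frame_rate:
        raise ValueError("frame or sub-frame count exceeds the given rate")
    # Use exact rational arithmetic so frame offsets do not accumulate
    # floating-point error.
    return (
        hours * 3600
        + minutes * 60
        + seconds
        + Fraction(frames, frame_rate)
        + Fraction(sub_frames, frame_rate * sub_frame_rate)
    )


def to_media_timestamp(total_seconds):
    """Format seconds as 'HH:MM:SS.mmm', rounded to the nearest millisecond."""
    millis = round(total_seconds * 1000)
    hours, rem = divmod(millis, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


if __name__ == "__main__":
    # Frame 12 of 24 is half a second into the second.
    print(to_media_timestamp(frame_time_to_seconds("00:01:30:12", 24)))
```

The conversion is lossy in one direction: once the frame number has been
folded into milliseconds, the original frame address can only be recovered if
the frame rate is known out of band.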


>
>
> > Over time, it may be possible to add support for some such functions to VTT,
> > but I tend to doubt the mapping will ever be complete.
>
> Are you saying we should not even try coming up with a mapping?


No. I think various folks (including me) are already working on such a
mapping. The question is to what extent it will be complete and remain
complete.


> I think that's not useful to the industry. Sure, both standards will
> evolve, so the mapping will need continuous updating. But that's a
> fact of life.
>
>
> >> I also would love to see a report of TTML as *used*, what features are
> >> actually used, and so on. It would really help, and perhaps help inform the
> >> profile definitions that are currently underway.
> >
> >
> > Yes, I would like to see that also; however, a more useful compilation might
> > be an enumeration of the set of all features used with real world deployed
> > caption/subtitling/teletext systems. When compared with existing systems
> > (608/708/World Teletext/etc), the current usage of TTML and VTT is in its
> > infancy.
>
> Right. That's why in the first instance VTT based its requirements on
> the existing features of 608/708 and would love to get input from
> Teletext requirements that cannot be satisfied with current VTT.
>
> Regards,
> Silvia.
>

Received on Tuesday, 10 December 2013 23:33:34 UTC