Re: [blink-dev] WebVTT vs TTML Features from Silvia Pfeiffer on 2013-12-10 (public-texttracks@w3.org from December 2013)

From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Date: Wed, 11 Dec 2013 10:08:25 +1100
To: Glenn Adams <glenn@chromium.org>
Cc: David Singer <singer@apple.com>, Victor Carbune <vcarbune@chromium.org>, Silvia Pfeiffer <silviapf@chromium.org>, "public-texttracks@w3.org" <public-texttracks@w3.org>, Nigel Megitt <nigel.megitt@bbc.co.uk>
Message-ID: <CAHp8n2mKiEO1RNSv9TxKEyeY8yf7OwpeoACjpPG59_CWF0_nAg@mail.gmail.com>

On Wed, Dec 11, 2013 at 8:09 AM, Glenn Adams <glenn@chromium.org> wrote:
>
> On Wed, Dec 11, 2013 at 2:50 AM, David Singer <singer@apple.com> wrote:
>>
>>
>> On Dec 9, 2013, at 11:31 , Glenn Adams <glenn@chromium.org> wrote:
>>
>> > Another significant design difference between TTML and WebVTT comes into
>> > play here as well: TTML was designed for smart authoring systems and dumb
>> > clients, while WebVTT was designed for dumb authoring systems and smart
>> > clients.
>>
>> I don’t think this can possibly be true.  The client-side implementation
>> of VTT is vastly simpler than TTML, and indeed does not require profiles, or
>> complicated specification.
>
>
> This is speculation, since we don't have a reasonable open-source client
> implementation of TTML to compare against. I do know that VTT requires a
> number of things that TTML does not require, including:
>
> * a parser
>
>   TTML would reuse an existing XML or generic HTML markup parser, while VTT
>   requires a new parser
>
> * logic to perform overlap avoidance and other automatic functions expected
>   to be performed by VTT
>
>   TTML assigns this responsibility to the authoring system, not the client

VTT assigns this responsibility to both the authoring system AND the
client, since the authoring system cannot foresee everything. If,
e.g., the client changes the font size or the size of the video element
substantially over what the author expected, the client has to deal
with overlap no matter what.
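
To make that concrete, here is a minimal sketch of client-side overlap
avoidance. It is not the algorithm from the VTT draft; the names `Box`,
`overlaps` and `place_cues`, and the simple "move the cue up until it fits"
strategy, are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """A rendered cue box, measured in pixels from the top of the video."""
    top: float
    height: float

    @property
    def bottom(self) -> float:
        return self.top + self.height


def overlaps(a: Box, b: Box) -> bool:
    """True if the two boxes share any vertical extent."""
    return a.top < b.bottom and b.top < a.bottom


def place_cues(preferred, video_height, step=1.0):
    """Place cues at the author's preferred tops, nudging on overlap.

    `preferred` is a list of (top, height) pairs in author order.
    Hypothetical strategy: clamp each box into the video, then move it
    upward in `step` increments until it overlaps no earlier box.
    """
    placed = []
    for top, height in preferred:
        # Keep the box inside the video even if it grew after authoring.
        box = Box(top=min(float(top), video_height - height), height=height)
        while box.top > 0 and any(overlaps(box, other) for other in placed):
            box.top = max(0.0, box.top - step)
        placed.append(box)
    return placed
```

With two 10px cues authored at tops 80 and 90 on a 100px video, nothing
moves. If the client doubles the font size so that each cue is 20px tall,
the same authored positions collide and the second cue has to be moved,
which is something the authoring system could not have done in advance.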

> A TTML client does not have to process any profile information, e.g., it can
> be built to support one or more specific, pre-defined profiles (of feature
> sets).
>
> It's too early to say how complicated the VTT specification will be. In
> fact, if you review the number of algorithmic steps specified in the current
> VTT draft, it vastly exceeds the number of algorithmic steps specified in
> TTML.
>
>>
>> > That these design choices are very different will continue to stymie
>> > efforts to unify the two intentionally different expressions of timed text
>> > content.
>>
>> Since TTML’s initial “raison d’être” was as a flexible authoring system,
>> this would seem to be a problem.  I doubt that it’s true.
>
>
> It is certainly true that one could extend TTML by defining new style
> extension properties that semantically map to the VTT style semantics that
> support automatic region overlap avoidance, etc., but the opposite isn't
> true, i.e., VTT doesn't yet support an author-defined region placement
> model,

It most certainly does! VTT regions are exactly that: explicitly
placed regions by the author. BTW: even normal cues can be explicitly
placed by the author, but the rendering engine is allowed to move them
somewhat in case of overlap. So, VTT has a dual model for region
placement: one where the author is 100% in control of placement and
one where the author makes an informed decision, but the final word
stays with the browser when it sees problems with the author's
decision.
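
As a rough illustration of the two modes, the sketch below embeds a small
VTT file and reports which mode each cue uses. The region block follows the
syntax of the later published WebVTT specification, which may differ from
the draft under discussion here; the sample file and the helper
`placement_modes` are hypothetical and exist only for this example.

```python
SAMPLE_VTT = """WEBVTT

REGION
id:speaker
width:40%
lines:3
regionanchor:0%,100%
viewportanchor:10%,90%

00:00:00.000 --> 00:00:05.000 region:speaker
Rendered inside the author-defined region.

00:00:05.000 --> 00:00:10.000 line:80% position:50%
Positioned by the author as a normal cue.
"""


def placement_modes(vtt_text):
    """Return "region" or "cue" for each cue timing line, in file order."""
    modes = []
    for line in vtt_text.splitlines():
        if "-->" not in line:
            continue  # not a cue timing line
        # Everything after the end timestamp is a cue setting (name:value).
        settings = line.split("-->", 1)[1].split()[1:]
        names = {s.split(":", 1)[0] for s in settings}
        modes.append("region" if "region" in names else "cue")
    return modes
```

Here `placement_modes(SAMPLE_VTT)` yields one entry per cue: the first cue
is tied to the explicitly placed region, while the second carries only the
author's preferred line and position.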

> and doesn't support a number of other stylistic,

Anything major that is missing in this respect should indeed be fixed.

> or timing functions.

What do you mean by that? I am not aware of any missing features wrt
"timing functions".

> Over time, it may be possible to add support for some such functions to VTT,
> but I tend to doubt the mapping will ever be complete.

Are you saying we should not even try coming up with a mapping? I
think that's not useful to the industry. Sure, both standards will
evolve, so the mapping will need continuous updating. But that's a
fact of life.

>> I also would love to see a report of TTML as *used*, what features are
>> actually used, and so on. It would really help, and perhaps help inform the
>> profile definitions that are currently underway.
>
>
> Yes, I would like to see that also; however, a more useful compilation might
> be an enumeration of the set of all features used with real world deployed
> caption/subtitling/teletext systems. When compared with existing systems
> (608/708/World Teletext/etc), the current usage of TTML and VTT is in its
> infancy.

Right. That's why in the first instance VTT based its requirements on
the existing features of 608/708 and would love to get input from
Teletext requirements that cannot be satisfied with current VTT.

Regards,
Silvia.

Received on Tuesday, 10 December 2013 23:09:12 UTC