Re: [blink-dev] WebVTT vs TTML Features

Hi Glenn, Victor, all,

This took a bit of time to prepare a reply for, but I think it will
also give us a start at mapping TTML to WebVTT. So, see my analysis
inline below.


On Wed, Nov 27, 2013 at 7:37 AM, Victor Cărbune <vcarbune@chromium.org> wrote:
> (bcc: blink-dev, cc: public-texttracks)
>
> Hi Glenn,
>
> I'm moving the discussion to public-texttracks@ because I think these are
> good points that should generally be debated and eventually extend WebVTT to
> support some of them, if needed by caption authors.
>
> Victor
>
>
> On Tue, Nov 26, 2013 at 3:39 PM, Glenn Adams <glenn@chromium.org> wrote:
>>
>>
>>
>>
>> On Tue, Nov 26, 2013 at 3:25 PM, Silvia Pfeiffer <silviapf@chromium.org>
>> wrote:
>>>
>>>
>>> Have they tried to convert from TTML to WebVTT for presentation in
>>> browsers? Since all major browsers now support WebVTT, it would be the
>>> path of least pain. It would also help to find out which TTML features
>>> cannot be presented in WebVTT. You might find that to be a very small
>>> set.
>>
>>
>> I expect that greater than 50% of TTML features aren't translatable into
>> WebVTT.

Are those features actually used anywhere in the real world? Are they
features that CEA-608 or CEA-708 support?

Also, rather than redefining styling properties in WebVTT, we're
simply relying on CSS properties.
There are many CSS properties that are allowed to be applied on a
::cue or ::cue-region , see
http://dev.w3.org/html5/webvtt/#applying-css-properties-to-webvtt-node-objects
and
http://dev.w3.org/html5/webvtt/#css-extensions

Note also that we have an open bug to introduce inline CSS
functionality into WebVTT:
https://www.w3.org/Bugs/Public/show_bug.cgi?id=15023
It's an extension because you can already apply CSS properties via the
Web page, but it's a feature that's on the roadmap.


>> For example, TTML1 makes use of 24 style properties [1], all based
>> on CSS or SVG properties (in most cases identically defined). Of these 24,
>> the following 10 cannot be expressed in whole or part by WebVTT content:

To interpret these accurately, I can't just look at CSS or SVG, but
have to reference their special meaning in the TTML spec. To start
creating a mapping, I'll include a reference for each of the
properties and then explain how to do them in WebVTT.


>> backgroundColor
In TTML: http://www.w3.org/TR/ttaf1-dfxp/#style-attribute-backgroundColor
values: CSS color names

In WebVTT: http://dev.w3.org/html5/webvtt/#css-extensions
allows setting of all CSS 'background' properties, including
background-color on:

* cue content:
  ::cue(selector) [selector addresses a tag in the cue text]

* cues:
  ::cue [for all cues] or via ::cue(#cue-id) [for individual cues]

* cue groups:
  ::cue-region [for all regions] or via ::cue-region(#region-id) [for
individual regions]
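
As a starting point for an automated mapping, here is a minimal sketch
of how a converter could emit one of the selectors listed above for a
TTML backgroundColor value. The function name and parameters are
hypothetical, and it assumes the color value is already valid CSS and
that the id needs no CSS escaping.

```python
def cue_background_rule(color, target="cue", ident=None):
    """Build a CSS rule setting background-color on cues or regions.

    target is "cue" or "cue-region"; ident is an optional cue or
    region id (assumed to need no CSS escaping).
    """
    if target not in ("cue", "cue-region"):
        raise ValueError("target must be 'cue' or 'cue-region'")
    selector = "::" + target
    if ident is not None:
        selector += "(#" + ident + ")"
    return selector + " { background-color: " + color + "; }"


print(cue_background_rule("black"))
print(cue_background_rule("navy", target="cue-region", ident="bill"))
```

The resulting rules would go into the stylesheet of the page that
embeds the track.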


>> display
In TTML: http://www.w3.org/TR/ttaf1-dfxp/#style-attribute-display
values: auto/none

In WebVTT:
If an author is trying to author a cue that doesn't get displayed,
they're best off breaking the cue parsing, e.g. by replacing "-->"
with "->".

If they want to do this dynamically during display in the browser,
they can always set TextTrack.mode to "hidden" or "disabled"
(http://www.w3.org/html/wg/drafts/html/master/single-page.html#texttrackmode)
for the duration of the cue.


>> displayAlign
In TTML: http://www.w3.org/TR/ttaf1-dfxp/#style-attribute-displayAlign
values: before/center/after

I'm having a hard time understanding what that does. Is this about
vertical alignment of the text in the region? I'll assume that for
now.

In WebVTT:
This isn't currently possible. However, at FOMS we discussed allowing
authors to specify line and position settings within regions in the
same way as within the video viewport. That would allow exact
specification of where the content within a region is painted.


>> extent
In TTML: http://www.w3.org/TR/ttaf1-dfxp/#style-attribute-extent
values: auto or <length> <length>

In WebVTT:
We specify the width and number of lines of a region in the region
definition, see
https://dvcs.w3.org/hg/text-tracks/raw-file/default/608toVTT/region.html#webvtt-region-metadata-header-syntax
Lengths are not in pixels, because pixels make little sense when the
video viewport or the characters are your authoring reference.
Right now we have number of lines and percentages.
At FOMS we also discussed using em (i.e. largest character width and
height) as a metric, which maps better to 708.
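
To illustrate the unit conversion, here is a rough sketch of turning a
pixel-based TTML extent into a percentage width and a line count. The
function name is hypothetical, and the viewport size and line height
are assumptions the converter has to supply, since TTML pixel lengths
carry neither.

```python
import math


def extent_to_region_size(extent_px, viewport_px, line_height_px):
    """Convert a TTML extent (width, height) in pixels into a region
    width in percent of the viewport width and a whole number of lines.
    """
    width_px, height_px = extent_px
    viewport_width_px, _viewport_height_px = viewport_px
    width_percent = 100.0 * width_px / viewport_width_px
    # Round down so the region never claims more lines than fit,
    # but always keep at least one line.
    lines = max(1, math.floor(height_px / line_height_px))
    return round(width_percent, 2), lines


print(extent_to_region_size((320, 60), (640, 360), 20))
```

Rounding the line count down is a design choice of this sketch, not
something either spec prescribes.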


>> origin
In TTML: http://www.w3.org/TR/ttaf1-dfxp/#style-attribute-origin
values: auto or <length> <length>

In WebVTT:
We specify the placement of regions via anchor settings, see
https://dvcs.w3.org/hg/text-tracks/raw-file/default/608toVTT/region.html#webvtt-region-viewport-anchor
and
https://dvcs.w3.org/hg/text-tracks/raw-file/default/608toVTT/region.html#webvtt-region-anchor
.

This allows placing a region anywhere on screen and explicitly
specifies how it grows around the anchor point with increasing font
size, which maps exactly onto how 708 does its placement. I don't
think that capability is available in TTML - could you clarify?
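
For a mapping, a TTML origin (the top-left corner of the region) could
be expressed by anchoring the region's own top-left corner (0%,0%) at
the corresponding viewport percentage. The sketch below assumes the
"Region:" metadata header syntax of the region draft linked above;
setting names may differ in other revisions, and the function name and
parameters are hypothetical.

```python
def origin_to_region_header(region_id, origin_px, viewport_px,
                            width_percent, lines):
    """Build a region definition line from a pixel-based TTML origin.

    The region anchor is fixed at the region's top-left corner, so the
    viewport anchor is simply the origin as a percentage of the viewport.
    """
    x_px, y_px = origin_px
    viewport_width_px, viewport_height_px = viewport_px
    x_percent = 100.0 * x_px / viewport_width_px
    y_percent = 100.0 * y_px / viewport_height_px
    return (
        "Region: id={} width={:g}% lines={} "
        "regionanchor=0%,0% viewportanchor={:g}%,{:g}%"
    ).format(region_id, width_percent, lines, x_percent, y_percent)


print(origin_to_region_header("r1", (64, 288), (640, 360), 50, 3))
```

With a top-left region anchor the region grows downwards and to the
right as the font size increases, which mirrors what a fixed origin
does; other anchors give the 708-style growth described above.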


>> overflow
In TTML: http://www.w3.org/TR/ttaf1-dfxp/#style-attribute-overflow
values: visible / hidden

In WebVTT:
We specifically try to never obscure any content when rendering cues.
The only exception is rendering of cues inside regions, which have a
limited number of lines that they render. There, when a cue runs
outside the region, it becomes hidden. This is the intended result for
scrolling captions.


>> padding
In TTML: http://www.w3.org/TR/ttaf1-dfxp/#style-attribute-padding
values: <length> <length?> <length?> <length?> (1-4 values)

In WebVTT:
We specify a safe rendering area according to 708 which provides
padding on the video viewport, defaulting to 1.5% of width and height
on all sides. Otherwise, "padding" is indeed not yet listed as one of
the properties that can be set for cues or regions in
http://dev.w3.org/html5/webvtt/#css-extensions . This could be
something to consider adding. It would be simple to add, too, since we
just refer to CSS for such properties.


>> showBackground
In TTML: http://www.w3.org/TR/ttaf1-dfxp/#style-attribute-showBackground
values: always | whenActive
IIUC this is intended to have cues show up even when no content is rendered.

In WebVTT:
We don't usually render cues that have no content.
We can, however, define a region with a given dimension and render an
n-line cue into it just with &nbsp; characters - that would have the
same effect.
What is the use case for rendering regions/cues without content?
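
A small sketch of that workaround: generate an n-line cue consisting
only of &nbsp; entities and assign it to a region. The function name is
hypothetical, and the "region:" cue setting is taken from the region
draft, so treat the exact syntax as an assumption.

```python
def placeholder_cue(start, end, region_id, lines):
    """Build a cue whose payload is `lines` lines of &nbsp; only,
    so the region background is painted without visible text."""
    payload = "\n".join(["&nbsp;"] * lines)
    return "{} --> {} region:{}\n{}".format(start, end, region_id, payload)


print(placeholder_cue("00:00:00.000", "00:00:05.000", "r1", 2))
```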


>> wrapOption
In TTML: http://www.w3.org/TR/ttaf1-dfxp/#style-attribute-wrapOption
values are: wrap | noWrap
IIUC this specifies automatic word wrapping.

In WebVTT:
Newline characters in cues as authored are rendered as new lines in
WebVTT, since WebVTT is a line-based format and not XML-based.

If text lines get too long within their containing blocks, cues wrap
according to CSS rules at the edge of their containing blocks. When
word-wrap occurs, WebVTT will try to balance multiple lines so as to
provide the best possible user experience.

WebVTT tries hard not to hide any text, so no-wrapping and hiding the
overflow is avoided. What is the use case for non-wrapping?


>> zIndex
In TTML: http://www.w3.org/TR/ttaf1-dfxp/#style-attribute-zIndex
values are auto | <integer>
This resolves what is rendered on top of what when there is overlap.

In WebVTT:
WebVTT tries very hard to avoid overlap. There is even an algorithm to
move cues into spare screen real estate when (due to font increase or
cue clash over multiple simultaneously active tracks or poor authoring
for smaller video viewports) two cues overlap.

The repositioning of cues only happens to simple cues. Regions are
explicitly placed and it's possible for them to overlap. There was
originally a proposal to introduce a "layer" setting on regions (see
http://www.w3.org/community/texttracks/wiki/MultiCueBox#Layering_of_cues)
and this may still eventuate.


>> The following can be expressed, but not in a WebVTT file, only in a CSS
>> stylesheet associated with the page in which the WebVTT HTML/CSS
>> presentation will be rendered:
>>
>> color
>> fontFamily
>> fontSize
>> fontStyle
>> fontWeight
>> lineHeight
>> opacity
>> textDecoration
>> textOutline
>> visibility

Right. As discussed above, that was a design decision for WebVTT. We
have a proposal for in-line styling, too.
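
For a converter, most of the properties in that list could be carried
over into a ::cue rule by hyphenating the TTML attribute name. The
sketch below is only that: the names are hypothetical, values are
passed through unchanged (which is only correct where TTML and CSS
share the value syntax), and textOutline is skipped because no mapping
is attempted for it here.

```python
import re

# TTML style attributes from the quoted list that this sketch maps by
# simply hyphenating the name. textOutline is deliberately left out.
PASS_THROUGH = {
    "color", "fontFamily", "fontSize", "fontStyle", "fontWeight",
    "lineHeight", "opacity", "textDecoration", "visibility",
}


def camel_to_css(name):
    """fontSize -> font-size"""
    return re.sub(r"([A-Z])", lambda m: "-" + m.group(1).lower(), name)


def ttml_styles_to_cue_rule(styles, selector="::cue"):
    """Turn a dict of TTML style attributes into one CSS rule."""
    declarations = []
    for name in sorted(styles):
        if name not in PASS_THROUGH:
            continue  # not handled by this sketch
        declarations.append("{}: {};".format(camel_to_css(name), styles[name]))
    return selector + " { " + " ".join(declarations) + " }"


print(ttml_styles_to_cue_rule({"fontSize": "120%", "color": "yellow"}))
```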


>> Support for the following TTML (CSS) properties require mutating the text
>> to insert or modify explicit bidi control codes:
>>
>> direction
In TTML: http://www.w3.org/TR/ttaf1-dfxp/#style-attribute-direction
values: ltr | rtl

In WebVTT:
We rely on the text being authored correctly in UTF-8 for its
language. Text authored as rtl starts with a right-to-left mark. This
allows us to fully support bi-directional text containing mixed
left-to-right and right-to-left scripts.
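
When converting from TTML, a direction attribute could be turned into
such a mark. A minimal sketch, assuming the converter prefixes each
line of the cue text with U+200F (RIGHT-TO-LEFT MARK) or U+200E
(LEFT-TO-RIGHT MARK); the function name is hypothetical.

```python
RLM = "\u200f"  # RIGHT-TO-LEFT MARK
LRM = "\u200e"  # LEFT-TO-RIGHT MARK


def apply_direction(text, direction):
    """Prefix every line of cue text with the mark for `direction`
    ("ltr" or "rtl"), unless the line already starts with it."""
    mark = {"ltr": LRM, "rtl": RLM}[direction]
    return "\n".join(
        line if line.startswith(mark) else mark + line
        for line in text.split("\n")
    )


print(repr(apply_direction("abc", "rtl")))
```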


>> unicodeBidi
In TTML: http://www.w3.org/TR/ttaf1-dfxp/#style-attribute-unicodeBidi
values: normal | embed | bidiOverride

In WebVTT:
Rather than having to author this as a style attribute on a span or p
element, we simply rely on the bidi algorithm and the correct use of
Unicode bidi control characters in the text. If you have to deal with
a text that is not appropriately authored, you have to insert
something to fix this. In the case of TTML it's <p> or <span> elements
with styling attributes, in the case of WebVTT it's Unicode control
characters. Both "mutate the text".
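
To make that concrete, here is a sketch of how a converter could
replace a unicodeBidi/direction pair with the equivalent Unicode
control characters: LRE/RLE for embed, LRO/RLO for bidiOverride, each
closed by PDF. The function name is hypothetical and the sketch wraps
the whole string it is given.

```python
LRE = "\u202a"  # LEFT-TO-RIGHT EMBEDDING
RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING
LRO = "\u202d"  # LEFT-TO-RIGHT OVERRIDE
RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE

OPENERS = {
    ("embed", "ltr"): LRE,
    ("embed", "rtl"): RLE,
    ("bidiOverride", "ltr"): LRO,
    ("bidiOverride", "rtl"): RLO,
}


def apply_unicode_bidi(text, unicode_bidi, direction):
    """Wrap text in the control characters matching the TTML
    unicodeBidi and direction values; "normal" leaves it untouched."""
    if unicode_bidi == "normal":
        return text
    return OPENERS[(unicode_bidi, direction)] + text + PDF


print(repr(apply_unicode_bidi("abc", "embed", "rtl")))
```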


>> So nearly half (ten) of the style properties do not translate at all or
>> only in part, and ten other style properties require use of separate style
>> sheets that have to be delivered independently from the related WebVTT file.

You might like to adjust your counting after the information provided above.


>> Overally, TTML1 defines 114 features [2], 69 of which are related to the
>> above 24 style properties.

How many of these 114 features are actually in use in the wild? WebVTT
has a strong drive to only support features that are motivated by a
use case. If any of these features are necessary, WebVTT can be
extended to support them. Some of them (like the 'padding' above)
would be fixed simply by adding the feature to the list of supported
CSS properties, which takes less than 5 minutes to fix. It would be
best for us to find this out before we freeze the spec. Any input on
use cases would be welcome.


>> I fully expect that more than half of these
>> features are not encodable or translatable to WebVTT, or if they are, then
>> have the added disadvantage of having to maintain a separate CSS style sheet
>> containing rules that apply to specific VTT files.

Why is that a disadvantage? Separating the styling from the content
has been a driving design principle of the Web and has been part of
the cause for the success of HTML. I don't see how that would be a
disadvantage.

Best Regards,
Silvia.


>> [1]
>> http://www.w3.org/TR/2013/REC-ttml1-20130924/#styling-attribute-vocabulary
>> [2] http://www.w3.org/TR/2013/REC-ttml1-20130924/#feature-designations
>>
>

Received on Monday, 9 December 2013 03:16:55 UTC