Re: TTML and aspect ratio

On Tue, Jan 29, 2013 at 11:29 PM, David Ronca <dronca@netflix.com> wrote:

> CEA-608 mandates a 32x15 grid in a 4:3 display area.  CEA-708 allows
> either 32x15 (4:3) or 42:15 (16:9).  For 4:3, the position 10%,10% is a
> different location on the screen than it is for 16:9 content.. Further,
> displaying CEA-608 caption in a 16x9 display area will result in very poor
> presentation (as will presenting 16x9 CEA-708 caption in a 4:3 display
> area).  In reviewing the W3C-TTML, we can find no well-defined means to
> inform the client of the proper aspect ratio of the caption display area.
> My question is how should a client determine the correct aspect ratio for
> the caption viewing area?
>

What you refer to as the "caption display area" is called the "root container
region" in TTML section 2.2 [1]:

Root Container Region

A logical region that establishes a coordinate system into which content
regions are placed and optionally clipped.

The origin and extent of this container region are specified according to
TTML section 7.1.1 [2]:

If the tts:extent attribute is specified on the tt element, then it must
adhere to 8.2.7 tts:extent
<https://dvcs.w3.org/hg/ttml/raw-file/tip/ttml10/spec/ttaf1-dfxp.html#style-attribute-extent>,
in which case it specifies the spatial extent of the root container region
in which content regions are located and presented. If no tts:extent attribute
is specified, then the spatial extent of the root container region is
considered to be determined by the external authoring or presentation
context. The root container origin is determined by the external authoring
context.

The origin of non-root region areas generated by tt:region elements are
located in relationship with the origin of the root container region,
forming a single level hierarchy of regions.

As presently defined by TTML, the origin of the root container region is
determined by the external presentation context. [Note that the phrase
"external authoring context" should be interpreted as external presentation
context. In fact, the current text of TTML should probably be changed in
this regard to always refer to external presentation context instead.] In
other words, as presently defined, the root container origin is always
determined by the presentation processor (or transformation processor when
performing transformations).

One possible origin is the related media object's origin; another is some
offset that corresponds to the typical safe area for the related media
object. So, e.g., for a 4:3 480i NTSC related media object we have the
following storage (SAR), pixel (PAR), and display (DAR) aspect ratios:

SAR = 704:480
PAR = 10:11
DAR = 640:480

For the TTML author, either SAR pixels or DAR pixels may be used for
coordinates. If SAR is used, then in this case, the author would need to
specify ttp:pixelAspectRatio='10 11' on the tt element. So, for example,
the author might specify:

<tt ttp:pixelAspectRatio='10 11' tts:extent='704px 480px'>

or, if the coordinates are in DAR pixels instead of SAR pixels, then

<tt tts:extent='640px 480px'>

in which case the default ttp:pixelAspectRatio='1 1' applies.
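
To make the relationship between the two coordinate choices concrete, here
is a minimal Python sketch of the arithmetic. The function name
display_extent is hypothetical (it is not part of TTML or of any library);
it only assumes that the display width is the storage width multiplied by
the pixel aspect ratio, as in the 704 x 10/11 = 640 example above.

```python
from fractions import Fraction


def display_extent(storage_width, storage_height, pixel_aspect_ratio="1 1"):
    """Convert an extent in storage (SAR) pixels to square (DAR) pixels.

    pixel_aspect_ratio uses the 'width height' form of ttp:pixelAspectRatio.
    Hypothetical helper for illustration only.
    """
    par_w, par_h = (int(v) for v in pixel_aspect_ratio.split())
    par = Fraction(par_w, par_h)
    # Only the horizontal dimension is scaled by the pixel aspect ratio.
    return storage_width * par, Fraction(storage_height)


width, height = display_extent(704, 480, "10 11")
print(width, height)   # 640 480
print(width / height)  # 4/3
```

With the default '1 1' the extent is returned unchanged, which corresponds
to the second tt example above.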

It is up to the author to then specify the appropriate tts:origin and
tts:extent values on each defined region element to position the region as
desired within the root container region.

At present, TTML requires that tts:extent on the root tt element be
specified in pixels rather than percentages. However, the tts:extent on
region elements may be specified in pixels, ems, cells, or percentages. So,
an author could specify the outer root container extent using pixels, then
specify the non-root region extents either in percentage or in cells, which
is another way of expressing percentages, since each cell is effectively:

cell width = ( 1 / cellResolution(columns) ) * 100%
cell height = ( 1 / cellResolution(rows) ) * 100%

[see TTML 6.2.1 [3] for more information on ttp:cellResolution]
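
As a rough illustration of the two formulas above, the following Python
sketch computes the size of one cell as a percentage of the root container
extent. The helper name cell_size_percent is hypothetical; the '32 15'
default used here is the value TTML defines for ttp:cellResolution when the
attribute is omitted.

```python
def cell_size_percent(cell_resolution="32 15"):
    """Return (cell width, cell height) as percentages of the root container.

    cell_resolution uses the 'columns rows' form of ttp:cellResolution.
    Hypothetical helper for illustration only.
    """
    columns, rows = (int(v) for v in cell_resolution.split())
    return 100.0 / columns, 100.0 / rows


w, h = cell_size_percent()
print(f"{w:.3f}% x {h:.3f}%")  # 3.125% x 6.667%
```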

Now, you raise the issue of presenting captions authored for 4:3 on a 16:9
screen (and vice versa). At present, TTML doesn't specify anything
explicitly intended to handle this situation. One option for a presentation processor
would be to scale the coordinate space used to position regions, which
would move regions to maintain relative positioning (but without font
scaling). Another would be to scale both region coordinates and fonts.
These operations may be modeled as performing TTML transformation
processing as a step prior to TTML presentation processing. In an actual
implementation, these steps could be merged.
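
The first option (scaling region coordinates only) could be sketched as
follows in Python. This is an illustration under stated assumptions, not
behavior defined by TTML: the function scale_region is hypothetical, all
values are taken to be in square pixels, and each axis is scaled
independently so that a region keeps its relative position in the new root
container. Font sizes are deliberately left untouched.

```python
def scale_region(origin, extent, src_root, dst_root):
    """Map a region from one root container extent to another.

    origin, extent, src_root and dst_root are (x, y) or (width, height)
    tuples in pixels. Hypothetical helper for illustration only.
    """
    sx = dst_root[0] / src_root[0]  # horizontal scale factor
    sy = dst_root[1] / src_root[1]  # vertical scale factor
    new_origin = (origin[0] * sx, origin[1] * sy)
    new_extent = (extent[0] * sx, extent[1] * sy)
    return new_origin, new_extent


# A region at 10%,10% with extent 80% x 20% of a 640x480 root container,
# mapped to a 1280x720 root container.
origin, extent = scale_region((64, 48), (512, 96), (640, 480), (1280, 720))
print(origin, extent)  # (128.0, 72.0) (1024.0, 144.0)
```

The mapped region is still at 10%,10% of the new root container. The second
option would additionally multiply font sizes by a factor of the
implementer's choosing, for example the vertical scale factor.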

If these mechanisms don't provide satisfactory results, then perhaps you
can explain what you would like to see supported. Since we are defining
TTML 1.1 at the current time, we have an opportunity to add new features.

[1] https://dvcs.w3.org/hg/ttml/raw-file/tip/ttml10/spec/ttaf1-dfxp.html#terms
[2] https://dvcs.w3.org/hg/ttml/raw-file/tip/ttml10/spec/ttaf1-dfxp.html#document-structure-vocabulary-tt
[3] https://dvcs.w3.org/hg/ttml/raw-file/tip/ttml10/spec/ttaf1-dfxp.html#parameter-attribute-cellResolution

Received on Wednesday, 30 January 2013 14:38:15 UTC