RE: RE: Re: Re: TTML and aspect ratio from John Birch on 2013-02-04 (public-tt@w3.org from February 2013)

From: John Birch <John.Birch@screensystems.tv>
Date: Mon, 4 Feb 2013 10:42:49 +0000
To: Sean Hayes <Sean.Hayes@microsoft.com>, David Ronca <dronca@netflix.com>, "public-tt@w3.org" <public-tt@w3.org>
Message-ID: <0981DC6F684DE44FBE8E2602C456E8ABAAF042E1@SS-IP-EXMB-01.screensystems.tv>
Hi Sean,

Yes, it is perfectly possible to achieve the same visual outcome by redefining all the regions within a document, moving them all or re-sizing them all as required. But this requires 'a priori' knowledge of the display resolution.
Does re-defining regions to achieve the same outcome imply that supporting alternate video resolutions at presentation requires **'document'** processing? Do you envisage this as a two-stage process (adjust the regions and then render)?

However:
A) This does not disconnect the concept of a timed text plane from the video pixel plane... by using an aligned (matched scale) co-ordinate system you are forcing a recalculation of all co-ordinates if scaling or re-positioning is required. This also raises the question of how to 'scale' font sizes if they are defined in points or ems. You may end up with fractional co-ordinates or rounding errors.
B) It is simpler to provide a single transformation by moving the origin point. Then all re-calculation of co-ordinates is performed in the presentation processor, NOT in a document context. Similarly, if you wish to re-scale the content (in either x or y independently) then a single element or attribute in the head of the document could indicate acceptable scaling / fitting strategies for content.
C) Scaling may be more visually acceptable if performed by a presentation processor drawing into a plane where the root container is defined in size by the document, and then scaled / located for presentation onto the video plane.
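
To illustrate the rounding concern in (A), here is a minimal sketch. It assumes a document authored against a 720x576 pixel grid and presented on 1920x1080 video; the function name and the sample co-ordinate are hypothetical and are not part of TTML.

```python
from fractions import Fraction

def rescale_point(x, y, doc_size, video_size):
    """Map a pixel co-ordinate authored against doc_size onto video_size, exactly."""
    doc_w, doc_h = doc_size
    vid_w, vid_h = video_size
    # Exact rational arithmetic, so any fractional result is visible.
    return (Fraction(x * vid_w, doc_w), Fraction(y * vid_h, doc_h))

# Hypothetical region origin (100, 500) in a 720x576 document.
exact = rescale_point(100, 500, (720, 576), (1920, 1080))
# A document-level rewrite has to pick whole pixels for each region.
rounded = tuple(round(v) for v in exact)
# Per-axis rounding error introduced by that rewrite.
error = tuple(float(abs(e - r)) for e, r in zip(exact, rounded))
```

Neither rescaled co-ordinate lands on a whole pixel here, so every region rewritten at document level would carry its own rounding error.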

BTW I do agree that also having the ability to locate content outside the video area might be useful.

Consider the following use cases...

SD captions displayed over HD video:
Root_container_origin="0, 0" Scaling_strategy="scale to fit" would give a scaling up of the SD captions. They would be relatively wider than on SD.
**Is this the default outcome for TTML currently if the tt element has an SD extent?**

SD captions displayed over HD video:
Root_container_origin="240, 0" Scaling_strategy="maintain aspect ratio" would give a centred 4:3 presentation that is full height.
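
The two use cases above can be sketched as follows. This is only an illustration: it assumes a square-pixel 640x480 root container standing in for the 4:3 SD grid and a 1920x1080 video plane, and the function and strategy names are hypothetical, mirroring the Scaling_strategy values above.

```python
from fractions import Fraction

def place_root_container(root_w, root_h, video_w, video_h, strategy):
    """Return (origin_x, origin_y, width, height) of the root container in video pixels."""
    if strategy == "scale to fit":
        # Stretch independently in x and y to cover the whole video plane.
        return (0, 0, video_w, video_h)
    if strategy == "maintain aspect ratio":
        # Uniform scale: the largest factor that still fits inside the video.
        scale = min(Fraction(video_w, root_w), Fraction(video_h, root_h))
        w, h = root_w * scale, root_h * scale
        # Centre the scaled container over the video.
        return ((video_w - w) / 2, (video_h - h) / 2, w, h)
    raise ValueError(f"unknown strategy: {strategy!r}")

fit = place_root_container(640, 480, 1920, 1080, "scale to fit")
keep_ar = place_root_container(640, 480, 1920, 1080, "maintain aspect ratio")
```

Under these assumptions the "maintain aspect ratio" case yields a 1440x1080 container with an x origin of 240, which matches the Root_container_origin="240, 0" value in the second use case.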

However, I find having to use an explicit pixel origin value to relocate the 'root container' a bit too specific... what happens if you do not know the presentation resolution? It might be better to have a syntax that allows the definition of the relationship between the root container and the underlying video that is more abstract. For example, "centre*s* aligned" might mean that the root container was centred over the video... scaling strategies like "scale to fit" or "maintain aspect ratio" would then work to expand (or contract) the timed text content across the video plane to its boundaries. Note in this case a scaling strategy of 1:1 would be required to position content outside the 'active video' and the document author would require a priori knowledge of the display resolution.

If you want to both scale against the video plane AND position content outside the active video area, then I believe you need a more sophisticated mechanism for defining the root container - effectively you need to define the origin and extent of the root container in an abstract manner with respect to the external active video, references inside the TTML then relate only to the co-ordinate system created by the root container.

So you could consider:

Root_container_anchor="top_left, top_centre, top_right, left_centre, centred, right_centre, bottom_left, bottom_centre, bottom_right"  (these match the 708 anchor points)
Root_container_bounds="no_scaling, scale_x, scale_y, scale_both"

tts:extent on the tt element then defines the co-ordinates used within the root container (and outside the root container if negative values or values greater than 100% of extent are permitted).
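
One possible reading of the anchor and bounds proposal is sketched below. Everything here is an assumption for illustration: the attribute semantics are not defined anywhere, the nine anchor names are an assumed normalisation of the list above, and the function name is hypothetical.

```python
# Each anchor is a fractional position (x, y): 0 = left/top, 1 = right/bottom.
ANCHORS = {
    "top_left": (0.0, 0.0), "top_centre": (0.5, 0.0), "top_right": (1.0, 0.0),
    "left_centre": (0.0, 0.5), "centred": (0.5, 0.5), "right_centre": (1.0, 0.5),
    "bottom_left": (0.0, 1.0), "bottom_centre": (0.5, 1.0), "bottom_right": (1.0, 1.0),
}

def root_container_rect(extent, video, anchor, bounds):
    """Return (x, y, width, height) of the root container in video pixels."""
    ext_w, ext_h = extent
    vid_w, vid_h = video
    # Assumed meaning of bounds: a scaled axis takes the full video dimension,
    # an unscaled axis keeps the size given by the document extent.
    w = vid_w if bounds in ("scale_x", "scale_both") else ext_w
    h = vid_h if bounds in ("scale_y", "scale_both") else ext_h
    # Assumed meaning of anchor: the named point of the container coincides
    # with the same named point of the video.
    fx, fy = ANCHORS[anchor]
    return (fx * (vid_w - w), fy * (vid_h - h), w, h)

centred = root_container_rect((640, 480), (1920, 1080), "centred", "no_scaling")
full = root_container_rect((640, 480), (1920, 1080), "top_left", "scale_both")
```

With this reading the document never needs to state a pixel origin: the presentation processor derives it from the anchor once the video size is known.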

YMMV.

Regards,
John

John Birch | Screen | Strategic Partnerships Manager
Main Line : +44 1473 831700 | Ext : 270 | Direct Dial : +44 1473 834532
Mobile : +44 7919 558380 | Fax : +44 1473 830078
John.Birch@screensystems.tv | www.screensystems.tv | https://twitter.com/screensystems

Visit us at
BVE, London Excel, 26-28 February, Stand J02

Before printing, think about the environment

-----Original Message-----
From: Sean Hayes [mailto:Sean.Hayes@microsoft.com]
Sent: 01 February 2013 16:35
To: Sean Hayes; John Birch; David Ronca; public-tt@w3.org
Subject: RE: RE: Re: Re: TTML and aspect ratio

s/ below the bottom of the frame/ above the bottom of the frame/

-----Original Message-----
From: Sean Hayes [mailto:Sean.Hayes@microsoft.com]
Sent: 01 February 2013 08:23
To: John Birch; David Ronca; public-tt@w3.org
Subject: RE: RE: Re: Re: TTML and aspect ratio

Well, since one doesn't actually draw anything into the root container, except in very simple TTML, what is interesting to me is properly aligning the text content with the underlying video display.
Now, inasmuch as the root container defines the coordinate system, I believe the most general system is one that aligns top left 0,0 with 0,0 of the underlying media, and X,Y with the bottom right of the media. Since X and Y can be any real value, it doesn't really matter what value we use, so 100%,100% seems adequate. If we then allow negative values and values beyond 100, we can address any relative position.

In such a coordinate system if I want centred left aligned text to appear just below the bottom of the frame; I create a region with alignment centre/bottom, and place it at 50% 90%; if I give it a width of N characters (or better yet define shrink fit regions - a topic for another day); then it works in any aspect ratio.
I can also see utility in being able to have conditional styles, as I may for example want longer text lines in a wide screen presentation (also a topic for another day).
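
A minimal sketch of the percentage coordinate system described above, assuming 0%,0% maps to the top left of the presented media and 100%,100% to its bottom right. The function name and the sample video sizes are hypothetical.

```python
def percent_to_pixels(x_pct, y_pct, video_w, video_h):
    """Map a percentage position to pixels in the presented video."""
    # Values below 0 or above 100 are allowed and land outside the frame.
    return (video_w * x_pct / 100, video_h * y_pct / 100)

# The same 50% 90% region position on a 16:9 and a 4:3 presentation.
wide = percent_to_pixels(50, 90, 1920, 1080)
narrow = percent_to_pixels(50, 90, 1440, 1080)
# A position outside the frame: left of the video and below its bottom edge.
outside = percent_to_pixels(-10, 110, 1920, 1080)
```

The region position stays horizontally centred in both presentations without any change to the document, which is the point of the aspect-ratio-independent placement.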

What I don't really see is the benefit of changing the 0,0 point, since any transformation I can do to the global coordinates I can also do by a transform to the region coordinates. But perhaps I am missing something; perhaps you can explain the advantage of moving the root container.

-----Original Message-----
From: John Birch [mailto:John.Birch@screensystems.tv]
Sent: 01 February 2013 02:07
To: David Ronca; public-tt@w3.org
Subject: RE: RE: Re: Re: TTML and aspect ratio

Re David's point below and Sean's comment:

"I believe that this scenario does add weight to the case for having tools to position drawing regions where the registration point is other than the top left; and this is open issue 21 http://www.w3.org/AudioVideo/TT/tracker/issues/21."

I disagree with the above... (unless by regions you mean the root container!).

I believe this adds weight to a case for having a mechanism to define the registration point of the root container with respect to any assumed underlying presentation. I.e. a means to specify where the root container origin is located.
Also desirable may be a mechanism to describe how co-ordinates within the root container are scaled against the external context. (e.g. 1:1, scaled to maintain AR, scaled to fit [current behaviour])

Regards,
John

-----Original Message-----
From: David Ronca [mailto:dronca@netflix.com]
Sent: 01 February 2013 01:56
To: public-tt@w3.org
Subject: RE: Re: Re: TTML and aspect ratio

Sean Hayes wrote:
> What we do stipulate is that whatever mechanism is used, the caption root container should be 100% aligned to the presented video.

This still presents a difficulty. We have received many 608 captions for content that is 16:9. In this use case, the root container needs to be a 4:3 area that is the full height of the video window, with a width of 4/3 of that height, horizontally centered.



Received on Monday, 4 February 2013 10:43:23 UTC