RE: ISSUE-179 (px measure): Interpreting the pixel measure [DFXP 1.0] from Michael A Dolan on 2012-08-26 (public-tt@w3.org from August 2012)

From: Michael A Dolan <mdolan@newtbt.com>
Date: Sun, 26 Aug 2012 15:52:33 -0700
To: "'Timed Text Working Group'" <public-tt@w3.org>
Message-ID: <009701cd83dd$7c7cd7e0$757687a0$@newtbt.com>

I think the only thing that makes sense is to align this with the video coding resolution, as Sean proposes. Device dots are, in most cases, unknown to the TTML author. But it is at least possible that the author would be aware of some companion video track coding resolution. I believe most users of TTML have already assumed this (and not device dots).

We need to be careful, though, and leave the definition of how one arrives at a video coding resolution application dependent. Either that, or we require tts:extent on <tt> for any document that uses "px", effectively locking in the root container dimensions in terms of "pixels".

Regards,

 Mike

-----Original Message-----
From: Timed Text Working Group Issue Tracker [mailto:sysbot+tracker@w3.org] 
Sent: Friday, August 24, 2012 4:18 AM
To: public-tt@w3.org
Subject: ISSUE-179 (px measure): Interpreting the pixel measure [DFXP 1.0]

ISSUE-179 (px measure): Interpreting the pixel measure [DFXP 1.0]

http://www.w3.org/AudioVideo/TT/tracker/issues/179

Raised by: Sean Hayes
On product: DFXP 1.0

We have defined the pixel measure with reference to XSL, namely:

The actual distance covered by the largest integer number of device dots (the size of a device dot is measured as the distance between dot centers) that spans a distance less-than-or-equal-to the distance specified by the arc-span rule in http://www.w3.org/TR/REC-CSS2//syndata.html#x39 or superceding errata.

or

a fixed conversion factor, treating 'px' as an absolute unit of measurement (such as 1/92" or 1/72").

This is leading to problems in practice as we use TTML with actual delivered video. I recall that during the development of the specification we discussed the concept that px would equate to video pixels, so that authors could align elements precisely with elements in the video; however, this seems to have been lost.

I suggest that we need to clarify the pixel behaviour; in particular, we should explain that where extent is used on the tt element, this effectively *defines* the size of a pixel, in the sense that the root area extent is still mapped to a video overlay and divided into 'logical pixels'. If such an extent is not defined on the tt element, then the px measure should probably not be used (or should even be expressly forbidden).
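The rule suggested above could be checked mechanically. The following is only a rough Python sketch, not anything defined by the specification: the function name px_without_root_extent is hypothetical, the regular expressions are a simplified notion of a px length, and the check reflects the suggestion in this message rather than an agreed requirement. It reports whether a document uses px in any tts: attribute without a px-valued tts:extent on the tt element.

```python
import re
import xml.etree.ElementTree as ET

# TTML 1.0 styling namespace (the namespace of the tts: attributes).
TTS_NS = "http://www.w3.org/ns/ttml#styling"

# Simplified patterns (assumption): a px length, and an extent of two px lengths.
PX_LENGTH = re.compile(r"\d+(?:\.\d+)?px\b")
PX_EXTENT = re.compile(r"\s*\d+(?:\.\d+)?px\s+\d+(?:\.\d+)?px\s*")


def px_without_root_extent(ttml_text):
    """Return True if px is used but the tt element has no px-valued tts:extent."""
    root = ET.fromstring(ttml_text)
    extent = root.get(f"{{{TTS_NS}}}extent", "")
    root_defines_px = PX_EXTENT.fullmatch(extent) is not None
    # Look at every tts: attribute on every element, including the root.
    uses_px = any(
        PX_LENGTH.search(value)
        for elem in root.iter()
        for name, value in elem.attrib.items()
        if name.startswith(f"{{{TTS_NS}}}")
    )
    return uses_px and not root_defines_px
```

A document whose region carries tts:origin="10px 20px" but whose tt element has no tts:extent would be flagged; adding tts:extent="1024px 768px" to tt clears the flag.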

For example, let's say that the author has a video nominally[1] 1024x768 in extent. However, it is being displayed fullscreen on a 1920x2000 monitor (with 280px of black above and below the video). If we use device dots as XSL suggests, the captions will not be aligned correctly with the video.

If the px unit is used on the extent attribute on the tt element (extent="1024px 768px"), I believe the expectation was that the root element is scaled to 1920x1440 along with the video and placed in correspondence with the video, and so the px metric actually means 1.875 device dots.
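The arithmetic in this example can be written out as a short Python sketch. It assumes the video is scaled uniformly (aspect ratio preserved) to the largest size that fits the display and is then centred; the function name px_to_device_dots and that fitting policy are illustrative assumptions, not something the specification defines.

```python
from fractions import Fraction


def px_to_device_dots(root_extent, display):
    """Map a root extent in logical px onto a display measured in device dots.

    Assumes uniform scaling to the largest size that fits, centred on the
    display (an assumption for this sketch). Returns
    (scale, scaled_size, offset), where scale is device dots per logical px.
    """
    root_w, root_h = root_extent
    disp_w, disp_h = display
    # The tighter of the two ratios keeps the whole video on screen.
    scale = min(Fraction(disp_w, root_w), Fraction(disp_h, root_h))
    scaled_w = root_w * scale
    scaled_h = root_h * scale
    # Any leftover space is split evenly on either side.
    offset_x = (disp_w - scaled_w) / 2
    offset_y = (disp_h - scaled_h) / 2
    return scale, (scaled_w, scaled_h), (offset_x, offset_y)


scale, size, offset = px_to_device_dots((1024, 768), (1920, 2000))
print(float(scale), size, offset)
```

For the figures above, this gives a scale of 1.875 device dots per px, a scaled size of 1920x1440, and a vertical offset of 280 device dots, matching the black bars described.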

[1] Note also that the pixel extent is not the actual delivered pixel density of the video either, since in an adaptive streaming model the actual frame size may vary depending on bandwidth. It needs to be an authoring concept based on the original coding size of the video.

Received on Sunday, 26 August 2012 22:53:03 UTC