Re: ISSUE-317 (IMSC should not require frame alignment): IMSC should not require frame alignment [TTML IMSC 1.0] from Nigel Megitt on 2014-05-23 (public-tt@w3.org from May 2014)

From: Nigel Megitt <nigel.megitt@bbc.co.uk>
Date: Fri, 23 May 2014 16:19:23 +0000
To: Pierre-Anthony Lemieux <pal@sandflow.com>
CC: Timed Text Working Group <public-tt@w3.org>
Message-ID: <CFA53330.1E48F%nigel.megitt@bbc.co.uk>

Hi Pierre,

>Hi Nigel,
>
>> 2. Video encoded for distribution at 7.5fps.
>
>Can you point to actual use case/deployment for this?

Yes - it's likely that adaptive streaming distribution mechanisms will do
this. Profiles that go down to e.g. 25/4 = 6.25 fps are likely to be used
for mobile devices. I assume that something similar will result from a
starting point of 30 fps (30/4 = 7.5 fps).
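
To make the effect concrete, here is a rough sketch (not taken from the
spec) of what aligning a caption time to frame boundaries does at these
rates. The function name snap_to_frame, the nearest-frame rounding and the
sample time of 10.1 s are all assumptions for illustration; the draft's
actual alignment rule may round differently.

```python
from fractions import Fraction
import math


def snap_to_frame(t, fps):
    """Snap media time t (seconds) to the nearest frame boundary at fps.

    Hypothetical helper: nearest-frame rounding is an assumption, not
    the rule from IMSC 1.0 section 4.4.
    """
    frame = Fraction(1) / Fraction(fps)          # frame duration in seconds
    n = math.floor(t / frame + Fraction(1, 2))   # index of the nearest frame
    return n * frame


# A caption time authored on a 30 fps frame boundary (sample value).
t = Fraction("10.1")

for fps in ("30", "7.5", "6.25", "60"):
    snapped = snap_to_frame(t, Fraction(fps))
    error_ms = float((snapped - t) * 1000)
    print(f"{fps:>5} fps: shown at {float(snapped):.4f} s, error {error_ms:+.1f} ms")
```

With this rounding the worst-case error is half a frame period, so it
grows as the encoded frame rate drops, while a 60 Hz display could show
the sample time exactly.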

Kind regards,

Nigel

>On Fri, May 23, 2014 at 9:11 AM, Nigel Megitt <nigel.megitt@bbc.co.uk>
>wrote:
>> Hi Pierre,
>>
>> It would certainly help to explain the model more clearly, along the
>> lines that you've outlined. The specific proposal wouldn't help though,
>> since the precise timing information would have been lost at the point
>> of temporal quantisation and could not be regenerated later.
>>
>> For example this suggests a chain such as:
>>
>> 1. IMSC Document authored against video at 30fps.
>> 2. Video encoded for distribution at 7.5fps.
>> 3. Receiving system must align the resolved TTML time expressions with
>> the 7.5fps 'quanta' prior to display, as per the rule in the current
>> spec.
>>
>> Even if the display device refreshes at 60fps it would be forbidden from
>> using the original timings because the spec references the encoded
>> video.
>>
>> What I'm trying to get to is a solution that is permitted (actually
>> encouraged) to align with display frames as late as possible while
>> losing minimal information. In some real world systems that's
>> unavoidably earlier than the display, but we shouldn't use the lowest
>> common denominator to set the rule for all implementations.
>>
>> Kind regards,
>>
>> Nigel
>>
>>
>> On 23/05/2014 16:33, "Pierre-Anthony Lemieux" <pal@sandflow.com> wrote:
>>
>>>Hi Nigel,
>>>
>>>IMSC 1.0 §4.4 [1] refers to synchronization with the related video
>>>object against which the timed text content is delivered, not
>>>synchronization to the displayed frame rate by the
>>>terminal/UA/device/display/TV. In other words, if a
>>>terminal/UA/device/display/TV chooses to alter the video frame rate of
>>>the related video object it receives (for whatever reason), then I
>>>expect it will accordingly alter the timed text display (perhaps along
>>>the lines of what is suggested below), with the knowledge that the
>>>timed text was authored according to the constraints of Section 4.4.
>>>
>>>Would a note to that effect help?
>>>
>>>Thanks,
>>>
>>>-- Pierre
>>>
>>>On Thu, May 22, 2014 at 3:19 AM, Timed Text Working Group Issue
>>>Tracker <sysbot+tracker@w3.org> wrote:
>>>> ISSUE-317 (IMSC should not require frame alignment): IMSC should not
>>>> require frame alignment [TTML IMSC 1.0]
>>>>
>>>> http://www.w3.org/AudioVideo/TT/tracker/issues/317
>>>>
>>>> Raised by: Nigel Megitt
>>>> On product: TTML IMSC 1.0
>>>>
>>>> IMSC 1.0 §4.4 [1] currently requires temporal quantisation of media
>>>> times to frame display times. This rule comes into play when times are
>>>> not expressed in frames, and therefore the same document may apply to a
>>>> range of related media objects covering different frame rates. In the
>>>> case when frames are used the document can only be displayed alongside
>>>> media of the same frame rate so there's no need for the frame alignment
>>>> expression.
>>>>
>>>> This approach prevents implementations from changing caption display
>>>> at screen refresh rate quantisation and enforces quantisation based on
>>>> the encoded video frame rate. This means that if a low frame rate video
>>>> is provided, e.g. quarter rate which could be around 6 frames per
>>>> second, the effective word reading rate may be increased to the point
>>>> where text becomes hard to read.
>>>>
>>>> Consider a streaming environment in which there is enough network
>>>> capacity to provide audio and captions but the video experience is
>>>> badly impacted: in this case it must be permitted that the
>>>> implementation continue to present captions alongside the audio
>>>> regardless of the frames of video that are displayed.
>>>>
>>>> I propose a solution to this problem: that implementations SHALL
>>>> display captions as temporally close to the specified media time as
>>>> the display device permits, independent of video frame rate.
>>>>
>>>> Note that where frames are used in media time expressions this reduces
>>>> to exactly the current behaviour.
>>>>
>>>> [1]
>>>> https://dvcs.w3.org/hg/ttml/raw-file/ea1a92310a27/ttml-ww-profiles/ttml-ww-profiles.html#synchronization
>>>>
>>>>
>>>>
>>
Received on Friday, 23 May 2014 16:19:53 UTC