Subject: Re: ISSUE-317 (IMSC should not require frame alignment): IMSC should not require frame alignment [TTML IMSC 1.0]

From: Pierre-Anthony Lemieux <pal@sandflow.com>
Date: Fri, 23 May 2014 09:20:43 -0700
To: Nigel Megitt <nigel.megitt@bbc.co.uk>
Cc: Timed Text Working Group <public-tt@w3.org>
Message-ID: <CAF_7JxBbRKr136KGmruerG3ZmJ6Xiyvmq9=xEpgf5fNCGw65Dw@mail.gmail.com>
> Yes - it's likely that adaptive streaming distribution mechanisms
> will do this.

Do they do this today?

Best,

-- Pierre

On Fri, May 23, 2014 at 9:19 AM, Nigel Megitt <nigel.megitt@bbc.co.uk> wrote:
> Hi Pierre,
>
>>Hi Nigel,
>>
>>> 2. Video encoded for distribution at 7.5fps.
>>
>>Can you point to actual use case/deployment for this?
>
> Yes - it's likely that adaptive streaming distribution mechanisms will do
> this. Profiles for this purpose that go down to e.g. 25/4 = 6.25 fps are
> likely to be used for mobile devices, for example. I assume that something
> similar will result from a starting point of 30fps.
>
> Kind regards,
>
> Nigel
>
>>On Fri, May 23, 2014 at 9:11 AM, Nigel Megitt <nigel.megitt@bbc.co.uk> wrote:
>>> Hi Pierre,
>>>
>>> It would certainly help to explain the model more clearly, along the
>>> lines that you've outlined. The specific proposal wouldn't help though,
>>> since the precise timing information would have been lost at the point of
>>> temporal quantisation and could not be regenerated later.
>>>
>>> For example this suggests a chain such as:
>>>
>>> 1. IMSC Document authored against video at 30fps.
>>> 2. Video encoded for distribution at 7.5fps.
>>> 3. Receiving system must align the resolved TTML time expressions with
>>> the 7.5fps 'quanta' prior to display, as per the rule in the current spec.
>>>
>>> Even if the display device refreshes at 60fps it would be forbidden from
>>> using the original timings because the spec references the encoded
>>> video.
>>>
>>> What I'm trying to get to is a solution that is permitted (actually
>>> encouraged) to align with display frames as late as possible while
>>> losing minimal information. In some real world systems that's unavoidably
>>> earlier than the display, but we shouldn't use the lowest common
>>> denominator to set the rule for all implementations.
>>>
>>> Kind regards,
>>>
>>> Nigel
>>>
>>>
>>> On 23/05/2014 16:33, "Pierre-Anthony Lemieux" <pal@sandflow.com> wrote:
>>>
>>>>Hi Nigel,
>>>>
>>>>IMSC 1.0 §4.4 [1] refers to synchronization with the related video
>>>>object against which the timed text content is delivered, not
>>>>synchronization to the displayed frame rate by the
>>>>terminal/UA/device/display/TV. In other words, if a
>>>>terminal/UA/device/display/TV chooses to alter the video frame rate of
>>>>the related video object it receives (for whatever reason), then I
>>>>expect it will accordingly alter the timed text display (perhaps along
>>>>the lines of what is suggested below), with the knowledge that the
>>>>timed text was authored according to the constraints of Section 4.4.
>>>>
>>>>Would a note to that effect help?
>>>>
>>>>Thanks,
>>>>
>>>>-- Pierre
>>>>
>>>>On Thu, May 22, 2014 at 3:19 AM, Timed Text Working Group Issue
>>>>Tracker <sysbot+tracker@w3.org> wrote:
>>>>> ISSUE-317 (IMSC should not require frame alignment): IMSC should not
>>>>> require frame alignment [TTML IMSC 1.0]
>>>>>
>>>>> http://www.w3.org/AudioVideo/TT/tracker/issues/317
>>>>>
>>>>> Raised by: Nigel Megitt
>>>>> On product: TTML IMSC 1.0
>>>>>
>>>>> IMSC 1.0 §4.4 [1] currently requires temporal quantisation of media
>>>>> times to frame display times. This rule comes into play when times are
>>>>> not expressed in frames, and therefore the same document may apply to a
>>>>> range of related media objects covering different frame rates. In the
>>>>> case when frames are used the document can only be displayed alongside
>>>>> media of the same frame rate so there's no need for the frame alignment
>>>>> expression.
>>>>>
>>>>> This approach prevents implementations from changing caption display
>>>>> at screen refresh rate quantisation and enforces quantisation based on
>>>>> the encoded video frame rate. This means that if a low frame rate video
>>>>> is provided, e.g. quarter rate which could be around 6 frames per
>>>>> second, the effective word reading rate may be increased to the point
>>>>> where text becomes hard to read.
>>>>>
>>>>> Consider a streaming environment in which there is enough network
>>>>> capacity to provide audio and captions but the video experience is
>>>>> badly impacted: in this case it must be permitted that the
>>>>> implementation continue to present captions alongside the audio
>>>>> regardless of the frames of video that are displayed.
>>>>>
>>>>> I propose, as a solution to this problem, that implementations SHALL
>>>>> display captions as temporally close to the specified media time as the
>>>>> display device permits, independent of video frame rate.
>>>>>
>>>>> Note that where frames are used in media time expressions this reduces
>>>>> to exactly the current behaviour.
>>>>>
>>>>> [1] https://dvcs.w3.org/hg/ttml/raw-file/ea1a92310a27/ttml-ww-profiles/ttml-ww-profiles.html#synchronization
>>>>>
>>>>>
>>>>>
>>>
>
Received on Friday, 23 May 2014 16:21:32 UTC
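
For readers following the 30fps / 7.5fps / 60fps chain discussed in the thread, the arithmetic can be sketched roughly as below. This is only an illustration under stated assumptions: the function names are hypothetical, and rounding up to the next frame boundary (ceiling) is an assumption of this sketch, not something stated in the messages above. It compares how far a caption onset authored on a 30fps frame boundary moves when aligned to frames at each rate.

```python
from fractions import Fraction
import math


def quantise(media_time: Fraction, frame_rate: Fraction) -> Fraction:
    """Snap media_time (seconds) to the first frame boundary at or after it.

    Ceiling rounding is an assumption of this sketch.
    """
    frame_index = math.ceil(media_time * frame_rate)
    return Fraction(frame_index) / frame_rate


def quantisation_delay(media_time: Fraction, frame_rate: Fraction) -> Fraction:
    """Return how much later (seconds) the quantised time is than media_time."""
    return quantise(media_time, frame_rate) - media_time


# Hypothetical caption onset authored on frame 29 of 30fps video.
authored = Fraction(29, 30)

for label, rate in (
    ("authoring rate, 30fps", Fraction(30)),
    ("encoded video, 7.5fps", Fraction(15, 2)),
    ("display refresh, 60fps", Fraction(60)),
):
    delay = quantisation_delay(authored, rate)
    print(f"{label}: onset at {quantise(authored, rate)} s, delay {delay} s")
```

Under these assumptions the onset is unchanged at 30fps and 60fps, but is pushed back by 1/10 s when aligned to the 7.5fps encoded video, which is the loss of timing information the thread is concerned with.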