RE: TT and subtitling/captioning - temporal flow of content from Erik Hodge on 2003-08-07 (public-tt@w3.org from August 2003)

From: Erik Hodge <ehodge@real.com>
Date: Thu, 07 Aug 2003 10:44:57 -0700
To: Johnb@screen.subtitling.com, glenn@xfsi.com
Cc: public-tt@w3.org
Message-Id: <5.1.0.14.2.20030807103645.0435b978@mailone.real.com>
3GPP Timed Text uses an overall duration for display of a block of text 
along with a scroll-in + delay + scroll-out.  The (optional) scroll-in and 
(optional) scroll-out each do not have explicit duration but rather their 
durations are calculated using the text's duration minus the delay.  This, 
I think, would work for what you need, although it sounds like you'd want 
possibly multiple delay periods based on the number of lines of text (tl) 
and the number of lines of display (dl).  If there were a total delay time
(d) then the number of delay periods (pd), spread evenly throughout the
total duration, would be tl/dl rounded up to the nearest whole number, and
the delay of each would be d/pd.
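
As a rough sketch of that arithmetic (in Python; the function and parameter
names are made up for illustration and are not 3GPP Timed Text syntax),
taking each period's share as the total delay divided by the number of
periods:

```python
def delay_periods(tl: int, dl: int, d: float) -> tuple[int, float]:
    """Return (pd, delay per period) for tl text lines shown dl at a time.

    tl = lines of text, dl = lines of display, d = total delay in seconds.
    """
    if tl <= 0 or dl <= 0:
        raise ValueError("tl and dl must be positive")
    pd = -(-tl // dl)  # integer ceiling of tl/dl
    return pd, d / pd


# 10 lines of text, 2 lines visible at once, 15 s of total delay
print(delay_periods(10, 2, 15.0))  # (5, 3.0)
```

So ten lines shown two at a time would get five delay periods of three
seconds each, if 15 seconds of delay were available.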

         - Erik

At 05:54 PM 8/7/2003 +0100, Johnb@screen.subtitling.com wrote:
>Glenn,
>
>You wrote:
>
>I'm afraid I'm still not following your description. Could you try to put 
>together an example of what you mean using some of the vocabulary we have
>been describing? If you could create some images of how it would look over 
>time, then I could understand better.
>G.
>
>[JB> ]  Ok - tall order - but I'll try.....
>
>Starting with a piece of text from which I have deliberately removed the
>line breaks etc. Note the time constraints: in-cue before, out-cue after.
>
>00:02:43.70
>Ladies and gentlemen, Ladies and gentlemen! I want to congratulate each 
>and every one of you for making this one of the greatest years in the 
>history of the Nakatomi Corporation. On behalf of the Chief Executive 
>Officer, Mr Ozu, and the Board of Directors, we thank you one and all and 
>wish you a merry Christmas and a happy New Year!
>00:03:08.63
>
>Duration of entire section is approximately 25 seconds.
>
>Now this is a subtitle/caption to be displayed (using Teletext) on a two 
>row subtitle/caption region. Each Teletext row only holds 37 active 
>characters in double height white. We can't grow the region.
>
>So what my ideal UA would do is flow this text into the region according 
>to certain rules.
>
>Rule 1 - content should be displayed long enough to be read.
>
>Implication is that last added content must 'hang' for a period. We should 
>work backwards from the outcue when determining the interim timings.
>
>Let's posit a read time of 3 seconds for a two line subtitle.
>
>From content alone and the encompassing period we can work out:
>
>Maximum of 37 X 2 characters displayed per refresh of subtitle/caption = 
>74 characters.
>
>Above text is 62 words, 334 characters including spaces (or so MS Word
>tells me)
>
>so 334 / 74 = 4.5 refreshes of the region to display all the content. We
>can't have half a refresh - so 5 unique display occurrences of the region.
>
>25 seconds divided by five gives us approximately 5 seconds per display -
>which fits the reading time nicely.
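
A rough check of this arithmetic in Python. It assumes greedy word wrapping
via the standard textwrap module (a real UA may break lines differently) and
that the timecodes are HH:MM:SS with fractional seconds. All names are
illustrative only. Dividing the character count by the region capacity
(37 x 2 = 74) gives a lower bound on the number of displays; wrapping the
text shows how many are actually needed.

```python
import math
import textwrap

ROW_WIDTH = 37  # active characters per Teletext row (from the message)
ROWS = 2        # rows in the subtitle/caption region

TEXT = (
    "Ladies and gentlemen, Ladies and gentlemen! I want to congratulate "
    "each and every one of you for making this one of the greatest years "
    "in the history of the Nakatomi Corporation. On behalf of the Chief "
    "Executive Officer, Mr Ozu, and the Board of Directors, we thank you "
    "one and all and wish you a merry Christmas and a happy New Year!"
)


def to_seconds(timecode: str) -> float:
    """Parse HH:MM:SS.ss, assuming the last field is fractional seconds."""
    hours, minutes, seconds = timecode.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def paginate(text: str, width: int = ROW_WIDTH, rows: int = ROWS):
    """Greedy word wrap, then group the rows into region-sized displays."""
    lines = textwrap.wrap(text, width=width)
    return [lines[i:i + rows] for i in range(0, len(lines), rows)]


duration = to_seconds("00:03:08.63") - to_seconds("00:02:43.70")
lower_bound = math.ceil(len(TEXT) / (ROW_WIDTH * ROWS))
pages = paginate(TEXT)
per_page = duration / len(pages)

print(f"duration {duration:.2f} s, at least {lower_bound} displays")
print(f"{len(pages)} displays after wrapping, {per_page:.2f} s each")
for page in pages:
    print(page)
```

Under these assumptions the text wraps into 10 rows, i.e. 5 two-row
displays of roughly 5 seconds each, which is above the 3 second read time.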
>
>We probably want control over the mark-space ratio (i.e. the on-air /
>off-air timing for the region) - typically, to 'notify' the reader that
>the content has changed, a small gap is left between displays.
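
One possible way to turn that into per-display timings, sketched in Python.
The function name, the even split of the period and the 0.2 second gap are
all assumptions for illustration; nothing here is proposed TT syntax. The
last display is held until the out-cue so that it 'hangs' as per Rule 1.

```python
def schedule(in_cue: float, out_cue: float, displays: int, gap: float = 0.2):
    """Split [in_cue, out_cue] into equal slots with a blank gap between.

    Returns a list of (on, off) times in seconds. The gap is taken from
    the end of each slot except the last, which runs to the out-cue.
    """
    if displays <= 0:
        raise ValueError("displays must be positive")
    slot = (out_cue - in_cue) / displays
    if gap >= slot:
        raise ValueError("gap must be shorter than one slot")
    cues = []
    for i in range(displays):
        on = in_cue + i * slot
        off = out_cue if i == displays - 1 else on + slot - gap
        cues.append((on, off))
    return cues


# 00:02:43.70 to 00:03:08.63 expressed in seconds, five displays
for on, off in schedule(163.70, 188.63, 5):
    print(f"on {on:7.2f}  off {off:7.2f}")
```

The mark-space ratio is then (slot - gap) / gap for every display except
the last one.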
>
>Ok that roughly covers the temporal flow..... but there are other aspects 
>concerned with how the content is put into the region.
>
>The above assumes that the content is all presented simultaneously as a
>full region... there are a number of alternative ways of filling and
>clearing the region throughout the 25 seconds. e.g.
>
>line-by-line.
>word-by-word.
>character-by-character.
>
>Further I have assumed that the region is cleared and refilled (pop mode), 
>but it is equally valid to consider cases where new content displaces 
>existing content (i.e. pushes it out - push mode).
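
To make the pop / push distinction concrete, here is a small Python sketch
of the successive region contents for line-by-line filling. The function
names are hypothetical, and push mode is modelled simply as each new line
displacing the oldest visible one.

```python
def pop_states(lines, rows=2):
    """Pop mode: the region is cleared and refilled with the next rows."""
    return [lines[i:i + rows] for i in range(0, len(lines), rows)]


def push_states(lines, rows=2):
    """Push mode: each new line pushes the oldest visible line out."""
    return [lines[max(0, i - rows + 1):i + 1] for i in range(len(lines))]


lines = ["line 1", "line 2", "line 3", "line 4", "line 5"]
print(pop_states(lines))
print(push_states(lines))
```

For five lines and a two-row region, pop mode gives three region states
and push mode gives five, so the choice of mode also changes how the
overall period has to be divided.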
>
>
>regards
>John Birch
>
>The views and opinions expressed are the author's own and do not necessarily
>reflect the views and opinions of Screen Subtitling Systems Limited.
>Glenn,
>
>Tackling just the temporal flow issue - I'm still digesting the style 
>separation feedback.....
>A second question....
>
>It would be desirable for TT (at least IMHO) to include mechanisms for 
>describing the temporal breaking of content.
>What I am thinking of is a document that does not describe explicitly the 
>timing for all of the content
>- but rather describes that X amount of content fits into a box of size Y 
>over a time period of Z.
>Now if the content X is too large for box Y - how does the content get 
>over(?)flowed in a 'temporal sense' through the box.
>I'm not sure I'm following your scenario here. Are you saying you want 
>individual characters, words, lines, etc. to appear in box Y over time, 
>and do so without explicitly timing each unit?
>[JB> ] That's exactly it. No explicit timing - but an overall timing. For 
>example timing is specified for a paragraph of text (multiple lines) to be 
>'rendered' into a nominally single line region over that time period.
>If so, I can see some possible problems, such as (1) needing to specify 
>the granularity of content to be timed (i.e., character, word, etc.); (2) 
>which would entail the need to formally specify how to subdivide content 
>lacking markup into such units.
>[JB> ] Yes - it would - but this is what I see as part of the essence of 
>timed text - a description of the behaviour of text over time.
>While this might make the content of a TT-AF file smaller,
>[JB> ] This isn't a size of file issue - rather it's a usability issue. By 
>being able to specify how you want the user agent to react in situations 
>of overflow - by spreading the text temporally cf (as well as) the CSS 
>scroll / marquee concepts, I see the following advantages:
>
>It allows faster authoring of content.
>
>It also potentially allows the creation of style templates that work more
>universally for text - they need not be so tied to specific text.
>
>A user agent that is able to take the role of distributing text over time
>would produce more consistent results.
>
>The translation of one language to another need not involve a 'knife and
>fork' re-edit of the file contents.
>
>it would also be possible to do this by animating the visibility property 
>of individual units explicitly, making decisions about what constitute 
>units at authoring time, e.g.,
>[JB> ] Snip 'knife and fork' explicitly timed example.
>Yes, but this example has explicit timing. If the text is modified in
>length - you have to modify the timing. Different language (or reading
>level) instances of a given text content will differ in length, yet in a
>subtitling scenario - and many others I suspect - they will be constrained
>to display within the same specific display period that cannot be
>extended. Ideally TT-AF would allow the modification (or substitution) of
>content without the explicit requirement to adjust the number of, and
>timing of, multiple cue elements.
Received on Thursday, 7 August 2003 13:43:37 UTC