- From: Erik Hodge <ehodge@real.com>
- Date: Fri, 08 Aug 2003 10:05:47 -0700
- To: Johnb@screen.subtitling.com
- Cc: public-tt@w3.org
- Message-Id: <5.1.0.14.2.20030808100113.01990b90@mailone.real.com>
Well described, thanks. I'm curious what happens when the read-interval plus the add-interval exceeds the desired duration of a subtitle, e.g., two people are talking and the first one says three lines of text very quickly, in say 3 seconds, followed immediately by person 2 saying something. If the add-interval is 0.5 seconds and the read-interval is 4 seconds, then the display of subsequent subtitle(s) will (a) fall behind by 1.5 seconds, or (b) be cut off.

Thanks,
- Erik

At 10:44 AM 8/8/2003 +0100, Johnb@screen.subtitling.com wrote:
>Erik,
>
>Part of the 'problem' with using TT-AF for subtitling is that there are
>existing distribution formats for subtitling / captioning. There are also
>existing standards (or at least strong conventions) as to how
>captions/subtitles are displayed. In order to be a successful conveyance
>for subtitle/caption information, TT-AF must be able to encode the current
>author-intended display effects - prior to the transfer of the carried
>content into a distribution format. I do not believe that the current
>style standards proposed for TT have the richness required to do this. It
>**may** be possible within the TT-AF to explicitly define all of the
>timing, and define multiple style rules in such a manner as to achieve
>some of the effects desired - but it will be no improvement over the
>existing 'proprietary' standards. In fact it is likely to be considerably
>harder - as these standards intrinsically support the appropriate display
>concepts. Two important considerations are:
>
>1) Subtitles/captions are designed to fit within a fixed size region. It
>is truer to say that the subtitle fits the block - rather than the block
>fits the content. This is a conceptual difference from most web styling
>concepts. In subtitling/captioning - the region can only be of limited
>dimensions - font sizes are restricted, etc.
>
>2) The entire process of subtitling/captioning is the control of temporal
>overflow - basically by
>a) reducing the amount of content (word substitution, excision of
>irrelevant or superfluous text, etc)
>and b) by spreading the content in the time direction - using an
>understanding of the reading speed of the target audience and an
>understanding of where it is appropriate to break the text.
>
>There are a number of ways of typically displaying subtitles/captions -
>I'll outline some of them descriptively below:
>(Please note this is not an exhaustive list)
>
>normal 'pop' modes:
>
>Typically - subtitles are displayed 'in toto' for a reading interval, then
>a short no subtitle visible period occurs, then the next subtitle is
>shown. The subtitle region will sometimes vary in size to suit the amount
>of text displayed - and typically the spacing of the subtitles is kept
>fairly regular (it must be remembered this is a human edited process - so
>there is a degree of variation - part of the art of good subtitling is
>maintaining the reading 'flow').
>
>So you might have a mixture of two and three line subtitles (typically
>more two line than three), on screen for 4 to 5 seconds, spaced by 0.5
>second intervals.
>
>line-by-line modes:
>The subtitle region may be filled a line at a time - each line added,
>possibly with existing content moving to make space, after successive time
>intervals, until it reaches a fill limit (e.g. three lines). At this
>point, after a reading interval for the last line the region may clear and
>the process restarts.
>
>Alternatively - once a region is full, the top line may be removed and all
>the content shifts up - then the new line(s) insert underneath (assuming
>western writing mode...)
>This would continue until a significant pause in the subtitling - when the
>subtitle clears.
>
>snake modes
>A variation on the line by line ideas - where the added content is in
>words or fragments.
>Typically the snake fills the line, then the region
>acts as if in line-by-line mode (i.e. the lines move up).
>
>
>=====================================================
>
>Fundamentally however, this issue comes down to a couple of questions:
>
>a) Should / will TT-AF support temporal flow - i.e. a relaxed
>(non-explicit) mechanism for placing text content into a region over time.
>b) If a) is yes - then how is the concept best supported. My personal view
>is that what should be developed are a set of attributes/elements that
>allow the definition of temporal-overflow. Some candidates might include:
>
>fill-direction - regardless of writing mode - in subtitling/captioning -
>regions are filled from different directions depending on where they are
>on the screen. E.g. a top of screen subtitle will use the uppermost line
>first - then the second etc... Conversely a bottom of screen subtitle will
>use the lowest line first - then the bottom two lines etc. This is to
>minimise the intrusion of the subtitle into the central picture area. The
>UA would need a 'hint' in order to decide which direction is appropriate.
>
>fill-mode - basically the size of content used when filling a region -
>e.g. all | line | word | fragment.
>region-full-clear - is the region cleared when it fills - or does content
>shift to make space - and by what extent (none | all | line | word | fragment)
>add-interval - A desired (target) interval between additions (auto | value)
>read-interval - The desired (target) read-interval - how long the last
>content must 'hang' to allow reading.
>tidemark - A subtle wrinkle - you may wish to nominally have just two line
>subtitles - but allow three liners if the amount of content demands it.
>The tidemark would define when to typically consider a clear down in pop
>mode - but might be overridden by the content / time demands.
>Of course these concepts are not just limited to TT-AF for subtitling /
>captioning - but have application in many other areas....
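The candidate attributes listed above can be made concrete with a small sketch. Everything here is hypothetical: the class name `TemporalFlow`, the field names, the default values and the `overrun` helper are illustrations only, not TT-AF syntax or any proposed vocabulary. The last lines reproduce the case raised in the reply at the top of this message (3 seconds available, add-interval 0.5 s, read-interval 4 s).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TemporalFlow:
    """Hypothetical container for the candidate attributes (not TT-AF syntax)."""
    fill_direction: str = "bottom-up"     # assumed values: "top-down" | "bottom-up"
    fill_mode: str = "all"                # all | line | word | fragment
    region_full_clear: str = "all"        # none | all | line | word | fragment
    add_interval: Optional[float] = None  # seconds; None stands in for "auto"
    read_interval: float = 4.0            # seconds the last content must 'hang'
    tidemark: int = 2                     # nominal number of lines per subtitle


def overrun(flow: TemporalFlow, available: float) -> float:
    """Seconds by which add-interval + read-interval exceed the time available."""
    needed = (flow.add_interval or 0.0) + flow.read_interval
    return max(0.0, needed - available)


flow = TemporalFlow(add_interval=0.5, read_interval=4.0)
print(overrun(flow, available=3.0))  # 1.5
```

A non-zero overrun is the point at which a user agent would have to choose between letting later subtitles fall behind and cutting the current one short; the sketch only measures the conflict and does not pick a policy.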
>
>regards
>John Birch
>
>The views and opinions expressed are the author's own and do not necessarily
>reflect the views and opinions of Screen Subtitling Systems Limited.
>-----Original Message-----
>From: Erik Hodge [mailto:ehodge@real.com]
>Sent: 07 August 2003 17:45
>To: Johnb@screen.subtitling.com; glenn@xfsi.com
>Cc: public-tt@w3.org
>Subject: RE: TT and subtitling/captioning - temporal flow of content
>
>3GPP Timed Text uses an overall duration for display of a block of text
>along with a scroll-in + delay + scroll-out. The (optional) scroll-in and
>(optional) scroll-out each do not have explicit duration but rather their
>durations are calculated using the text's duration minus the delay. This,
>I think, would work for what you need, although it sounds like you'd want
>possibly multiple delay periods based on the number of lines of text (tl)
>and the number of lines of display (dl). If there was a total delay time
>(d) then the number of delay periods (pd), spread evenly throughout the
>total duration, would be tl/dl rounded up to the nearest whole number, and
>the delay of each would be d/pd.
>
> - Erik
>
>At 05:54 PM 8/7/2003 +0100, Johnb@screen.subtitling.com wrote:
>>Glenn,
>>
>>You wrote:
>>
>>I'm afraid I'm still not following your description. Could you try to put
>>together an example of what you mean using some of the vocabulary we have
>>been describing? If you could create some images of how it would look
>>over time, then I could understand better.
>>G.
>>[JB> ] Ok - tall order - but I'll try.....
>>
>>Starting with a piece of text from which I have deliberately removed the
>>line breaks etc. Note the time constraint, in-cue before out-cue after.
>>
>>00:02:43.70
>>Ladies and gentlemen, Ladies and gentlemen! I want to congratulate each
>>and every one of you for making this one of the greatest years in the
>>history of the Nakatomi Corporation.
>>On behalf of the Chief Executive
>>Officer, Mr Ozu, and the Board of Directors, we thank you one and all and
>>wish you a merry Christmas and a happy New Year!
>>00:03:08.63
>>
>>Duration of entire section is approximately 25 seconds.
>>
>>Now this is a subtitle/caption to be displayed (using Teletext) on a two
>>row subtitle/caption region. Each Teletext row only holds 37 active
>>characters in double height white. We can't grow the region.
>>
>>So what my ideal UA would do is flow this text into the region according
>>to certain rules.
>>
>>Rule 1 - content should be displayed long enough to be read.
>>
>>Implication is that last added content must 'hang' for a period. We
>>should work backwards from the outcue when determining the interim timings.
>>
>>Let's posit a read time of 3 seconds for a two line subtitle.
>>
>>From content alone and the encompassing period we can work out:
>>
>>Maximum of 37 X 2 characters displayed per refresh of subtitle/caption =
>>74 characters.
>>
>>Above text is 62 words, 334 characters including spaces (or so MS word
>>tells me)
>>
>>so 334 / 74 = 4.5 refreshes of the region to display all the content. We
>>can't have half a refresh - so at least 5 unique display occurrences of
>>the region.
>>
>>25 seconds divided by five gives us approximately 5 seconds per display -
>>which comfortably covers the reading time.
>>
>>We probably want control over the mark space ratio (i.e. the on air -
>>off-air timing for the region) - typically to 'notify' the reader that the
>>content has changed a small gap is left between displays.
>>
>>Ok that roughly covers the temporal flow..... but there are other aspects
>>concerned with how the content is put into the region.
>>
>>The above assumes that the content is all presented simultaneously as a
>>full region... there are a number of alternative ways of filling and
>>clearing the region throughout the 25 seconds. e.g.
>>
>>line-by-line.
>>word-by-word.
>>character by character.
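The capacity arithmetic above can be checked mechanically. The following is a minimal sketch, not a description of any real user agent: it assumes greedy word wrapping (Python's `textwrap`), treats the cue times as decimal seconds, divides the cue window evenly with no gap between displays, and uses a hypothetical helper name `plan_displays`. It reports both the character-capacity lower bound on the number of refreshes and the number that word wrapping actually produces.

```python
import math
import textwrap

TEXT = ("Ladies and gentlemen, Ladies and gentlemen! I want to congratulate each "
        "and every one of you for making this one of the greatest years in the "
        "history of the Nakatomi Corporation. On behalf of the Chief Executive "
        "Officer, Mr Ozu, and the Board of Directors, we thank you one and all and "
        "wish you a merry Christmas and a happy New Year!")

ROW_WIDTH = 37             # active characters per Teletext row
ROWS = 2                   # rows in the subtitle region
IN_CUE = 2 * 60 + 43.70    # 00:02:43.70, assumed to be decimal seconds
OUT_CUE = 3 * 60 + 8.63    # 00:03:08.63


def plan_displays(text, in_cue, out_cue, row_width=ROW_WIDTH, rows=ROWS):
    """Wrap text to the region and spread the displays evenly over the cue window."""
    lines = textwrap.wrap(text, width=row_width)
    displays = [lines[i:i + rows] for i in range(0, len(lines), rows)]
    slot = (out_cue - in_cue) / len(displays)  # seconds per display, no gaps
    schedule = [(in_cue + n * slot, block) for n, block in enumerate(displays)]
    return schedule, slot


# Lower bound: every row filled to capacity, ignoring word boundaries.
lower_bound = math.ceil(len(TEXT) / (ROW_WIDTH * ROWS))

schedule, slot = plan_displays(TEXT, IN_CUE, OUT_CUE)
print(len(TEXT), lower_bound, len(schedule), round(slot, 2))
for start, block in schedule:
    print(f"{start:7.2f}s  {' / '.join(block)}")
```

For this text the capacity bound and the word-wrapped plan both come out at 5 displays of roughly 5 seconds each. Reserving a short gap between displays, or breaking at phrase boundaries rather than greedily, would shorten each slot or raise the count.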
>>
>>Further I have assumed that the region is cleared and refilled (pop
>>mode), but it is equally valid to consider cases where new content
>>displaces existing content (i.e. pushes it out - push mode).
>>
>>regards
>>John Birch
>>
>>The views and opinions expressed are the author's own and do not
>>necessarily
>>reflect the views and opinions of Screen Subtitling Systems Limited.
>>Glenn,
>>Tackling just the temporal flow issue - I'm still digesting the style
>>separation feedback.....
>>A second question....
>>It would be desirable for TT (at least IMHO) to include mechanisms for
>>describing the temporal breaking of content.
>>What I am thinking of is a document that does not describe explicitly the
>>timing for all of the content
>>- but rather describes that X amount of content fits into a box of size Y
>>over a time period of Z.
>>Now if the content X is too large for box Y - how does the content get
>>over(?)flowed in a 'temporal sense' through the box.
>>I'm not sure I'm following your scenario here. Are you saying you want
>>individual characters, words, lines, etc. to appear in box Y over time,
>>and do so without explicitly timing each unit?
>>[JB> ] That's exactly it. No explicit timing - but an overall timing. For
>>example timing is specified for a paragraph of text (multiple lines) to
>>be 'rendered' into a nominally single line region over that time period.
>>If so, I can see some possible problems, such as (1) needing to specify
>>the granularity of content to be timed (i.e., character, word, etc.); (2)
>>which would entail the need to formally specify how to subdivide content
>>lacking markup into such units.
>>[JB> ] Yes - it would - but this is what I see as part of the essence of
>>timed text - a description of the behaviour of text over time.
>>While this might make the content of a TT-AF file smaller,
>>[JB> ] This isn't a size of file issue - rather it's a usability issue.
>>By being able to specify how you want the user agent to react in
>>situations of overflow - by spreading the text temporally cf (as well as)
>>the CSS scroll / marquee concepts, I see the following advantages:
>>It allows a faster authoring of content.
>>It also potentially allows the creation of style templates that work more
>>universally for text - they need not be so tied to specific text.
>>A user agent that is able to take the role of distributing text over time
>>would produce more consistent results.
>>The translation of one language to another need not involve a 'knife and
>>fork' re-edit of the file contents.
>>it would also be possible to do this by animating the visibility property
>>of individual units explicitly, making decisions about what constitute
>>units at authoring time, e.g.,
>>[JB> ] Snip 'knife and fork' explicitly timed example.
>>Yes but this example has explicit timing. If the text is modified in
>>length - you have to modify the timing. Different language (or reading
>>level) instances of a given text content will differ in length, yet in a
>>subtitling scenario - and many others I suspect - they will be
>>constrained to display within the same specific display period that
>>cannot be extended. Ideally TT-AF would allow the modification (or
>>substitution) of content without the explicit requirement to adjust the
>>number of, and timing of multiple cue elements.
Received on Friday, 8 August 2003 13:01:52 UTC