RE: Proposal 0.0

> I wrote...
 
> > The point I am trying to make wrt SMIL timing models is:  How do you
> > synchronise the 'presentation' of the text with an external often
> > discontinuous timebase?

> It's pretty trivial.  Let's say you want to show the word "foo" for 5
> seconds, 10 seconds from the start of a commercial.  Still assuming
> Proposal 0.0, it would look something like:
 
> <seq>
>    <tt:p begin="10s" dur="5s">foo</tt:p>
> </seq>
 
> If the commercial is played at 01:03:28.720 (we don't need to know this
> time a priori), then the text appears at 01:03:38.720 and ends at
> 01:03:43.720.

OK, I can see that you're not getting what I'm trying to explain :-( 

For the purposes of the following we'll assume that a 'tape' is used for
video storage.
Forgive me if this appears to be a dumbed down explanation.

Currently subtitling works as follows.
Broadcaster decides to show a program X - he prepares a master 'tape' TapeX.
TapeX is sent to an external subtitling agency who prepare a subtitle 'file'
FileX.
FileX is indexed against the timecode that is stored as VBI data on TapeX.
So unlike SMIL, the media are separate: subtitles in one, video and audio
in another.
Broadcaster plays TapeX on a tape machine on a number of different occasions
(i.e. at different wall clock times).
On each occasion the tape is played, the timecode off the tape will be
identical in sequence.
The subtitle inserter (having had the correct FileX selected by an
automation system) continuously reads the timecode from the input video
signal generated by the tape machine playing TapeX. When an input timecode
matches a timecode in FileX it starts inserting the matching subtitle, by
modifying the video signal. When the timecode on the input matches the end
time for that subtitle - it stops showing the subtitle. The inserter is not
playing the video media - it is watching it and responding to it. In SMIL,
the UA is orchestrating all the components of the presentation.
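
As a minimal sketch of the matching described above (not any real inserter's
implementation), the lookup can be written as a pure function of the incoming
timecode. The names tc_to_frames, Subtitle and subtitle_for are hypothetical,
as is the assumption of 25 fps non-drop-frame timecode. It tests whether the
input frame lies between the on air and off air times rather than waiting for
an exact match, which is one possible way to implement it.

```python
from dataclasses import dataclass
from typing import List, Optional

FPS = 25  # assumption: 25 fps, non-drop-frame timecode


def tc_to_frames(tc: str, fps: int = FPS) -> int:
    """Convert an 'HH:MM:SS:FF' timecode into an absolute frame count."""
    hh, mm, ss, ff = (int(part) for part in tc.split(":"))
    return ((hh * 60 + mm) * 60 + ss) * fps + ff


@dataclass
class Subtitle:
    on_air: int   # first frame on which the subtitle is shown
    off_air: int  # first frame on which it is no longer shown
    text: str


def subtitle_for(file_x: List[Subtitle], frame: int) -> Optional[str]:
    """Return the text to insert for this input frame, or None."""
    for sub in file_x:
        if sub.on_air <= frame < sub.off_air:
            return sub.text
    return None


# A one-entry 'FileX', indexed against tape timecode (hypothetical values).
file_x = [Subtitle(tc_to_frames("10:00:10:00"), tc_to_frames("10:00:15:00"), "foo")]
```

The inserter would call subtitle_for once per input frame, using whatever
timecode it has just read from the video signal. It never needs to know the
wall clock time at which the tape is being played.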

Ok - so far you could get away with duration based subtitling - but it is
much more convenient to have the absolute off air time in the subtitle file,
since that is what you are matching against. 

In practice, the broadcaster will want to show adverts. These can occur at
**any time** during the broadcast, and may differ from showing to showing,
but when they occur they can be detected by the subtitle inserter by a
change in the timecode it is reading (by convention adverts are timed with a
different start time to programs; typically programs are 10 hours+, ads
are 1 hour+). The program may or may not be created with specific ad insert
points in mind, regardless - the broadcaster often goes to air with an
advert while a subtitle is still on screen (i.e. it is part way through its
duration). This can occur because advert insert points are often timed to
occur literally just after a significant burst of dialogue (subtitles
generally have a longer duration for presentation than the corresponding
audio). So you need to be able to off air the subtitle **early**.

In your above example you say :
"Let's say you want to show the word "foo" for 5 seconds, 10 seconds from
the start of a commercial."

But the point is that, when the program restarts after an advert, I want to
show the subtitles at the correct point with respect to the timecode on the
program material. Subtitles are frame accurate for lip synching. I can't use
10 seconds after the commercial - how does that relate to where I am in the
program? What if TapeX has been advanced by half a second or two during the
commercial break? This is what timecode on video is for: synchronisation.

By using absolute tape times in the subtitle file - the subtitle inserter
has a simple task. Match timecodes and insert. It doesn't need to be
informed when a commercial occurs - it is resilient to changes in the
timecode in the input signal.
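
To illustrate that resilience, here is a small simulation under stated
assumptions: 25 fps, programs timed from 10 hours, adverts from 1 hour, and
made-up subtitle times. FILE_X, visible and the frame offsets are all
hypothetical. The broadcaster cuts to an advert part way through the first
subtitle, and the tape has moved on by the time the program resumes.

```python
FPS = 25                       # assumption: 25 fps
PROGRAM_START = 10 * 3600 * FPS  # programs timed from 10:00:00:00
AD_START = 1 * 3600 * FPS        # adverts timed from 01:00:00:00

# (on_air, off_air, text) in absolute tape frames; values are made up.
FILE_X = [
    (PROGRAM_START + 100, PROGRAM_START + 200, "first line"),
    (PROGRAM_START + 260, PROGRAM_START + 320, "second line"),
]


def visible(frame):
    """Return the subtitle text for this input frame, or None."""
    for on_air, off_air, text in FILE_X:
        if on_air <= frame < off_air:
            return text
    return None


# Timecodes as read off the input signal, one per frame.
incoming = (
    list(range(PROGRAM_START + 90, PROGRAM_START + 150))     # cut mid-subtitle
    + list(range(AD_START, AD_START + 50))                   # advert
    + list(range(PROGRAM_START + 250, PROGRAM_START + 330))  # program resumes
)

shown = [visible(frame) for frame in incoming]
```

The first subtitle goes off air early, as soon as the advert timecode
appears, and the second comes up at the right frame after the break. The
lookup is never told that a commercial occurred.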

I fear this still doesn't explain adequately why a duration based, relative
timing model is inappropriate for broadcast subtitling... but perhaps we are
getting hung up on broadcast. I can conceive of other situations where the
assumption of 1 sec per sec is invalid. Timed text implies duration - but in
practice I believe timed text is better supported by an event driven model. 

SMIL already adequately covers duration based text presentation.

> > SMIL has wallclock timebase - but that's no use.

> Time is the /only/ thing that's of use, believe me.  You may believe that
> God's Atomic Unit of Time is 1/29.97th of a second, but George Lucas' is
> 1/24th of a second, and someone encoding PAL source for the web might
> consider it to be 1/12.5th of a second.

Sarcasm aside, timebases can always be converted into a single format. The
issue is the resolution of the timebase.
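
A small sketch of what I mean by resolution, assuming a target format of
whole milliseconds (my choice for illustration, not something from Proposal
0.0). The function names are hypothetical. Converting is easy; the question
is whether a frame survives the round trip.

```python
from fractions import Fraction

PAL = Fraction(25)            # 25 frames per second
NTSC = Fraction(30000, 1001)  # approximately 29.97 frames per second


def frame_to_seconds(frame: int, fps: Fraction) -> Fraction:
    """Exact start time of a frame, in seconds."""
    return frame / fps


def to_ms_truncated(frame: int, fps: Fraction) -> int:
    """Start time of a frame in whole milliseconds, truncated."""
    return int(frame_to_seconds(frame, fps) * 1000)


def ms_to_frame(ms: int, fps: Fraction) -> int:
    """Frame that contains the given millisecond time."""
    return int(Fraction(ms, 1000) * fps)


pal_round_trip = ms_to_frame(to_ms_truncated(1, PAL), PAL)
ntsc_round_trip = ms_to_frame(to_ms_truncated(1, NTSC), NTSC)
```

At 25 fps a frame is exactly 40 ms, so frame 1 comes back as frame 1. At
30000/1001 fps a frame is not a whole number of milliseconds, and with
truncation frame 1 comes back as frame 0.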

Anyway, in practice the standards you mention above are actually event
driven. A film may be designed to be played at 1/25th of a second per frame;
played at 1/30th it is faster, but still self consistent, and so is a
stopped frame. What would happen with a subtitle under Proposal 0.0 if the
tape was stopped? In 'real life' the subtitle should remain until the tape
restarts and moves past the off air time. In broadcast the frame is king -
not the time ;-)
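
A rough sketch of the stopped tape case. It compares a frame driven rule with
a naive duration timer that starts when the on air frame is first seen and
then counts real time. That timer is only my assumption about how a purely
duration based presentation might behave, not a statement of what a SMIL UA
would actually do. All names and numbers are hypothetical.

```python
def frame_driven(timecodes, on_air, off_air):
    """Visible whenever the current frame lies in [on_air, off_air)."""
    return [on_air <= tc < off_air for tc in timecodes]


def duration_driven(timecodes, on_air, dur_ticks):
    """Visible for dur_ticks ticks of real time after on_air is first seen."""
    shown = []
    started_at = None
    for tick, tc in enumerate(timecodes):
        if started_at is None and tc >= on_air:
            started_at = tick
        shown.append(started_at is not None and tick - started_at < dur_ticks)
    return shown


# One entry per tick of real time; the tape is stopped on frame 5 for 10 ticks.
tape = [0, 1, 2, 3, 4] + [5] * 10 + [6, 7, 8, 9]

by_frame = frame_driven(tape, on_air=3, off_air=8)
by_duration = duration_driven(tape, on_air=3, dur_ticks=5)
```

With the tape stopped, the frame driven subtitle stays on screen until the
tape restarts and passes the off air frame. The naive timer removes it part
way through the pause.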

regards 
John Birch

The views and opinions expressed are the author's own and do not necessarily
reflect the views and opinions of Screen Subtitling Systems Limited.

Received on Tuesday, 11 February 2003 05:33:38 UTC