RE: Why use time as a unit of measurement? (was: Proposal 0.0) from Johnb@screen.subtitling.com on 2003-02-17 (public-tt@w3.org from February 2003)

From: <Johnb@screen.subtitling.com>
Date: Mon, 17 Feb 2003 15:50:16 -0000
To: singer@apple.com, public-tt@w3.org
Message-ID: <11E58A66B922D511AFB600A0244A722E093EDF@NTMAIL>

Dave, lurkers, et al.

All my comments solely in the context of broadcast subtitling :-)

Dave outlined two scenarios:

> Scenario A:  a timed-text stream is treated the same as an audio 
> stream or a video stream;  it has a natural (intrinsic) duration, and 
> can be layed up in parallel with other such streams.  In this case a 
> SMIL 'par' construct works fine, just like it does for audio with 
> video, and duration-based timing makes sense also for each 'element' 
> in the text stream.

For broadcast subtitling - this is the presentation scenario. 
I.e. This is what happens at the decoder (be it CC, Teletext, Open, DVD or
DVB).
 
> Scenario B:  text elements must appear when triggered by an external 
> stimulus, such as the correct time-code appearing.  In this case, the 
> text elements can NOT be treated as concatenated elements with 
> duration;  each is, in some sense, a little presentation in its own 
> right.  The right SMIL construct here is a 'par' with the video of a 
> sequence of little text constructs, each of which has a start 
> 'trigger' that is defined.

For broadcast subtitling - this is the authoring | insertion scenario. 
In authoring you want the freedom to manipulate the text constructs
independently. 
Once authoring is complete the TT is kept separate from the audio visual.
It is married back together just prior to transmission, but this union has
to take account of
any changes to the AV material since the TT was created. 
AV could be shortened (e.g. censorship) or have other material inserted
to extend it's duration (e.g. Ads, news).

> In both cases, I believe that the synchronization 'layup' can and 
> should be represented at the level that provides that (e.g. SMIL).

Wholeheartedly agree: BUT can SMIL represent such an 
externally synchronised timeline (scenarion b) in an appropriate manner?

1 hour of broadcast = 1000 subtitles (ROT). Each subtitle approx 10 words
(culturally dependent!).
For snake or add-on each word is timed individually. So 20,000 timings (in
cue and out cue).

> If the 'seq' construct in timed text (assuming there was such a 
> thing) had a 'trigger' construct which relied on the video time-code, 
> there is a layering problem;  that 'video' is not talked about in the 
> pure timed-text document.

Yes, it's not the 'video' that needs inclusion - it's the marked up timeline
of the video that needs inclusion.

> However, it seems that if we use XML it might be possible to 
> construct a document which uses both the SMIL and timed text 
> namespaces, and which combines the timed text element definitions and 
> the SMIL layup in one XML document.  This would have other advantages 
> too (it's already a pain the number of loads/stream-opens one has to 
> do with SMIL, and it ought to be possible to embed the text in the 
> SMIL to avoid yet more for small text captions etc.).

Certain subtitle styles require control of the timing of the display of
single words. 
(Ideally char by char control should be possible).

I think this is getting close to solving my requirements :-)

regards 
John Birch

The views and opinions expressed are the author's own and do not necessarily
reflect the views and opinions of Screen Subtitling Systems Limited.

Received on Monday, 17 February 2003 10:40:49 UTC