Timed text markup requirements

At BBC R&D we're developing new techniques for preparing subtitles/captions,
and the need for a timed text markup arose early on in this work. We're
using an XML-based format of our own design to carry not only timing
data but also the other information we need to produce subtitles from timed
text.

Our requirements, roughly speaking, fall into three categories: 

a) timing data and other information about the original text that goes into
the subtitles (there's nothing here that is specific to the final subtitles
that are produced).

b) timing and other data defining the subtitles that are created from the
text (of which there might be many variants)

c) other data that link the subtitles to the video

To elaborate, some of the items in category (a) are:
1) the name of each speaker in the programme
2) the default text colour assigned to each speaker
3) the text, marked with scenes and speakers' names
4) the timing information for each word, i.e. start and end times
5) any changes of text colour for a speaker from their default (should it
be necessary to change the text colour temporarily to ensure the subtitle
remains unambiguous)
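
As a rough illustration of items 1 to 5, here is a minimal sketch in Python
rather than XML. All class and field names are hypothetical and are not
taken from the BBC format; times are assumed to be in seconds.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Speaker:
    name: str             # item 1: speaker name
    default_colour: str   # item 2: default text colour


@dataclass
class TimedWord:
    text: str                              # item 3: the text itself
    start: float                           # item 4: start time (assumed seconds)
    end: float                             # item 4: end time (assumed seconds)
    speaker: str                           # item 3: speaker marking
    scene: int                             # item 3: scene marking
    colour_override: Optional[str] = None  # item 5: temporary colour change


def effective_colour(word: TimedWord, speakers: Dict[str, Speaker]) -> str:
    """Return the override colour if one is set, else the speaker's default."""
    if word.colour_override is not None:
        return word.colour_override
    return speakers[word.speaker].default_colour


speakers = {
    "ANNA": Speaker("ANNA", "white"),
    "BEN": Speaker("BEN", "yellow"),
}
words: List[TimedWord] = [
    TimedWord("Hello", 1.0, 1.4, "ANNA", 1),
    TimedWord("there", 1.4, 1.8, "ANNA", 1, colour_override="cyan"),
    TimedWord("Hi", 2.0, 2.2, "BEN", 1),
]
colours = [effective_colour(w, speakers) for w in words]
```

Keeping the override on the word rather than on the speaker means the
default colour is stored once and only the exceptions are recorded.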

In category (b) we have:
6) the words that are in each subtitle, line by line
7) the in- and out-time for each subtitle
8) the subtitle type and position
9) foreground and background colours, possibly changing word by word
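
Items 6 to 9 could be pictured along these lines. This is only a sketch
under assumptions: the names, the "block" type label and the position
string are invented for illustration and do not come from any existing
specification.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StyledWord:
    text: str
    foreground: str = "white"   # item 9: may change word by word
    background: str = "black"   # item 9: may change word by word


@dataclass
class Subtitle:
    lines: List[List[StyledWord]]     # item 6: words, line by line
    in_time: float                    # item 7: in-time (assumed seconds)
    out_time: float                   # item 7: out-time (assumed seconds)
    kind: str = "block"               # item 8: hypothetical type label
    position: str = "bottom-centre"   # item 8: hypothetical position label

    def text(self) -> str:
        """Join the words of each line with spaces and the lines with newlines."""
        return "\n".join(" ".join(w.text for w in line) for line in self.lines)

    def duration(self) -> float:
        """Time on screen, in the same units as in_time and out_time."""
        return self.out_time - self.in_time


sub = Subtitle(
    lines=[
        [StyledWord("Hello"), StyledWord("there,")],
        [StyledWord("Ben.", foreground="yellow")],
    ],
    in_time=1.0,
    out_time=3.5,
)
```

Several such Subtitle lists could be derived from the same source text,
one for each variant.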

...and in (c):
10) data linking to the video, for example, frame rate, timing reference
point (this could be simply the start timecode of the video but there are
one or two other timing issues here, especially with compressed bitstreams)
11) timings of shot changes in the video. I don't know about the US but in
the UK, subtitle in/out times are anchored to nearby shot changes so we
need those timings too if we are to produce subtitles automatically from
timed text.
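
To show why items 10 and 11 matter, here is a small sketch of relating a
timecode to the start of the video and pulling a subtitle time onto a
nearby shot change. It assumes 25 frames per second, non-drop-frame
timecode and a 12-frame tolerance; the function names and the tolerance
are illustrative assumptions, not an actual subtitling rule.

```python
from typing import List


def timecode_to_frames(tc: str, fps: int = 25) -> int:
    """Convert 'HH:MM:SS:FF' (non-drop-frame) to an absolute frame count."""
    hh, mm, ss, ff = (int(part) for part in tc.split(":"))
    return ((hh * 60 + mm) * 60 + ss) * fps + ff


def frames_from_start(tc: str, start_tc: str, fps: int = 25) -> int:
    """Frame offset of tc relative to the video's start timecode (item 10)."""
    return timecode_to_frames(tc, fps) - timecode_to_frames(start_tc, fps)


def snap_to_shot_change(frame: int, shot_changes: List[int],
                        tolerance: int = 12) -> int:
    """Move frame to the nearest shot change if within tolerance (item 11)."""
    if not shot_changes:
        return frame
    nearest = min(shot_changes, key=lambda s: abs(s - frame))
    return nearest if abs(nearest - frame) <= tolerance else frame


# Shot changes as frame offsets from the start of the video (made-up values).
shot_changes = [0, 250, 610]

in_frame = frames_from_start("10:00:09:20", "10:00:00:00")
snapped_in = snap_to_shot_change(in_frame, shot_changes)
snapped_out = snap_to_shot_change(400, shot_changes)
```

Here the in-time lies 5 frames before a shot change and is moved onto it,
while frame 400 is too far from any shot change and is left alone.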

The markup specification that we are using also includes other data but
perhaps that's getting more specific to our application. However, the point
is that for the production of subtitles/captions we need many other entries
beyond the timing data for the subtitles themselves. I don't know if any
combination of existing standards can achieve this.

On the question of standards that may be relevant, here are three:

EBU Tech. 3264-E "Specification of the EBU subtitling data exchange format"
This is the subtitle exchange format defined by the EBU (European
Broadcasting Union). As mentioned already on the list, work at
http://lithpc17.epfl.ch/stlml/ produced an XML markup for this format, but
that doesn't give the flexibility that we need with timed text.

"Digital Video Broadcast: Subtitle File Transfer Format" from the European
Telecommunications Standards Institute (may still only be in draft form).
This describes the file format for transferring files containing DVB
subtitles between preparation and transmission. It's an evolution of the
EBU 3264 file format and is currently being standardised.

EIA-608
Already mentioned on the list. This defines the line-21 transmission format
but it doesn't define a storage format for the captions themselves. As I
understand it, the various file formats used to store line-21 captions are
proprietary and closely guarded! 


Our own requirements are focussed more on the authoring of
subtitles/captions than on the markup of finished caption text.
Although this brings in more requirements, a standardised format for timed
text from which the different delivery formats (teletext/line-21/DVD etc.)
can be produced would be much more useful.


David Kirby
--
David Kirby
Project Manager
BBC Research and Development      
Kingswood Warren                  Tel: +44 1737 839623
Tadworth, Surrey.                 Fax: +44 1737 839665
KT20 6NP, UK.                     email: david.kirby@rd.bbc.co.uk

Received on Friday, 1 February 2002 09:35:49 UTC