RE: TT and subtitling

From: Glenn A. Adams <glenn@xfsi.com>
Date: Fri, 31 Jan 2003 11:00:06 -0500
Message-ID: <7249D02C4D2DFD4D80F2E040E8CAF37C01FA9A@longxuyen.xfsi.com>
To: <Johnb@screen.subtitling.com>, <public-tt@w3.org>
See inline.

	-----Original Message-----
	From: Johnb@screen.subtitling.com [mailto:Johnb@screen.subtitling.com] 
	Sent: Friday, January 31, 2003 7:14 AM
	To: public-tt@w3.org
	Subject: TT and subtitling

	A more subtle problem occurs in DVB subtitling (STREAMING, SYNCHRONIZED). Common practice for DVB subtitling is to use file playout for subtitles, with rendering to DVB bitmap just prior to transmission. In DVB subtitling, the subtitle is transmitted ahead of presentation time by as much as 5 seconds (due to limitations on bandwidth allocation to subtitle data). Unfortunately, it is uncommon for broadcasters (specifically re-distributors using subtitling for localisation) to know when an ad break is about to occur, as these are often triggered by cues embedded in the incoming TV program. Consequently a subtitle may be on the wire when an advert break occurs. This can lead to a nasty effect where a subtitle from the interrupted program hangs over the start of an advert. This is very undesirable from the advertiser's point of view, and from the broadcaster's, since revenue is lost for spoiled adverts. There is no real **effective** mechanism to resolve this problem. 

	Is this a problem in the DVB subtitle format or is this merely
	an operational constraint based on broadcaster unwillingness to
	assign sufficient bandwidth, and thus force pre-delivery? 

	        >I would see no reason not to be able to support both duration and overwrite 
	        >approaches though, it should be simple enough. 

	While for subtitling purposes a duration based approach solves certain problems, it does not solve all problems caused by interruption of the stream or change of intent. It is certainly preferable to an overwriting approach. 

	Could you provide more detail of why overwriting is preferred? 

	Jason Terando wrote: 

	        >I'm just joining in, so excuse me for any non-sequiturs, redundant points or simplistic questions. 

	I've already set a low point for these I fear :-)  
	Enough apologetics already, we're all humble folk here... :-) 
	        >Some devices/applications (consumers?) are going to use formatting characteristics that other consumers won't.  

	        >Closed captioning (at least EIA-608) doesn't really deal with the concepts of fonts, unlike subtitling and most "Internet media" file formats.

	In general, support for fonts in subtitling is limited. Our proprietary standard only provides for two fonts within a 'program' subtitle file. Further the actual fonts used for display are severely restricted by readability issues and in some cases regulatory issues. Fonts and WYSIWYG remain a major headache in the subtitling arena. 

	I would personally like to see us provide some sort of support
	for downloadable fonts. I see the lack of such support to be
	a barrier to internationalization as well as service for minority
	communities in markets where the default fonts would otherwise
	not support their needs. On the other hand, I admit that requiring
	a font rasterizer in every device would be a significant burden
	for some non-trivial set of devices, and, therefore, we can't
	mandate in all cases. At least we should provide support through
	the content for either inline font representations or for reference
	to out-of-line font representations. 

	        >As well as defining an XML schema, it might be useful to establish rules/behaviors that consumers would abide by if they come across

	        >information in a TT file/stream that they don't process 

	Indeed, the lack of default rules is a problem with the various implementations of DVB decoders :-(. 

	Glenn A. Adams wrote: 

	        >I do have some doubts about whether we should try to support...."data tunneling" mechanisms.... 

	Teletext magazine transmissions consist of timed text where different text instances are repeated at intervals (page refresh). In addition, other features such as clock and fasttext require periodic transmission, much the same as V-Chip ratings. TT might be a useful format for the definition and exchange of Teletext magazine content. 
	Are you saying you support adding data tunneling features to TT?
	If so, please attempt to justify this based on our charter. 

	Erik Hodge wrote: 

	        >The *wire format* I think is what we probably don't want to define for TT, but we certainly should think about it while designing the *file format*.

	IMHO a very low bandwidth *wire format* would be essential for the adoption of TT in the emission context of the subtitling and closed captioning arena. However I suspect that TT in this arena would predominate in the authoring and storage contexts not the emission context, not least because of the presence of existing standards. 

	I agree that we are not going to be replacing EIA 608 or 708 or
	other similar emission formats in other regions any time soon. On
	the other hand, I believe there is significant opportunity to
	directly use whatever format we establish in a variety of other
	distribution contexts, such as DVD and other types of newer
	media, whether stored or transmitted via ether or wire. 

	Neil Smith wrote: 
	        >The possibility to indicate various formatting styles should be possible per line of captions - emphasis, alignment / positioning, 

	        >even colour may have their place 

	I think the 'access unit' should be considerably smaller than a 'line'. For some subtitling styles (snake and teletext add-on) the ability to time individual words is necessary. Further, the ability to change style (colour/font) on a word-by-word basis is sometimes needed. I personally would see **no** use for per-character 'access units', but others may! Ideally the scope/size of the 'access unit' should not be defined. Snake and teletext add-on subtitling would imply a further requirement on TT: that of association of 'access units' into larger structures. 

	I think it is clear that we will need to support styling and
	perhaps even timing aspects at individual character level; however,
	I consider this somewhat different from access unit (AU). For me, access
	unit is the smallest granularity for seeking into a stream for the
	purpose of resynchronizing with sync masters or for performing REW, FF,
	etc. That doesn't mean that we couldn't also have finer grained
	timing of elements within an AU.  
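
A minimal sketch of the distinction drawn above, assuming access units are kept sorted by start time: seeking lands on an access unit boundary, while finer-grained timing lives inside the unit. The names `AccessUnit`, `TimedSpan`, `seek` and `visible_text` are hypothetical and do not come from any TT draft.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class TimedSpan:
    offset: float  # seconds relative to the start of the enclosing access unit
    text: str


@dataclass
class AccessUnit:
    start: float  # seconds on the media timeline; the seek granularity
    spans: list = field(default_factory=list)


def seek(units, t):
    """Index of the last access unit starting at or before t, or None."""
    starts = [u.start for u in units]  # assumes units are sorted by start
    i = bisect_right(starts, t) - 1
    return i if i >= 0 else None


def visible_text(unit, t):
    """Texts of the spans inside one unit that have begun by time t."""
    return [s.text for s in unit.spans if unit.start + s.offset <= t]


units = [
    AccessUnit(0.0, [TimedSpan(0.0, "Hello"), TimedSpan(0.4, "world")]),
    AccessUnit(5.0, [TimedSpan(0.0, "Second"), TimedSpan(0.5, "subtitle")]),
    AccessUnit(9.0, [TimedSpan(0.0, "Third")]),
]

i = seek(units, 5.2)                  # resynchronize at the unit boundary
words = visible_text(units[i], 5.2)   # then apply the finer timing within it
```

Here the seek resolves to the unit that starts at 5.0 seconds, and only the word timing inside that unit decides what is shown at 5.2 seconds.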

	My current personal view is that TT should define a streamable file format consisting of self contained access units: 

	Each access unit should reference a preferably orthogonal timing element that supports at minimum an on air time and optionally an off air time, where timing is either relative or absolute (relative timing would require the timing element to include a reference to the previous access unit, and to the next one to support trick play / reverse play). The ability to group 'access units' together into a composite group is also desirable (e.g. words into lines, lines into subtitles). Display style should be external to the 'access unit' and the 'access unit' should allow the inclusion of a content definition (e.g. speaker, audio description...). A facility for defining additional supplementary information, e.g. authors, creation dates etc., should be provided. Guidelines for the streaming of the format should be developed. 
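
One possible reading of this proposal, as a rough sketch only: each access unit carries a separate timing element, an external style reference, a content kind and a group name. All field names are hypothetical, and the rule that relative times are offsets from the previous unit's on air time is an assumption, since the message does not say what relative timing is measured against.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Timing:
    on_air: float                    # required
    off_air: Optional[float] = None  # optional
    relative: bool = False           # assumption: offsets from previous on_air


@dataclass
class AccessUnit:
    uid: str
    text: str
    timing: Timing
    style_ref: Optional[str] = None     # display style lives outside the unit
    content_kind: Optional[str] = None  # e.g. "speaker", "audio description"
    group: Optional[str] = None         # e.g. words grouped into a line


def resolve(units):
    """Map each uid to absolute (on_air, off_air), walking in stream order."""
    resolved = {}
    prev_on = 0.0
    for u in units:
        base = prev_on if u.timing.relative else 0.0
        on = base + u.timing.on_air
        off = None if u.timing.off_air is None else base + u.timing.off_air
        resolved[u.uid] = (on, off)
        prev_on = on
    return resolved


units = [
    AccessUnit("w1", "Hello", Timing(10.0, 12.0), group="line1"),
    AccessUnit("w2", "world", Timing(0.5, 2.0, relative=True), group="line1"),
    AccessUnit("w3", "Next", Timing(1.0, relative=True), group="line2"),
]
times = resolve(units)
```

With relative timing a unit cannot be resolved without its predecessor, which is why the proposal asks for explicit references to neighbouring units when trick play or reverse play must be supported.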

	Could you elaborate on how you see "on-air" vs. "off-air" time
	being expressed? If I may draw from MPEG terminology, in that context
	there are two kinds of timestamps: DTS (decoding time stamp) and
	PTS (presentation time stamp). They are separated in MPEG because
	it is necessary to stage decoding prior to presentation, and also
	because order of delivery and decoding of access units may be
	different than order of presentation of presentation units.
	An "access unit" is defined by MPEG-2 Systems (ISO/IEC 13818-1) as:
	"A coded representation of a presentation unit. In the case of audio,
	an access unit is the coded representation of an audio frame. In the
	case of video, an access unit includes all the coded data for a picture,
	and any stuffing that follows it, up to but not including the start of
	the next access unit..."

	In contrast, a "presentation unit" is defined as:
	"A decoded audio access unit or a decoded picture."

	I find these terms to be very useful in discussing streaming media,
	and I would think they can be simply extended to describe timed text
	data as well.
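
To illustrate the DTS/PTS distinction described above, here is a small sketch in which units are decoded in DTS order but presented in PTS order. The `Unit` class and the timestamp values are made up for illustration and are not taken from MPEG-2 Systems or any TT draft.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    name: str
    dts: float  # decoding time stamp: when the access unit is decoded
    pts: float  # presentation time stamp: when the presentation unit is shown


def decode_order(units):
    """Names of the units in the order they would be decoded."""
    return [u.name for u in sorted(units, key=lambda u: u.dts)]


def presentation_order(units):
    """Names of the units in the order they would be presented."""
    return [u.name for u in sorted(units, key=lambda u: u.pts)]


# Illustrative values only: unit "B" is decoded last but shown second.
units = [
    Unit("A", dts=0.0, pts=1.0),
    Unit("C", dts=1.0, pts=3.0),
    Unit("B", dts=2.0, pts=2.0),
]

decoded = decode_order(units)
presented = presentation_order(units)
staged_in_time = all(u.dts <= u.pts for u in units)
```

For timed text the same pair of timestamps could describe a subtitle that is delivered and decoded some seconds ahead of the moment it appears on screen.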


	        John Birch 

			The views and opinions expressed are the author's own and do not necessarily reflect the views and opinions of Screen Subtitling Systems Limited.
Received on Friday, 31 January 2003 11:00:10 UTC
