RE: TT and subtitling from Glenn A. Adams on 2003-01-31 (public-tt@w3.org from January 2003)

From: Glenn A. Adams <glenn@xfsi.com>
Date: Fri, 31 Jan 2003 13:37:11 -0500
To: <Johnb@screen.subtitling.com>
Cc: <public-tt@w3.org>
Message-ID: <7249D02C4D2DFD4D80F2E040E8CAF37C01FA9F@longxuyen.xfsi.com>

Let me back up a bit. I guess I am having trouble understanding
exactly what you mean by "on-air" vs "off-air" times. So, perhaps
you can educate me a bit about this distinction.

As for access unit vs presentation unit, I would tend to use the
former when talking about the coded representation and its delivery
and buffering, and the latter when talking about its decoded form,
whether waiting to be presented or currently being presented.

Regarding DTS vs PTS, I would probably expect the DTS to be implied
in the context of TT, and that only an equivalent of PTS be specified
or otherwise computable. I can see an implementation of a streaming
TT decoder might want to dynamically compute a DTS based upon its
ability to decode and compute a presentation unit from an access unit.
For example, some access unit might embed font outline data that
requires rasterization. Different decoders may have different
performance profiles with regard to their ability to perform this
rasterization.
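
As a rough illustration of the point above, here is a minimal Python sketch of a decoder deriving an implied DTS from a specified PTS. Nothing in it comes from any TT draft: the names AccessUnit, glyphs_to_rasterize and ms_per_glyph are hypothetical, and the linear cost model is only an assumption standing in for a decoder's real performance profile.

```python
import math
from dataclasses import dataclass


@dataclass
class AccessUnit:
    pts_ms: int               # presentation time stamp, in milliseconds
    glyphs_to_rasterize: int  # hypothetical measure of embedded outline data


def implied_dts_ms(unit: AccessUnit, ms_per_glyph: float) -> int:
    """Latest time at which decoding must start to meet the PTS.

    ms_per_glyph is an assumed, decoder-specific rasterization cost.
    """
    decode_cost_ms = math.ceil(unit.glyphs_to_rasterize * ms_per_glyph)
    return max(0, unit.pts_ms - decode_cost_ms)


unit = AccessUnit(pts_ms=10_000, glyphs_to_rasterize=40)
fast_decoder_dts = implied_dts_ms(unit, ms_per_glyph=0.5)
slow_decoder_dts = implied_dts_ms(unit, ms_per_glyph=2.0)
```

The same access unit yields an earlier implied DTS on the slower decoder, which is why the DTS might be left to the implementation rather than carried in the stream.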

-----Original Message-----
From: Johnb@screen.subtitling.com [mailto:Johnb@screen.subtitling.com]
Sent: Friday, January 31, 2003 1:07 PM
To: Glenn A. Adams
Cc: public-tt@w3.org
Subject: RE: TT and subtitling

I wrote:
My current personal view is that TT should define a streamable file format consisting of self-contained access units:

Each access unit should reference a preferably orthogonal timing element that supports, at minimum, an on-air time and optionally an off-air time, where timing is either relative or absolute. (Relative timing would require the timing element to include a reference to the previous access unit, and to the next access unit in order to support trick play / reverse play.) The ability to group 'access units' together into a composite group is also desirable (e.g. words into lines, lines into subtitles). Display style should be external to the 'access unit', and the 'access unit' should allow the inclusion of a content definition (e.g. speaker, audio description...). A facility for defining additional supplementary information (e.g. authors, creation dates) should be provided. Guidelines for the streaming of the format should be developed.

Glenn A. Adams wrote:

Could you elaborate on how you see "on-air" vs. "off-air" time
being expressed?

Both on-air and off-air timings would have the characteristic of being **both** expressed as timings relative to the previous and next 'presentation units' (see later comment!). This would be necessary if bi-directional playout were required in the authoring context. Alternatively, both on-air and off-air timings could be expressed as absolute timings (i.e. with respect to another timebase within or external to the 'stream'). I see this as an attribute of the timing for a presentation unit (absolute or relative timing). The off-air timing could be optional, in which case the 'presentation unit' remains until replaced. There is, I feel, a need to support multiple compositions of 'presentation units', since a TT display might consist of several 'regions' (and I don't intend to imply location here) of text that might be independently derived from the screen, some of which use over-writing and some of which are self-timed, etc. Sorry if this is a bit vague...
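
One possible reading of this timing model, for forward playout within a single region, can be sketched in Python as follows. The PresentationUnit class, its field names and the resolve function are all hypothetical. The sketch also assumes that a relative time is an offset from the previous unit's resolved on-air time, which is only one of several ways the relative case could be defined.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class PresentationUnit:
    text: str
    on_air: float                    # seconds
    off_air: Optional[float] = None  # seconds; None means "until replaced"
    relative: bool = False           # True: offsets from previous unit's on-air


def resolve(units: List[PresentationUnit]) -> List[Tuple[str, float, Optional[float]]]:
    """Return (text, absolute on-air, absolute off-air) for one region."""
    starts: List[float] = []
    ends: List[Optional[float]] = []
    previous_on = 0.0
    for unit in units:
        # Assumption: relative times are measured from the previous on-air time.
        base = previous_on if unit.relative else 0.0
        on = base + unit.on_air
        off = None if unit.off_air is None else base + unit.off_air
        starts.append(on)
        ends.append(off)
        previous_on = on

    resolved = []
    for i, unit in enumerate(units):
        off = ends[i]
        # No off-air time: the unit stays up until the next one replaces it.
        if off is None and i + 1 < len(units):
            off = starts[i + 1]
        resolved.append((unit.text, starts[i], off))
    return resolved


timeline = resolve([
    PresentationUnit("A", on_air=10.0, off_air=12.0),
    PresentationUnit("B", on_air=3.0, relative=True),
    PresentationUnit("C", on_air=2.5, off_air=4.0, relative=True),
])
```

Here "B" has no off-air time, so it is cleared when "C" goes on air. Reverse playout would need the mirror-image offsets to the next unit, which this sketch does not attempt.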

If I may draw from MPEG terminology, in that context
there are two kinds of timestamps: DTS (decoding time stamp) and
PTS (presentation time stamp). They are separated in MPEG because
it is necessary to stage decoding prior to presentation, and also
because order of delivery and decoding of access units may be
different than order of presentation of presentation units.

I am not aware of any use of DTS within DVB subtitling.

An "access unit" is defined by MPEG-2 Systems (ISO/IEC 13818-1) as:

"A coded representation of a presentation unit. In the case of audio,
an access unit is the coded representation of an audio frame. In the
case of video, an access unit includes all the coded data for a picture,
and any stuffing that follows it, up to but not including the start of
the next access unit..."

In contrast, a "presentation unit" is defined as:

"A decoded audio access unit or a decoded picture."

I find these terms to be very useful in discussing streaming media,
and I would think they can be simply extended to describe timed text
data as well.

So in effect an 'access unit' is a coded 'presentation unit'. In my previous paragraph, my intended meaning would perhaps be clearer if I had used the term 'presentation unit'?

regards

John Birch

The views and opinions expressed are the author's own and do not necessarily reflect the views and opinions of Screen Subtitling Systems Limited.

Received on Friday, 31 January 2003 13:37:14 UTC