TT Content Buffering and Timing Scenarios

I would like to open a discussion on the contexts in which timed text
(TT) content may be used and how those contexts affect the buffering
and timing of this content. If this was discussed in detail in the
task force, please let me know. Otherwise, perhaps we can discuss it a
bit here, since I see some need for clarity (at least in my own mind)
regarding the various high-level usage scenarios.

I envision TT being used in a number of different contexts, including:

* authoring systems, for the purpose of having a common interchange
  format amongst these systems;

* storage systems, for the purpose of having a common static media
  format, e.g., on DVD, VHS, CD, etc.;

* traditional distribution systems, such as those that deliver V/A/D
  (video + audio + data) or A/D (audio + data) content from content
  sources to terrestrial, cable, or satellite station operators;

* traditional emission systems, such as from terrestrial, cable, or
  satellite operators to end-user receivers and terminal devices;

* online distribution systems, such as HTTP or RTP over the Internet.

Among these various contexts, there appear to be a number of distinct
buffering and timing models:

(1) NON-STREAMING, SYNCHRONOUS: timing information is explicit, and not
    (necessarily) intended to be synchronized with any other media;

    Example: RealText, QuickText, etc., where it is the only media
    track or is asynchronous with respect to other tracks, and where
    it is pre-delivered as a single resource.

(2) NON-STREAMING, SYNCHRONIZED: timing information is explicit, and is
    explicitly linked to a time base in another media;

    Example: RealText, QuickText, etc., where its time line is
    explicitly synchronized with that of another media track (e.g.,
    video or audio), and where it is pre-delivered as a single
    resource.

(3) STREAMING, SYNCHRONOUS: timing information is explicit, and not
    (necessarily) intended to be synchronized with any other media;

    Example: use of an MPEG PES (packetized elementary stream) with
    presentation time stamps to deliver synchronized data, where each
    data access unit is associated with a presentation time stamp
    locked to a program clock reference carried in that stream.

(4) STREAMING, SYNCHRONIZED: timing information is explicit, and is
    explicitly linked to a time base in another media;

    Example: use of an MPEG PES (packetized elementary stream) with
    presentation time stamps to deliver synchronized data, where each
    data access unit is associated with a presentation time stamp
    locked to a program clock reference carried elsewhere.

(5) STREAMING, ISOCHRONOUS: timing information is implicit,
    and is determined by the transport or the envelope;

    Example: use of SMPTE 292M with ancillary data to carry closed
    captioning.

    Example: use of the NTSC/PAL VBI to carry EIA-608 captions, or of
    MPEG-2 video elementary stream user data to carry EIA-708 captions.
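To make the distinctions above concrete, here is a minimal sketch (in
Python, with all names hypothetical and not drawn from any TT
specification) of how a receiver might assign presentation times under
each family of models: explicit times on the content's own time line
(synchronous), explicit times mapped onto another track's time base
(synchronized), and times implied entirely by the transport
(isochronous):

```python
# Illustrative sketch only; cue structure and function names are
# hypothetical, not part of any TT specification.

def present_synchronous(cues):
    # Models (1)/(3): each cue carries an explicit time interpreted
    # on the text track's own time line.
    return [(cue["begin"], cue["text"]) for cue in cues]

def present_synchronized(cues, master_offset):
    # Models (2)/(4): explicit cue times are interpreted relative to
    # the time base of another media track (modeled here as a simple
    # offset onto the master clock).
    return [(cue["begin"] + master_offset, cue["text"]) for cue in cues]

def present_isochronous(packets, packet_interval):
    # Model (5): no explicit times; presentation time is implied by
    # each packet's position in the transport.
    return [(i * packet_interval, text) for i, text in enumerate(packets)]

cues = [{"begin": 0.0, "text": "Hello"}, {"begin": 2.5, "text": "World"}]
print(present_synchronous(cues))         # times as authored
print(present_synchronized(cues, 10.0))  # times shifted onto master clock
print(present_isochronous(["Hello", "World"], 2.5))
```

The point of the sketch is that only the isochronous case loses its
timing when the transport is removed; the other models carry timing in
the content itself.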

So, I have some questions:

1. Is the above characterization of streaming vs. non-streaming and
   synchronous vs. synchronized vs. isochronous correct and complete
   for the various usage contexts? [Note that I am only looking at
   buffering and timing aspects, not at others such as semantics and
   style.]

2. Can (and should) the models described above be simplified and/or
   generalized?

3. Given that we are committed to defining an XML-based syntax for
   TT, how should we deal with the need to stream TT content? Are there
   any precedents in W3C specs for streaming XML content? (e.g., does
   XMLP or SOAP address these issues?) I am aware of the BiM format
   used with MPEG-7; however, there may be essential IPR claims from
   Expway on this format. Should we consider this or attempt to design
   something like it?
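On question 3, one observation is that XML need not be parsed as a
single complete resource; generic XML parsers can consume a document
incrementally as chunks arrive. The following sketch shows this using
Python's standard-library pull parser; the element and attribute names
(tt, cue, begin) are hypothetical, chosen only for illustration:

```python
# Sketch: incremental parsing of an XML stream arriving in arbitrary
# chunks. Element/attribute names are hypothetical, not from any spec.
from xml.etree.ElementTree import XMLPullParser

chunks = [
    "<tt><cue begin='0.0'>Hel",                # markup may split anywhere
    "lo</cue><cue begin='2.5'>World</cue>",
    "</tt>",
]

parser = XMLPullParser(events=("end",))
cues = []
for chunk in chunks:
    parser.feed(chunk)
    # Each cue becomes available as soon as its end tag is seen,
    # without waiting for the rest of the document.
    for event, elem in parser.read_events():
        if elem.tag == "cue":
            cues.append((float(elem.get("begin")), elem.text))

print(cues)
```

This does not address compact binary encodings such as BiM, only the
narrower point that a textual XML syntax is not inherently
incompatible with incremental delivery.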

Well, I hope this will spark a useful discussion, at least one that I
can learn from.

Regards,
Glenn

Received on Wednesday, 29 January 2003 19:55:40 UTC