- From: Michael A. Dolan <miked@tbt.com>
- Date: Fri, 09 Aug 2002 09:54:31 -0700
- To: www-tt-tf@w3.org
I am admittedly not a SMIL expert, so if I have made assumptions here about SMIL that are not accurate, someone please correct them.

First, it is my understanding that SMIL is intended to be used to construct a multimedia *presentation* and to control the timing of that presentation within the user agent. The example provided by Mr. Ramirez is a fine representation of what SMIL does well.

But what I thought TT was (and the problem I thought was described in the requirements) is a language for defining and authoring a text stream synchronized to some timebase (either internal or external). TT is not a presentation system. TT is an authoring specification: one creates text with sufficient synchronization elements that, at some later point in time and space, it can either be presented alone or combined with other related essence elements for presentation. Thus, in my view, a TT file is an input to a SMIL presentation system, not SMIL itself.

One cannot presume that the other related essence elements will be distributed along with the TT file; in some cases they likely will not be. If party A creates a video/audio presentation with the video and audio authored separately, then party B can create another audio track and combine it later with the original video stream, for example to retarget the presentation to another language. The same capability is needed for TT: a third party must be able to author text against some timeline and combine it with the related essence elements at some point in the future, possibly even through a separate distribution channel, arriving at the user agent separately. This scenario, common in television captioning for example, requires that the timeline be embedded in the TT element.

In the case where there is a single author for everything, and everything is neatly bundled into a single package for all time, SMIL could be used for this purpose.
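To make the third-party authoring scenario concrete, here is a minimal sketch of a stand-alone timed-text file carrying its own embedded timeline. The element and attribute names here are purely hypothetical, since no TT syntax has been agreed:

```xml
<!-- Hypothetical syntax: element and attribute names are illustrative only. -->
<!-- The timebase lives inside the TT file itself, so a third party can     -->
<!-- author captions against the video's timecode and ship them through a   -->
<!-- separate distribution channel, arriving at the user agent separately.  -->
<tt timebase="smpte" framerate="30">
  <text begin="01:00:05:00" end="01:00:08:15">First caption</text>
  <text begin="01:00:09:00" end="01:00:12:00">Second caption</text>
</tt>
```

A presentation system (SMIL or otherwise) could then align this stream with the video by matching timecodes, with no per-caption markup in the presentation document.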
But this is not the general case, and I can't see how SMIL supports this looser composition of the elements. (Or if it can, could someone elaborate?)

Further, SMIL seems to presume that the text essence of the composition is inline. This is like requiring that all the image pixels be defined in the SMIL syntax rather than being able to refer to an external file. I would have expected to see SMIL syntax of the form:

  <t:img.....
  <t:audio.....
  <t:text.....

where the "text" element of the composition is, in fact, the TT language syntax being contemplated by this group. Perhaps one could construct a series of static (HTML) text files using the above, but that is clumsy and requires potentially hundreds of separate text files for a modest-length presentation. The same problem is true of the images. SMIL is perhaps OK for short presentations, but not for a 2-hour one with new text and a new image every 4 seconds: a lengthy presentation (say 2-3 hours) would require thousands of separate files and tens of thousands of lines of SMIL code. The file-count problem could be fixed by compositing the text into a single file and using MNG with fragment URI syntax or something, I suppose, but it is a general problem, not specific to text.

Moreover, an architecture that requires large amounts of SMIL code to perform only synchronization seems problematic. Other systems solve this with implicit synchronization, using the timelines in the essence elements themselves. SMIL seems to discourage the use of timelines in the essence files, preferring to set the timebase itself. SMIL 1, as I recall, could not handle push video and audio streams for this reason (is this better in SMIL 2?). Minimally, it is still not obvious how to composite multiple streams, each with its own timeline. In contrast, this is common practice in all existing video and audio authoring systems.
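For contrast, here is a sketch of the composition form described above, where the text element refers to an external timed-text file rather than carrying inline text. Again, this is hypothetical markup (file names and elements are illustrative), not actual SMIL:

```xml
<!-- Hypothetical markup, not valid SMIL. The point is that a single    -->
<!-- external "captions.tt" file, carrying its own timeline, replaces   -->
<!-- hundreds of per-caption text elements or per-caption text files.   -->
<par>
  <video src="movie.mpg"/>
  <audio src="soundtrack.mp3"/>
  <text  src="captions.tt"/>
</par>
```

The presentation document then stays the same size whether the presentation runs 90 seconds or 3 hours; all the per-caption timing lives in the TT file.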
That is, given a video stream with a timeline and an audio stream with a timeline, other systems synchronize them implicitly as a matter of course, without explicit controls for every frame. The same is needed for TT: it needs its own timeline, and the presentation system needs to be able to make sense of that timeline relative to the other components.

So, in summary, there are several issues:

1. TT needs to be a peer authoring format to video/image/audio, not embedded in the presentation language;
2. TT needs its own timeline, to allow third-party authoring and simpler compositing; and
3. A presentation system is presumed that can composite these separate elements (which may or may not be SMIL).

Can some of the SMIL XML syntax be re-purposed for defining the TT language? It seems to me that it can. Are SMIL and its semantics the answer to the TT problem? I sure don't see how. But perhaps SMIL 2 is richer than I understand; given the above discussion, perhaps someone more knowledgeable can construct an example using SMIL that meets the needs described here and show how it scales to 3 hours?

Regards,

Mike

At 07:50 AM 8/9/2002 -0700, Jose Ramirez wrote:
>Hi All,
>
>It's a little too quiet here; this should change that :)
>
>A short piece demonstrating how well timed text is handled in the
>HTML+SMIL profile; it preloads about 1 MB and is 1:30.00 long (IE 6 required).
>
>http://www.geocities.com/ramirez_j2001/test3/poem/html_smil_example.html
>
>Hopefully a simple Timed Text profile that could fit well with
>the SMIL 2 profile player could be created.
>
>Features that are quite useful:
>- begin and end attributes
>- fade transitions (as the above example shows, fading the text allows
>  it to blend with a presentation; otherwise the text would just jump
>  onto the screen and be a distraction)
>- transparent background
>- some HTML elements: p, h1, h2..., br
>- text align: left, right, center (as in the above example, the text
>  didn't need an exact x/y position, and align center provided an
>  easy solution)
>- absolute x/y positioning
>
>The most important aspect, I think, is to keep version 1 as basic as
>possible, so it can be implemented soon and there can finally be
>multimedia documents made with non-proprietary components.
>
>Jose Ramirez
>proprietary = temporary

-----------------------------------------------------
Michael A. Dolan
TerraByte Technology        (619)445-9070
PO Box 1673, Alpine, CA 91903 USA
FAX: (208)545-6564
URL: http://www.tbt.com
Received on Friday, 9 August 2002 13:02:20 UTC