- From: Markku T. Hakkinen <hakkinen@dinf.ne.jp>
- Date: Fri, 01 Nov 2002 08:25:27 +0900
- To: www-tt-tf@w3.org
Dear Thierry Michel,

At the Japanese Society for Rehabilitation of Persons with Disabilities (JSRPD), a member of both W3C and the Daisy Consortium, we are producing content using the Daisy 2.02 recommendation, which incorporates SMIL to synchronize text and audio for Digital Talking Books. At present, timed text is an important technology for us as we look to incorporate multimedia and animation in Daisy-style publications for mainstream applications such as public emergency information systems.

This author was a member of the SMIL 1.0 Working Group and was involved in the selection of SMIL 1.0 for text/audio synchronization in the Daisy standard. The support within SMIL 1.0 for synchronization of multiple media types was important to us, but the lack of a common way to implement text synchronization/timed text led Daisy to create its own methods within the framework of SMIL.

In our Daisy applications, which we consider a timed text implementation, synchronization events reference a segment in an audio file (e.g., MP3) and the corresponding text, contained within a structural element in a source document (which may be a book, a script or transcript, lyrics, a play, etc.). Daisy playback systems that display both text and audio generally use the DOM to retrieve the text referenced by the SMIL text element (fileref#ID); the sketches at the end of this message illustrate both the markup and the retrieval step. The playback system then has the choice of highlighting the text element in context (in a browser-style display), or using the plain or rich text contained in the element in a variety of ways, including presentation in a separate window (large typeface/high contrast display), presentation on a refreshable Braille device, or rendering via alternate media such as synthetic speech.

The JSRPD-sponsored, Daisy-based AMIS (Adaptive Multimedia Information System) is planning to add video presentation capabilities in the near future, and we are interested in seeing a standardized timed-text solution from W3C. The goal with AMIS is to provide rich, multi-modality access to accessible multimedia content. This will include synchronized full-motion video with captioning extracted from a full-text transcript. Caption text can be viewed in the context of the full transcript (effectively a Daisy book with video), in an overlaid caption window, in a separate window or on a physical display using large typefaces, or via a refreshable Braille display. Playback in all modes is fully synchronized, and playback controls can be used to pause the presentation and review text, which is especially important in the Braille modality. We also expect to add timescale modification to slow down presentations, which is particularly important for persons with learning disabilities or those unfamiliar with the presentation language.

Though we believe our present Daisy-based model can serve as an effective base for our multimedia development, our ideal goal is to see our SMIL-based content supported on multiple players and platforms. The way to achieve this is to continue to base our work on W3C open standards. Where this goal falls short today is timed text. We strongly support development of a W3C timed text recommendation. At present, we see an incompatible mix of implementations for text captioning in commercial players, some using SMIL and some not. Ideally, we would like to author multimedia with synchronized text in one format and not have to concern ourselves with vendor-specific implementations. A single, open standard supported by multiple vendors would be the best choice.
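For concreteness, here is a minimal sketch of the kind of Daisy 2.02 SMIL synchronization fragment described above; the file names, id values, and clip times are invented for illustration:

    <par endsync="last">
      <text src="content.html#para0001" id="txt0001"/>
      <audio src="narration.mp3" clip-begin="npt=0.000s"
             clip-end="npt=4.250s" id="aud0001"/>
    </par>

Each par element is one synchronization event: the audio clip plays while the text element referenced by fileref#ID is highlighted or otherwise rendered.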
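The retrieval step a playback system performs against that reference can be reduced, as a rough sketch, to the following (the source document and id value are hypothetical; real players work against their own DOM implementations):

    from xml.dom.minidom import parseString

    # Hypothetical source document containing the referenced structural element.
    SOURCE_DOCUMENT = """<html><body>
      <p id="para0001">It was a dark and stormy night.</p>
    </body></html>"""

    def text_for_fragment(doc_xml, fragment_id):
        # Locate the element whose id attribute matches the fragment
        # identifier and return its directly contained character data.
        doc = parseString(doc_xml)
        for element in doc.getElementsByTagName("*"):
            if element.getAttribute("id") == fragment_id:
                return "".join(node.data for node in element.childNodes
                               if node.nodeType == node.TEXT_NODE)
        raise KeyError(fragment_id)

    # The src attribute of the SMIL text element splits into file and fragment.
    fileref, frag = "content.html#para0001".split("#", 1)
    print(text_for_fragment(SOURCE_DOCUMENT, frag))

The string retrieved this way can then be highlighted in context, shown in a large-typeface window, sent to a refreshable Braille device, or rendered as synthetic speech, exactly as described above.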
In view of our experience with Daisy, and our interest in a timed text standard, JSRPD plans to participate actively in the TTWG.

Markku Hakkinen
Research and Technology Advisor
Japanese Society for Rehabilitation of Persons with Disabilities
hakkinen@dinf.ne.jp