RE: Comments from JSRPD

Dear Thierry Michel,

At the Japanese Society for Rehabilitation of Persons with Disabilities
(JSRPD), a member of both W3C and the Daisy Consortium, we are producing
content using the Daisy 2.02 recommendation, which incorporates SMIL to
synchronize text and audio for Digital Talking Books. At present, timed
text is an important technology for us as we look to incorporate
multimedia and animation in Daisy-style publications for mainstream
applications such as public emergency information systems.

This author was a member of the SMIL 1.0 Working Group and was involved
in the selection of SMIL 1.0 for text-audio synchronization in the Daisy
standard. The support within SMIL 1.0 for synchronization of multiple
media types was important to us, but the lack of a common way to
implement text synchronization/timed text led Daisy to create its own
methods within the framework of SMIL.

In our Daisy applications, which we consider a timed text
implementation, synchronization events reference a segment in an audio
file (e.g., MP3) and the corresponding text, contained within a
structural element in a source document (which may be a book, a
script or transcript, lyrics, a play, etc.).
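
To illustrate the pattern (the file names, IDs, and timings below are
ours, for illustration only, and simplified from what a full Daisy 2.02
SMIL file contains):

    <!-- one synchronization event: a sentence of text paired
         with the audio clip that narrates it -->
    <par endsync="last">
      <text src="content.html#sent0001"/>
      <audio src="narration.mp3"
             clip-begin="npt=0.000s" clip-end="npt=3.250s"/>
    </par>

Each such par is one synchronization point: the player presents the
referenced text while the specified span of the audio file plays.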

Daisy playback systems that display both text and audio generally use
the DOM to retrieve the text referenced by the SMIL text element
(fileref#ID). The playback system has the choice of highlighting the
text element in context (in a browser-style display), or utilizing
either the plain or rich text contained in the element in a variety of
ways, including presentation in a separate window (large typeface/high
contrast display), presentation on a refreshable Braille device, or
rendering via alternate media such as synthetic speech.
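
For example (the markup is again illustrative), the src value shown
earlier would resolve to a fragment of the source document such as:

    <!-- the structural element the SMIL text element points at;
         a player can locate it by ID (e.g., via getElementById),
         highlight it in place, re-render it in a large-type
         window, or pass its text to a Braille or speech renderer -->
    <p id="sent0001">In case of an earthquake, move away from
    windows and take cover under a sturdy desk.</p>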

The JSRPD-sponsored, Daisy-based AMIS (Adaptive Multimedia Information
System) is planning to add video presentation capabilities in the near
future, and we are interested in seeing a standardized timed-text
solution from W3C. The goal with AMIS is to be able to present rich,
multi-modality access to accessible multimedia content. This will
include synchronized full motion video with captioning extracted from a
full text transcript. Caption text can be viewed either in context of
the full transcript (effectively a Daisy book with video), in an
overlaid caption window, in a separate window or physical display using
large typefaces, or via a refreshable Braille display. Playback in all
modes is fully synchronized, and playback controls can be used to pause
the presentation and review text, which is especially important in the
Braille modality. We also expect to add timescale modification to slow
down presentations, which is particularly important for persons with
learning disabilities or those unfamiliar with the presentation
language.
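
In SMIL terms, the video case might take a shape like the sketch below.
This is our assumption about how such markup could look, not a committed
AMIS design; the point is that a single text reference can drive the
caption overlay, the in-context transcript highlight, and the Braille
display alike:

    <!-- one caption interval: a video clip paired with the
         matching segment of the transcript -->
    <par>
      <video src="lecture.mpg"
             clip-begin="npt=12.0s" clip-end="npt=15.5s"/>
      <text src="transcript.html#cap0042"/>
    </par>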

Though we believe our present Daisy-based model can serve as an
effective base for our multimedia development, our ideal goal is to see
our SMIL-based content supported on multiple players and platforms. The
way to achieve this is to continue to base our work on W3C open
standards. The one area where this goal currently falls short is timed
text.

We strongly support development of a W3C timed text recommendation. At
present, we see an incompatible mix of text captioning implementations
in the commercial players, some using SMIL, some not. Ideally, we would
like to be able to author multimedia with synchronized text in one
format and not have to concern ourselves with vendor-specific
implementations. A single, open standard supported by multiple vendors
would be the best choice.

In view of our experience with Daisy and our interest in a timed text
standard, JSRPD plans to participate actively in the TTWG.

Markku Hakkinen
Research and Technology Advisor
Japanese Society for Rehabilitation of Persons with Disabilities
hakkinen@dinf.ne.jp

Received on Thursday, 31 October 2002 18:26:38 UTC