Re: Format Requirements for Text Audio Descriptions (was Re: HTML5 TF from my team) from Masatomo Kobayashi on 2010-05-06 (public-html-a11y@w3.org from May 2010)

From: Masatomo Kobayashi <MSTM@jp.ibm.com>
Date: Fri, 7 May 2010 01:37:18 +0900
To: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Cc: Hironobu Takagi <TAKAGIH@jp.ibm.com>, John Foliot <jfoliot@stanford.edu>, Geoff Freed <geoff_freed@wgbh.org>, public-html-a11y@w3.org
Message-ID: <OF27D6F125.BEABA30D-ON4925771B.00204D2F-4925771B.005B5162@jp.ibm.com>

My comments on extended captions and Speech CSS are inline below.

Silvia Pfeiffer <silviapfeiffer1@gmail.com> wrote on 2010/05/05 09:11:08:

> Thinking about it in more depth, we may even want to use such an
> attribute on captions and subtitles. It would indicate what will
> happen if caption elements overlap into the next caption text cue, ie.
> just display both (which would be the default) or clip the cue.
> Pausing the video probably doesn't make sense for caption text.

I had not thought deeply about captions until now.
I agree that we could use that attribute to handle overlapping captions.
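
To make the two behaviours concrete, here is a rough sketch of how a
player might apply such an attribute. The policy names "show" and "clip"
and the function name are hypothetical; nothing like this has been
specified yet.

```python
def apply_overlap_policy(cues, policy="show"):
    """Resolve overlapping cues.

    cues: list of (begin, end, text) tuples sorted by begin time.
    policy: hypothetical attribute value, "show" (default) or "clip".
    """
    if policy == "show":
        # Overlapping cues are simply displayed together, unchanged.
        return list(cues)
    if policy != "clip":
        raise ValueError("unknown overlap policy: " + policy)

    clipped = []
    for i, (begin, end, text) in enumerate(cues):
        if i + 1 < len(cues):
            # Cut the cue short where the next cue begins.
            end = min(end, cues[i + 1][0])
        clipped.append((begin, end, text))
    return clipped
```

With "clip", a cue running from 0 to 5 seconds followed by one starting
at 4 seconds would be shortened to end at 4 seconds.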

The "extended caption" idea in Geoff's comment is also interesting.
Unlike extended audio descriptions, wouldn't each "extended" caption
element need a boolean flag?
The duration of a caption must be specified explicitly by the author, so
we cannot set identical begin and end times to mark it as "extended",
whereas the duration of an audio description is actually determined by
the TTS engine.
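
A minimal sketch of the difference, assuming an extended caption pauses
the video for its authored duration while an audio description pauses it
only for the speech that does not fit in its time slot. The Cue class,
the "extended" field, and both function names are made up for
illustration.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    begin: float            # seconds on the video timeline
    end: float              # seconds on the video timeline
    text: str
    extended: bool = False  # hypothetical per-cue flag for captions


def caption_pause(cue):
    """A caption needs the explicit flag: its duration is authored."""
    return cue.end - cue.begin if cue.extended else 0.0


def description_pause(cue, spoken_seconds):
    """An audio description pauses the video for whatever the TTS
    output (spoken_seconds) exceeds the cue's own time slot, so
    begin == end means the video pauses for the whole utterance."""
    return max(0.0, spoken_seconds - (cue.end - cue.begin))
```

Here spoken_seconds would have to come from the TTS engine at playback
time, which is why the author cannot know it in advance.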

> That would be one way to support it. Do you know if Web browsers
> support SSML natively?
> 
> Also, there is Speech CSS (see http://www.w3.org/TR/css3-speech/),
> which seems to provide for the same functionality. Have you
> experimented with Speech CSS? Do you know if TTS engines support it?

According to their documentation, Opera supports Speech CSS and a small
subset of SSML, and Fire Vox also supports Speech CSS.
Unfortunately, neither worked well on my PC, so I have not actually used
those features.

If Speech CSS is chosen, the open question is which format (instead of
srt) to use to mark up the external text resource that the CSS would
style.

We might need to survey SSML and Speech CSS support in Web browsers,
screen readers, and TTS engines to explore the possibility of rich
textual audio descriptions.

Regards,
Masatomo

Received on Thursday, 6 May 2010 16:37:56 UTC