Format Requirements for Text Audio Descriptions (was Re: HTML5 TF from my team)

On Tue, Apr 27, 2010 at 4:52 AM, John Foliot <jfoliot@stanford.edu> wrote:
> Hironobu Takagi wrote:
>>
>> Finally, one of my team members will officially
>> join the HTML5 Accessibility Task Force.
>> Masatomo has been working on the audio description
>> project, especially on the experiments in Japan
>> and in the US with WGBH.
>> He can be the bridge between the open source authoring
>> tool work and aiBrowser on Eclipse.org.
>> He is now checking resources on the Web.
>> If you have any suggestions for our involvement (beyond
>> the mailing list), please let us know.
>> We are looking forward to working with you.
>
>
> Hiro, this is great news! Hello and welcome to Masatomo. (For those who are
> unaware or do not remember, Hiro presented IBM Research – Tokyo's work on
> descriptive audio using synthesized voice at the Face-to-Face here at
> Stanford last November; he and his team were also at the CSUN conference
> in March in San Diego. It is - IMHO - wicked cool! A Word Doc can be found
> here:
> http://www.letsgoexpo.com/utilities/File/viewfile.cfm?LCID=4091&eID=80000218
> and perhaps, Hiro, you could point us to web-based [HTML] resources too?)
>
>
>
> Masatomo, you might want to start by reviewing the draft specifications
> that are currently under discussion:
>
>        http://www.w3.org/WAI/PF/HTML/wiki/Media_MultitrackAPI
> and
>        http://www.w3.org/WAI/PF/HTML/wiki/Media_TextAssociations.
>
>
> Silvia Pfeiffer recently wrote a blog post that is more easily readable as
> an introduction, but, as she notes, it is not as technically accurate as
> the draft specifications. It can be found at
> http://blog.gingertech.net/2010/04/11/introducing-media-accessibilit-into-html5-media/.
>
> There has also been a fair bit of discussion recently about choosing one
> or more appropriate time-stamp formats to be referenced in the HTML5
> Specification/Standard - this discussion is very much up in the air at
> this time.
>
>
> As well, while not officially 'W3C', the WHATWG has started collecting
> examples of time-aligned text displays (captions, subtitles, chapter
> markers, etc.) and is extrapolating requirements from these in its wiki
> at:
>
>        http://wiki.whatwg.org/wiki/Timed_tracks
>        http://wiki.whatwg.org/wiki/Use_cases_for_timed_tracks_rendered_over_video_by_the_UA
>
>        http://wiki.whatwg.org/wiki/Use_cases_for_API-level_access_to_timed_tracks
>
>
> (I believe screen captures, etc. of your work with descriptive text would
> be relevant here!)
>


Let me chime in here, since this is right now particularly relevant to
your work with textual audio descriptions.

You will find at http://wiki.whatwg.org/wiki/Timed_tracks several
mentions of "text audio descriptions".

The assumption on that page is that textual audio descriptions require
no more than the following information in a format (I sketch an
example below):
* start time
* end time
* text
* possibly the voice to choose for reading it back
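
To make that assumption concrete, here is a minimal sketch (in
TypeScript) of what such a description cue could look like as a data
structure. The field names and the helper function are purely
illustrative assumptions on my part and are not taken from any draft
specification.

    // Minimal sketch of a textual audio description cue, assuming only
    // the four fields listed above; names are illustrative, not from a spec.
    interface DescriptionCue {
      startTime: number;   // seconds from the start of the media resource
      endTime: number;     // seconds; the description should finish by this time
      text: string;        // the description to be read out via speech synthesis
      voice?: string;      // optional hint for which synthesised voice to use
    }

    // Example: two cues for a hypothetical video.
    const cues: DescriptionCue[] = [
      { startTime: 12.0, endTime: 15.5, text: "A woman enters the room.", voice: "female" },
      { startTime: 42.3, endTime: 45.0, text: "The title card reads: Tokyo, 2009." },
    ];

    // A user agent or assistive tool could hand a cue's text to a speech
    // synthesis engine while the playback time falls inside the cue's interval.
    function cueActiveAt(cue: DescriptionCue, currentTime: number): boolean {
      return currentTime >= cue.startTime && currentTime < cue.endTime;
    }

Times are given in seconds here purely for illustration; as John notes,
the choice of time-stamp format is still very much under discussion.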

Are there any other requirements that you have come across in your
work with textual audio descriptions? What do the files that you are
using as input to your speech synthesis system for audio descriptions
look like? Do they have any special fields that would need to be taken
care of in a standardised storage format for textual audio
descriptions?

Cheers,
Silvia.

Received on Monday, 26 April 2010 22:42:41 UTC