W3C home > Mailing lists > Public > w3c-wai-ig@w3.org > January to March 2003

Re: Audio formats

From: Al Gilman <asgilman@iamdigex.net>
Date: Sun, 16 Feb 2003 14:12:10 -0500
Message-Id: <5.1.0.14.2.20030216113338.0259dcb0@pop.iamdigex.net>
To: w3c-wai-ig@w3.org
Cc: "Webmaster@EDD" <web@edd.ca.gov>


>On Tuesday, Feb 11, 2003, at 03:26 Australia/Melbourne, Webmaster@EDD wrote:
>
>>I am wondering about the potential for what Chaals referred to as "dubious
>>benefit" to be gained by a deployment of textual content in a recorded audio
>>format.  I understand the benefit of real-time text-to-speech conversion
>>through the use of screen reader technology.  I believe content that is
>>tagged correctly probably provides the overall best solution to the problem
>>of accessibility for visually impaired end-users, but my customers are
>>asking for recorded audio.
>>
>>My customers believe there is value in providing recorded audio versions of
>>textual content.
>>
>>I would like to know whether or not recorded audio is something I should be
>>looking at as a viable way of improving accessibility of web content.
>>
>>I would like to know whether or not anyone is using recorded audio,
>>especially with the goal of improving accessibility for visually impaired
>>end-users... and I'd like to hear how it went/is going.
>>
>>Thanks
>>sb

Let's take that in reverse order:

>>I would like to know whether or not anyone is using recorded audio,
>>especially with the goal of improving accessibility for visually impaired
>>end-users... and I'd like to hear how it went/is going.

Absolutely.  This is a long-standing practice called talking books.  But
it goes on in parallel with the web.  The audio is distributed in audio
media ranging from cassette tape to CD ROM.

It is useful for print-impaired users more generally than just those with
visual disabilities.  Last I heard RFB&D had more dyslexic subscribers than
blind subscribers.

This is the community that formed the Daisy Consortium to bring their
practices into the digital age.  That group has been careful to frame their
standard as a system of publishing coordinated text and sound.  For the
present time the actual books distributed are just converting over from all
audio with very limited navigation to audio plus a navigation infrastructure
(hypertext table of contents, to the layman) that may just identify major
navigation destinations in the volume.

But for reference volumes, it is essential to provide the full text and
the navigation infrastructure to find an entry rapidly.  One is tempted
to think that your manuals are in this category.

This community looks forward to using the Internet as a distribution channel
*in the future.*

If the University system is contemplating publishing manuals and catalogs on
CD ROM they should definitely consider the talking book standard as the format
to use.  This will give you the 'find it' support to allow the superior audio
of a recording to be a plus and not a resented frivolity.

>>I would like to know whether or not recorded audio is something I should be
>>looking at as a viable way of improving accessibility of web content.

As Charles said, the answer to this question depends on the objectives of
the material that might be recorded.

But generally speaking, for people with visual disabilities, the answer is
probably 'no.' Better to take the money that might be used to pay the
readers to record the texts and pay an editor to review the labels on the
forms, for example.

For people with other print disabilities, who haven't already gone native in
the dialect of their screen reader voices, it could be a genuine plus, if
done right.

For the work you are doing for your customers, a "work process re-engineering"
approach is the right framework for deciding where to apply recorded audio and
where it is not worth the trouble.  In particular this involves looking at
what the students and employees with print disabilities will need to do with
the information so published, and how they can do those tasks in the context
of their laptop and screen reader.

>>My customers believe there is value in providing recorded audio versions of
>>textual content.

We appreciate that you are reluctant to sound insubordinate.  On the other
hand, when there is good news and bad news, you owe them both.  Don't make
it sound like you are telling them what to do.  But don't tell them less
than what you know.  By now you should appreciate that screen reader users
may well greet steps in this direction with hostility.  "Wasted
frivolity, a sop to the ego of the producing activity," or some such
reaction.  Because for someone who is accessing Web content with their
computer where they have installed and generally use a screen reader to
mediate their access to the applications on that computer, the substitution
of site-recorded audio for client-side-synthesized speech is likely to be
more nuisance than boon.  And it is unlikely to improve the go/no-go
accessibility of the content for these users one iota.

Partly it is a matter of the nature of the content.  If I recall correctly,
you first mentioned manuals and forms.  This sort of content biases the
expectations against the sound file gambit.  The Daisy book deals with
novels that you want to drop in and just listen to.  The only navigation
used may be simply to resume where you left off.  And it also deals with
dictionaries and encyclopedias where the text form dominates the value of
the access.  The important part of the task is finding the right entry to
read, not cruising the whole thing.

There are a lot of sample-and-decide cycles that are accomplished in visual
browsing entirely by the user random-accessing the display screen without
the computer being involved or aware of these decision process steps.  The
same "isolate the section I need" process in screen-reader-assisted browsing
use navigation steps closed through the computer: navigation commands that
alter the course of the display.  The absence of persistence in the audio
medium means that the user is making many more discrete navigation commands
(counting one pointing gesture with the mouse as one command) than in visual
interaction.

To the computer, the eyes-free user appears more active in the interaction,
although it is really because much of the action in the case of the visual
user is subliminal to the user and hidden from the computer.  Preserving the
functionality to micro-manage the locus of "you are here" is a precondition
for any improvements in audio verisimilitude being a gain and not a loss.
Naive attempts to turn pages into sound files without internal structure
would  be a setback in access to the information for the kind of information
that is in many web pages.

It sounds as though these manuals could be more like encyclopedias than 
novels.
In a given visit to the resource, the user will only want the full text of a
small part.  The name of the game is isolating the right part fast; not how
natural the voice is that reads the part you need.

Likewise, one possible transaction is cutting a course number from a course
catalog and pasting it in the list-box search function to select that course
in the registration form.

The number of potential courses is too many to listen through a list that
one could review effectively by eye; one will want to use something clipped
from the catalog in the text-entry option in a combo box or the
search-for-option function in a list box.

An audio recording of the reference volume is significantly inferior for
this trasaction, which is what the student needs the information in the
course catalog *for.*  The student would have to carefully record and
re-enter the course number which is bad news for both time to complete and
likelihood of error.

One of the factors that we haven't discussed is that the Web is behind the
CD ROM domain in dealing with such synchronized text and audio.  The W3C has
a Multi-modal Interaction Working Group working on how to express the binding
of the recorded sound into web pages in a way that works well.  But the Web
isn't there yet.  We lack the format smarts to blend the superior sound of a
recorded voice with the user-directed browse of the manual or user-originated
fill-out of the form at the right level of granularity and justInTime play.

Don't get your back up and lecture your customers.  But don't let them go ahead
down the road to an ugly surprise without trying to clue them in.

As Cheryl Trepanier said, "People are adapted."  The people who use screen
readers are adapted to the speech patterns of the synthesizers installed
with their screen reader.  The 'obvious' superiority that the lay person
perceives in the recorded voice over the synthesized voice is generally a
negligible difference in the accessibility of the information.  The dialect-
dependent poetry of Robert Burns or Dylan Thomas are more accessible in the
human-spoken word.  But forms and manuals?  Most likely not.

Your mileage may vary.  You may find a consumer group and an audio format
that yields access gains beyond what I have suggested here.  Just a review
of the best current practice in terms of applying recorded readers to
information access by people with print disabilities is somewhat cautionary
in terms of applicability to the Web.   Talking books are transcriptions of
things designed to be books.  The related media-melding technologies and
design standards for things designed to be websites have yet to be
developed, so far as I know.

Al

At 11:01 PM 2003-02-15, Charles McCathieNevile wrote:

>Without doing a serious assessment of your particular project it is 
>impossible to give an answer I am certain of.
>
>But in general, recording the text as audio is a poor way of improving 
>accessibility for people with visual impairments. Many will not get any 
>benefit at all, many will get minimal benefit. If this is done at the 
>expense of providing a good navigable structure (this can be done in 
>recorded audio but is difficult, and for general accessibility must be 
>available in a non-audio form), clear communication, or other important 
>facets of accessibility, then you will be doing an overall disservice to 
>people with visual impairments.
>
>You also put yourself at risk of doing a very significant disservice if 
>you rely on the recorded audio to convey any information, or if it is not 
>absolutely accurate as a representation of your content. On the other hand 
>you are likely to help people with significant intellectual disabilities, 
>if you have high-quality recorded audio available as they browse the content.
>
>Sites that do this include Mencap - http://www.mencap.org.uk - and Peepo - 
>http://www.peepo.com These sites are not directed at people with visual 
>impairment, but at people with intellectual disabilities.
>
>Can you explain more about why your customers believe there is value in 
>recorded audio? It may be that I am missing something particular to your case.
>
>cheers
>
>Chaals
>
>On Tuesday, Feb 11, 2003, at 03:26 Australia/Melbourne, Webmaster@EDD wrote:
>
>>I am wondering about the potential for what Chaals referred to as "dubious
>>benefit" to be gained by a deployment of textual content in a recorded audio
>>format.  I understand the benefit of real-time text-to-speech conversion
>>through the use of screen reader technology.  I believe content that is
>>tagged correctly probably provides the overall best solution to the problem
>>of accessibility for visually impaired end-users, but my customers are
>>asking for recorded audio.
>>
>>My customers believe there is value in providing recorded audio versions of
>>textual content.
>>
>>I would like to know whether or not recorded audio is something I should be
>>looking at as a viable way of improving accessibility of web content.
>>
>>I would like to know whether or not anyone is using recorded audio,
>>especially with the goal of improving accessibility for visually impaired
>>end-users... and I'd like to hear how it went/is going.
>>
>>Thanks
>>sb
>>
>>
>>
>>>Just using recorded audio and expecting people to listen to it is
>>>probably of dubious benefit - it often interferes with people's speech
>>>technology. Since people need their speech systems running to get as
>>>far as your pages, they are more likely to turn off your audio than
>>>theirs - so you would be doing a lot of expensive recording that your
>>>stated target audience aren't going to appreciate at all.
>>>
>>>Your advice on having decent structure seems to be more valuable in
>>>this case. I would suggest there is little point just recording the
>>>audio unless you have some expectation that the work will be done to
>>>use it in a more advanced audio format provided (and of course
>>>maintained) as an alternative version - a significant undertaking.
>>>
>>>cheers
>>>
>>>Chaals
>>>
>>>On Saturday, Feb 8, 2003, at 05:39 Australia/Melbourne, Madeleine
>>>Rothberg wrote:
>>>
>>>>
>>>>It sounds like you are considering producing audio books. You may be
>>>>interested in the Digital Talking Book specification, which provides
>>>>a way to mark up an audio book to have navigation within it. If the
>>>>audio is combined with the full text of the book, then you have full
>>>>text searching as well as audio playback.
>>>>
>>>>More info from DAISY at:
>>>>http://www.daisy.org
>>>>
>>>>-Madeleine
>>>>
>>>>--
>>>>Madeleine Rothberg
>>>>The CPB/WGBH National Center for Accessible Media
>>>>madeleine_rothberg@wgbh.org
>>>>http://ncam.wgbh.org
>>>>(617) 300-2492
>>>>
>>>>On Friday, February 7, 2003 1:04 PM, Webmaster@EDD <web@edd.ca.gov>
>>>>wrote:
>>>>>My department is working on ways to increase accessibility of our web
>>>>>content.  My advice has stressed the importance of document
>>>>>formatting and
>>>>>tagging that will ensure navigability/usability in conjunction with
>>>>>screen
>>>>>reader browsing software.  I never considered audio files to be a
>>>>>particularly effective format for improving accessibility of content
>>>>>for the
>>>>>visually impaired user.
>>>>>
>>>>>One program are would like to deploy audio versions of their
>>>>>departmental
>>>>>forms and manuals (some of which are 50+ pages in length), with the
>>>>>rationale that visually impaired users can then "listen" to the
>>>>>forms.  I
>>>>>don't consider this to be an effective use of audio technology,
>>>>>however I
>>>>>have also never seen it used in that way.
>>>--
>>>Charles McCathieNevile           charles@sidar.org
>>>Fundación SIDAR                       http://www.sidar.org
>--
>Charles McCathieNevile           charles@sidar.org
>Fundación SIDAR                       http://www.sidar.org
Received on Sunday, 16 February 2003 14:12:16 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Tuesday, 19 July 2011 18:14:08 GMT