Would altsound attribute be useful?

Thinking about a recent article, I began to wonder whether there ought
to be a mechanism to allow an actual voice recording to be used instead
of text to speech conversion to provide audio equivalents for a web 
page.

There is already an alt attribute for providing alternative text for
purely visual elements, so I wondered whether there ought to be an
altsound attribute to provide an audio equivalent for visual and
textual elements.
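As a purely hypothetical sketch (the attribute is my invention and the
file names are only illustrative), I am imagining markup along the
lines of:

<img src="chart.png" alt="Quarterly sales chart"
     altsound="chart-summary.wav">

<p altsound="welcome.wav">Welcome to the community centre.</p>

where the user agent, not the author, decides when and how the
recording is played.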

Because audio and visual channels can be used in parallel, this could
be used to augment a basically visual model, which might benefit
non-literate users.  It would require that browsers couple some
visual pointing convention with the screen-reading function.

There are technical problems in providing digitised speech, mainly the
large bandwidth required; however, used in an accessible way, purely
visual access or text-to-speech could be selected as alternatives by
the user.

Of course, such a mechanism may well be used more often for
non-accessibility reasons, and such abuse could well cause browser
capability to drift away from accessibility objectives.  On the other
hand, without the mass-market mis-applications, there may be no
incentive to implement the features at all.

The other technical issue is integration with visual pointing and providing
cue and review capabilities.

Given the precedent of alt and longdesc (not that the latter provides
much of a precedent), I think this ought to be an HTML attribute rather
than a style sheet property (whereas background music really ought to
be in style sheets, and this mechanism would provide a way of violating
that principle).

One could argue that longdesc could be overloaded with this function,
but I think the semantics are somewhat different: longdesc is an
on-demand function.  To do this overloading, the browser would have to
send an Accept header with a higher priority for audio than for HTML.
Actually, if longdesc is implemented sensibly, one could already respond
with audio (noting that browsers generally don't allow users to tune
content negotiation quality values).
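For example, a browser acting for a user who prefers recorded speech
might, when following a longdesc, send something like (the media types
and q values here are only illustrative):

GET /chart-description HTTP/1.1
Host: www.example.org
Accept: audio/x-wav;q=1.0, audio/mpeg;q=1.0, text/html;q=0.5

and a suitably configured server could then return the recording where
one exists, falling back to the HTML description otherwise.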

The problem here is that very few web authors are aware of content
negotiation (it requires configuring something other than HTML) and
many people on small budgets (charities, clubs, etc.) don't feel they
can afford proper access to a web server (remember the recent Hull
site that appeared to use a cheap service).
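For what it's worth, the server side need not be complicated; if I
remember the Apache mod_negotiation syntax correctly, a type map along
these lines (file names illustrative) would let a single longdesc URL
serve either form:

URI: chart-description

URI: chart-description.html
Content-type: text/html; qs=0.5

URI: chart-description.wav
Content-type: audio/x-wav; qs=1.0

but that still presupposes access to server configuration that many
small sites simply don't have.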

CSS2 has properties for providing background sounds and pre- and post-
sounds for an element.  The pre- and post-sounds imply linear
progression, but:

body {play-during: url(.....);}

would seem to make sense as a standards-conforming alternative to
bgsound, even though the properties are described as applying only to
aural browsers, and I doubt that any current graphical browser
implements them (I can't get a string match in the Mozilla 0.8.1
sources).  I suspect most mainstream browser designers skipped the
aural style sheets section!
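For completeness, the pre- and post-sound case in CSS2 uses the
cue-before and cue-after properties; something like the following
(sound files illustrative) would, in an aural rendering, bracket
external links with short audio cues:

a.external {
  cue-before: url(link-start.wav);
  cue-after: url(link-end.wav);
}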

By coupling visual pointing to sound context, as suggested above, the
play-during property could be applied to elements other than body.
Note that I don't think this is a job for author control (except for
proof of concept); the pointing convention should be standard for the
browser, not learned for each page - in particular, mouseover
triggering should be under user or browser control, not implemented by
the author.

Used locally in this way, such properties could provide a non-verbal
signal to back up text, but good accessibility would still require text
or recorded speech, and I think that should be identified in the main
document, not the style sheet.

I seem to remember that phonetic spellings are included in later style
sheet standards, and that might constitute a counter-argument to my
view that the spoken sound should be referenced from the main document.
