Re: [CSS21] WAI Issue 1: Relegation of Aural CSS to an informative appendix & the Deprecation of the aural media type [DRAFT]

raman - thanks for the perspective and background on ACSS and the aural
versus the speech media type -- do you think that if aural CSS were
capable of controlling all aural events, from onLoad sounds to other
interactive or passive audio, there would be more buy-in from developers?

that's what i really would like to establish as the standard -- control
over aural events through a stylesheet which also provides a normative
means of adjusting speech properties, which would help not only with
accessibility but with internationalization -- one need look no further
than charles chen's explanation of why he developed CLiCk, Speak
( -- for the benefit of a user who can understand a natural language
when spoken, but who cannot decipher the glyphs used to visually
represent that language, or whose machine doesn't support those glyphs
but does have a speech-engine (predicated, of course, on the proper
markup having been used to indicate a natural language switch)...

personally, i'd like to use an aural stylesheet to provide a "verbose",
"terse", or "earconic" (audio cues only) aural canvas, and i can
conceive of authors and site managers being attracted to overlaying an
aural canvas appropriately themed to the season (and to what the site
is attempting to sell) by changing or switching a single stylesheet to
overlay an aural template -- "back to school", "sun and fun at the
beach", "winter wonderland", and the like -- for the whole site...

but the most important thing would be enabling user control over
aural events, and the use of the native accessibility API and the
operating system's user preferences, so that a user who cannot hear
can receive an appropriate equivalent alert of a type with which that
user is accustomed to interacting, such as "Show Sounds" or "Sound Sentry"
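
under the cascade, a user stylesheet could also silence the aural
canvas outright and defer to those operating-system alerts -- a
minimal sketch, assuming the user agent honours user stylesheets:

   /* user stylesheet: mute all aural rendering, relying instead on
      OS-level visual alerts such as "Show Sounds" / "Sound Sentry" */
   @media aural {
     * { volume: silent !important;
         cue: none !important;
         play-during: none !important; }
   }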

PEDESTRIAN, n. The variable (and audible) part of the roadway for
an automobile.         --  Ambrose Bierce, The Devil's Dictionary
Gregory J. Rosmaita: and
UBATS: United Blind Advocates for Talking Signs:

---------- Original Message -----------
From: "T.V Raman" <>
Sent: Tue, 11 Dec 2007 09:36:12 -0800
Subject: [CSS21] WAI Issue 1: Relegation of Aural CSS to an informative 
appendix & the Deprecation of the aural media type [DRAFT]

> Gregory, Here is some "historical" perspective on the
> speech/aural split  -- this is mostly from memory.
> Sometime in the 2003 timeframe, Dave Raggett and I were looking
> to synchronize SSML and Aural CSS in the following sense:
> Rendering rules expressed via Aural CSS when applied to XML
> markup should be able to produce SSML that delivers the desired
> aural presentation.
> In going through that exercise, we hit a number of
> discrepancies, most of which came down to "SSML is mostly about
> speech" whereas Aural CSS  dealt with much more than speech.
> Also, given the lack of implementation of Aural CSS within
> browsers, and given that to an extent Aural CSS had been
> dismissed by mainstream browsers as "that's for speech output, we
> don't do that", we felt that it was worthwhile splitting Aural CSS
> into two modules, speech and aural, where @media speech should be
> aligned fully with SSML.
> To what extent the current drafts reflect that desire is
> something I've not had the time to check.
> Gregory J. Rosmaita writes:
>  > [Reviewer's Note: this post refers to the Candidate Recommendation
>  > draft of CSS 2.1, comments upon which are due by 20 December 2007]
>  > 
>  > Given the following use case:
>  > 
>  > Aural rendering is used to provide supplemental contextual and
>  > semantic markers for an individual with either limited vision, or a
>  > limited view-port, such as that obtained by using a screen-magnifier
>  > application, which displays strings of text in isolated viewports,
>  > with earcons (purely aural cues) set to "on", but without speech
>  > output.  Such a user uses aural cues, provided by such extant
>  > mechanisms as:
>  > 
>  > 
>  > 
>  > to supplement that user's constrained point of view.  Note that
>  > this use case includes those who fall under the purview of such
>  > organizations as Recording for the Blind and Dyslexic (
>  > 
>  > Note that some users will benefit from viewing portions of the
>  > screen using a screen-magnifier and aural cues; but that there are
>  > also those who not only need isolated portions of the visual canvas
>  > rendered for them, but whose understanding of, and ability to
>  > interact with, the document benefits greatly from supplemental
>  > synthesized speech;
>  > 
>  > How, then, can speech be separated from audio?  The Style WG should
>  > be wary of the separation of speech and pure aural rendering rules,
>  > as there is one modality being addressed: the aural canvas, whether
>  > that includes speech-synthesis or purely earconic sounds.
>  > 
>  > The question, therefore, is this:  What is the point of changing
>  > the media type from aural to speech?  Speech synthesizers are aural
>  > renderers, but they rely on a third-party application (optimally, a
>  > DOM-aware user agent) in order to obtain the content, flow, etc. of
>  > the speech-output.  If a user agent supports speech, as does
>  > FireVox, it also needs to support the purely aural (earconic)
>  > portions of the media rule; speech synthesizers are not user
>  > agents, they are more akin to browser helper objects (BHOs) than
>  > they are to user agents per se.
>  > 
>  > SUMMATION:
>  > 
>  > The deprecation of the aural media type in favor of the speech
>  > media type is unacceptable, as there are valid use cases where an
>  > individual benefits from supplemental earcons that sound while
>  > viewing the visual canvas through a screen-magnifier type
>  > view-port, without speech output, but with support for a pure audio
>  > (non-speech) overlay; likewise, there is the use case of an
>  > individual who benefits from supplemental speech, as well as a
>  > limited viewport and aural orientational and contextual cues.
>  > 
>  > Why is it necessary for Aural CSS 2.1 to remain normative?  The
>  > aural cascade will enable an author to offer visitors a choice
>  > between "verbose", "terse", and "earconic" overlays.  SSML may be
>  > where the money and resources are currently devoted, but Aural CSS
>  > is far superior for speech-output dependent computer users (that
>  > is, the average end user) because things aren't hard-coded, but are
>  > subject to user over-rides.  It's obviously a lot easier to
>  > wizardize a "modify this site's aural styling" option, which would
>  > allow the end user the final say over what is spoken and how, than
>  > to edit an SSML document's source.
>  > 
>  > An added benefit of retaining the purely aural portions of ACSS is
>  > that, if both speech and purely aural styling are addressed in the
>  > same stylesheet, it reduces the burden on the author, allows for
>  > end-user override, and increases the probability of the
>  > implementation of both forms of painting to the aural canvas.
>  > 
>  > PROPOSED RESOLUTION:
>  > 
>  > 1. The PF WG requests that the editors and Working Group
>  >    de-deprecate the "aural" media type and deprecate the "speech"
>  >    media type
>  > 
>  > 2. The PF WG requests that Appendix A be renamed to
>  >    Chapter/Section 19 and made normative
>  > 
>  > ----------------------------------------------------------------
>  > CONSERVATIVE, n.  A statesman who is enamored of existing evils,
>  > as distinguished from the Liberal, who wishes to replace them
>  > with others.         -- Ambrose Bierce, _The Devil's Dictionary_
>  > ----------------------------------------------------------------
>  > Gregory J. Rosmaita,
>  > Camera Obscura:
>  > ----------------------------------------------------------------
> -- 
> Best Regards,
> --raman
> Title:  Research Scientist      
> Email:
> WWW:
> Google: tv+raman 
> GTalk:,
> PGP:
------- End of Original Message -------

Received on Wednesday, 12 December 2007 04:24:10 UTC