[css3-speech] RE: Heads-up: CSS WG plans last call for css3-speech from paul.bagshaw@orange-ftgroup.com on 2011-08-18 (www-style@w3.org from August 2011)

From: <paul.bagshaw@orange-ftgroup.com>
Date: Thu, 18 Aug 2011 11:44:14 +0200
To: <bert@w3.org>, <www-style@w3.org>
Cc: <w3c-voice-wg@w3.org>
Message-ID: <8E09C72DBC577D489F13A71228C0B7BF015EAFC5@ftrdmel0.rd.francetelecom.fr>

Bert,

 

In response to your recent call for comments on the CSS Speech Module, I have made a personal review of the spec. Please note that my comments have not been seen or discussed by the Voice Browser WG, and as such may not represent the opinion of the group.

 

1. Interaction between the 'voice-volume' and 'cue' properties.

 

Please note that in SSML 1.1 the attributes of the <ssml:prosody> element affect the rendering "of the contained text"; they do not have an effect on child <audio> elements. Note therefore that the 'volume' attribute of the <ssml:prosody> element and the 'soundLevel' attribute of the <ssml:audio> element are intentionally independent. This enables the perceived loudness of speech synthesised from text to be balanced with that of speech in pre-recorded audio cues.

 

The CSS-Speech module states that 'voice-volume' is related to <ssml:prosody>'s 'volume' attribute, and that the 'cue' properties are related to <ssml:audio> (implying its 'soundLevel' attribute). It also states that the <decibel> value of the 'cue' properties "represents a change (positive or negative) relative to the computed value of the ‘voice-volume’ property".
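
To make the interaction concrete, here is a minimal sketch of how the two properties might appear together in a style sheet under the draft's wording. The selector and the file name chime.wav are made up for illustration and do not come from the spec; the comments restate the draft's quoted definition rather than any implementation's behaviour.

```css
/* Hypothetical rule: a heading spoken softly, preceded by an audio cue. */
h1 {
  /* Sets the volume of speech synthesised from the element's text. */
  voice-volume: soft;

  /* Per the draft, the <decibel> value is a change relative to the
     computed 'voice-volume', so this cue is requested at 3 dB above
     the "soft" level, whatever the recording's own source level is. */
  cue-before: url(chime.wav) +3dB;
}
```

Under this reading, changing 'voice-volume' from soft to loud would also shift the requested level of the cue, which is the coupling that SSML 1.1 avoids by keeping 'volume' and 'soundLevel' independent.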

 

Authors often have no control over the volume level of the source (initial waveform) of pre-recorded audio cues, and never have control over the source of speech synthesis waveforms, whose loudness differs between speech engines and voices. However, the CSS-Speech module makes the impractical suggestion that authors control the volume level of audio cue waveforms in order to balance them with speech rendered from text.

 

I suggest that the CSS-Speech module follows the SSML 1.1 paradigm and that the 'voice-volume' and 'cue' properties should not interact.

 

With regards,

Paul Bagshaw

Co-author of SSML 1.1 and PLS 1.0.

 

-----Original Message-----
From: w3c-voice-wg-request@w3.org [mailto:w3c-voice-wg-request@w3.org] On Behalf Of Bert Bos
Sent: Sunday, August 14, 2011 12:32 AM
To: w3c-wai-pf@w3.org; w3c-voice-wg@w3.org; member-xg-htmlspeech@w3.org; wai-xtech@w3.org
Cc: chairs@w3.org
Subject: Heads-up: CSS WG plans last call for css3-speech

 

Hello chairs,

 

The CSS WG decided to issue a last call for the CSS Speech Module. We're planning to publish next week, with a deadline for comments of 30 September, i.e., about 6 weeks.

 

Please, let us know if that deadline is too soon.

 

We'd especially like to hear from

 

  - WAI PF and/or HTML Accessibility TF

  - Voice Browser WG

  - HTML Speech XG

 

The latest editor's draft is here:

 

    http://dev.w3.org/csswg/css3-speech/


 

(The content is what will be published, after reformatting for Working Draft.)

 

The CSS Speech module contains properties to style the rendering of documents via a speech synthesizer: voice, volume, speed, pitch, pauses, etc. It is designed to be compatible with SSML, i.e., the rendering of the document could be in the form of an SSML stream.

 

 

 

For the CSS WG,

 

Bert

-- 

  Bert Bos                                ( W 3 C ) http://www.w3.org/


  http://www.w3.org/people/bos                               W3C/ERCIM

  bert@w3.org                             2004 Rt des Lucioles / BP 93

  +33 (0)4 92 38 76 92            06902 Sophia Antipolis Cedex, France

Received on Thursday, 18 August 2011 09:44:45 UTC