- From: Daniel Weck via cvs-syncmail <cvsmail@w3.org>
- Date: Wed, 11 May 2011 08:17:11 +0000
- To: public-css-commits@w3.org
Update of /sources/public/csswg/css3-speech
In directory hutz:/tmp/cvs-serv25316

Modified Files:
	Overview.html Overview.src.html
Log Message:
fixed volume amplitude scale, reorganized speak-as values to enable mixing types.

Index: Overview.html
===================================================================
RCS file: /sources/public/csswg/css3-speech/Overview.html,v
retrieving revision 1.48
retrieving revision 1.49
diff -u -d -r1.48 -r1.49
--- Overview.html 11 May 2011 01:27:14 -0000 1.48
+++ Overview.html 11 May 2011 08:17:09 -0000 1.49
@@ -1,5 +1,4 @@ -<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" -"http://www.w3.org/TR/html4/strict.dtd"> +<!-- !DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd" --> <html lang=en> <head> @@ -529,13 +528,14 @@ <tr> <td> <em>Value:</em> - <td>[<non-negative number> linear?] | <percentage> | silent - | x-soft | soft | medium | loud | x-loud | inherit + <td>silent | [[x-soft | soft | medium | loud | x-loud] && + linear] | [<non-negative number> && linear] | + <percentage> | inherit <tr> <td> <em>Initial:</em> - <td>50 + <td>medium <tr> <td> <em>Applies to:</em> @@ -578,42 +578,71 @@ rel=biblioentry>[SPEECH-SYNTHESIS]<!--{{!SPEECH-SYNTHESIS}}--></a>. <dl> - <dt> <strong><non-negative number> (followed by the optional - "linear" keyword)</strong> + <dt> <strong>silent</strong> + + <dd> Specifies that the volume level results in no sound output at all. + + <dt> <strong>linear</strong> + + <dd> When present, this keyword indicates that the associated value + represents a point on a linear volume amplitude scale, from ‘<code + class=css>0</code>’ (silent) to ‘<code + class=css>100</code>’ (full volume). 
Otherwise, the scale + corresponds to monotonically non-decreasing volume levels from + ‘<code class=css>0</code>’ (minimum audible) to ‘<code + class=css>100</code>’ (maximum tolerable), with arbitrary + intermediary values that depend on the user environment (see the + definition of <non-negative number> below). + + <dt><strong>x-soft</strong>, <strong>soft</strong>, + <strong>medium</strong>, <strong>loud</strong>, and + <strong>x-loud</strong> + + <dd> This sequence of values corresponds to monotonically non-decreasing + volume levels. The value ‘<code class=property>x-soft</code>’ + maps to 0, ‘<code class=property>soft</code>’ maps to 25, + ‘<code class=property>medium</code>’ maps to 50, ‘<code + class=property>loud</code>’ maps to 75 and ‘<code + class=property>x-loud</code>’ maps to 100. The interpretation of + the corresponding numerical values depends on whether the ‘<code + class=property>linear</code>’ keyword is used (see the definition + of <non-negative number> below). + + <dt> <strong><non-negative number></strong> <dd>An integer or floating point <a href="#non-negative-number-def">positive number</a> in the range ‘<code class=css>0</code>’ to ‘<code - class=css>100</code>’, followed by the optional space character and - "linear" keyword. The interpretation of the ‘<code + class=css>100</code>’. The interpretation of the ‘<code class=css>0</code>’ to ‘<code class=css>100</code>’ - scale depends on whether the "linear" keyword is used. When not used: - ‘<code class=css>0</code>’ represents the <em>minimum - audible</em> level and ‘<code class=css>100</code>’ - corresponds to the <em>maximum tolerable</em> level. ‘<code - class=css>50</code>’ corresponds to the user's <em>preferred</em> - volume level. As such, the numerical values are mapped to concrete volume - levels that depend on the listening context. This allows authors to write - a single style sheet that should work in a variety of situations. 
- <p class=note>Note that actual volume levels depend on various factors, - such as the listening environment and personal user preferences. The - effective volume variation between ‘<code - class=css>0</code>’ and ‘<code class=css>100</code>’ - determines the dynamic range of the speech output, which is typically - compressed in a noisy environment (the volume corresponding to - ‘<code class=css>0</code>’ is nearer the value of - ‘<code class=css>100</code>’), whereas a noise-free context - allows for the full range of volume levels (the gap between ‘<code - class=css>0</code>’ and ‘<code class=css>100</code>’ - is wider). Conversely, there may be situations whereby both ‘<code + scale depends on whether the ‘<code + class=property>linear</code>’ keyword is used. + <p> When the ‘<code class=property>linear</code>’ keyword not + used, ‘<code class=css>0</code>’ represents the <em>minimum + audible</em> level, ‘<code class=css>100</code>’ corresponds + to the <em>maximum tolerable</em> level, and ‘<code + class=css>50</code>’ corresponds to the user's <em>preferred</em> + volume level. All 3 values are configured by the user, or at least + predefined by the user-agent. The numerical values on this scale are + mapped to concrete volume levels that depend on the user context, so + this allows authors to write a single style sheet that works in a + variety of listening environments.</p> + + <p> When the ‘<code class=property>linear</code>’ keyword is + specified, ‘<code class=css>0</code>’ maps to ‘<code + class=property>silent</code>’ and ‘<code + class=css>100</code>’ maps to the maximum possible audio volume + output (which depends on the user agent implementation, device + capabilities, etc.). 
The values in between ‘<code class=css>0</code>’ and ‘<code class=css>100</code>’ - are set to low volume levels (for example when listening discretely at - night).</p> - When the "linear" keyword is specified, ‘<code - class=css>0</code>’ maps to ‘<code - class=property>silent</code>’ and ‘<code - class=css>100</code>’ maps to the maximum possible audio volume - output. The values in between are placed on a linear amplitude scale. + are placed on a linear amplitude scale that do not necessarily match the + user's expectations, because it is independent from the user-configured + volume levels. For example, ‘<code class=css>50</code>’ may + not correspond to the user's <em>preferred</em> volume level, and it may + actually result in louder or softer audio output than desired. This + feature is provided to maintain compatibility with SSML (where + ‘<code class=property>x-soft</code>’ always means "silent", + etc.).</p> <dt> <strong><percentage></strong> @@ -624,42 +653,40 @@ <p class=note> Note that a leading "+" sign does not denote an increment. For example, +50% is equivalent to 50%, so the computed value equals the inherited value times 0.5 (divided by 2), then clipped to [0,100].</p> - - <dt> <strong>silent</strong> - - <dd>No sound output. - - <dt><strong>x-soft</strong>, <strong>soft</strong>, - <strong>medium</strong>, <strong>loud</strong>, and - <strong>x-loud</strong> - - <dd>The value ‘<code class=property>x-soft</code>’ maps to 0, - ‘<code class=property>soft</code>’ maps to 25, ‘<code - class=property>medium</code>’ maps to 50, ‘<code - class=property>loud</code>’ maps to 75 and ‘<code - class=property>x-loud</code>’ maps to 100. When the numerical - volume scale is linear, the sequence from ‘<code - class=property>x-soft</code>’ to ‘<code - class=property>x-loud</code>’ corresponds to monotonically - non-decreasing volume levels. 
</dl> <p class=note> Note that there is a difference between an element whose ‘<a href="#voice-volume"><code class=property>voice-volume</code></a>’ property has a value of - ‘<code class=property>silent</code>’ (or "0 linear"), and an - element whose ‘<a href="#speak"><code - class=property>speak</code></a>’ property has the value ‘<code - class=property>none</code>’. The former takes up the same time as if - it had been spoken, including any pause before and after the element, but - no sound is generated (although descendants can override the ‘<a - href="#voice-volume"><code class=property>voice-volume</code></a>’ - value and may therefore generate audio output). The latter requires no - time and is not rendered in the aural dimension - <!-- (including its descendants, which cannot override the inherited 'none' value). --> - (although descendants can override the ‘<a href="#speak"><code - class=property>speak</code></a>’ value and may therefore generate - audio output).</p> + ‘<code class=property>silent</code>’, and an element whose + ‘<a href="#speak"><code class=property>speak</code></a>’ + property has the value ‘<code class=property>none</code>’. The + former takes up the same time as if it had been spoken, including any + pause before and after the element, but no sound is generated (descendants + can override the ‘<a href="#voice-volume"><code + class=property>voice-volume</code></a>’ value and may therefore + generate audio output). Conversely, the latter requires no time and is not + rendered in the aural dimension (descendants can override the ‘<a + href="#speak"><code class=property>speak</code></a>’ value and may + therefore generate audio output). + + <p> + + <p class=note> Unless ‘<code class=property>linear</code>’ is + used, the actual volume levels resulting from the use of the numerical or + keyword values depend on various factors, such as the listening + environment and personal user preferences. 
The effective volume variation + between ‘<code class=css>0</code>’ and ‘<code + class=css>100</code>’ determines the dynamic range of the speech + output, which is typically compressed in a noisy environment (the volume + corresponding to ‘<code class=css>0</code>’ is nearer the + value of ‘<code class=css>100</code>’), whereas a noise-free + context allows for the full range of volume levels (the gap between + ‘<code class=css>0</code>’ and ‘<code + class=css>100</code>’ is wider). Conversely, there may be situations + whereby both ‘<code class=css>0</code>’ and ‘<code + class=css>100</code>’ are set to low volume levels (for example when + listening discretely at night).</p> <!-- p> 'voice-volume' does not apply to <a href ="#cue-props">audio cues</a> for which there is a separate means @@ -851,7 +878,14 @@ <p class=note> Note that ‘<code class=property>display</code>’ is the only property defined externally to this CSS3 module that affects behavior within the aural - "box" model.</p> + "box" model. Also note that the ‘<code + class=property>none</code>’ value of the ‘<code + class=property>display</code>’ property cannot be overridden by + descendants of the selected element, but the ‘<code + class=property>auto</code>’ value of ‘<a href="#speak"><code + class=property>speak</code></a>’ can however be overridden using + either of ‘<code class=property>none</code>’ or ‘<code + class=property>normal</code>’.</p> <dt> <strong>none</strong> @@ -859,42 +893,25 @@ actual content) to not be rendered (i.e., the element has no effect in the aural dimension). <p class=note> Note that any of the descendants of the affected element - are allowed to override this value, so they may actually take part in - the aural rendering. 
However, the pauses, cues, and rests of the - ancestor element remain "deactivated" in the aural dimension, and - therefore do not contribute to the <a href="#collapsing">collapsing of - pauses</a> or additive behavior of adjoining rests.</p> - <!-- - Descendant elements do not get rendered either; - this behavior cannot be overridden by setting the 'speak' property on the descendants. - --> - + are allowed to override this value, so descendants can actually take + part in the aural rendering despite using ‘<code + class=property>none</code>’ at this level. However, the pauses, + cues, and rests of the ancestor element remain "deactivated" in the + aural dimension, and therefore do not contribute to the <a + href="#collapsing">collapsing of pauses</a> or additive behavior of + adjoining rests.</p> <dt> <strong>normal</strong> - <dd> The element is rendered aurally. + <dd> The element is rendered aurally (regardless of its ‘<code + class=property>display</code>’ value and the ‘<code + class=property>display</code>’ and ‘<a href="#speak"><code + class=property>speak</code></a>’ values of its ancestors). + <p class=note> Note that using this value can result in the element being + rendered in the aural dimension even though it would not be rendered on + the visual canvas.</p> </dl> - <p class=note> Note that although the ‘<code - class=property>none</code>’ value of the ‘<code - class=property>display</code>’ property cannot be overridden by - descendants of the affected element, the ‘<code - class=property>auto</code>’ value of ‘<a href="#speak"><code - class=property>speak</code></a>’ can however be overridden by - descendants, using either of ‘<code - class=property>none</code>’ or ‘<code - class=property>normal</code>’. In the case of ‘<code - class=property>normal</code>’, this would result in descendants - being rendered in the aural dimension even though they would not be - rendered on the visual canvas. 
- <!-- To ensure that an element <em>and its descendants</em> do not get rendered in the aural dimension, - use the 'none' value for the 'speak' property. --> - </p> - <!-- p class="note"> - Note that the value of the 'visibility' property - may affect the computed value of 'voice-volume', but do not affect the 'speak' property. - </p --> - <h3 id=speaking-props-speak-as><span class=secno>4.2. </span>The ‘<a href="#speak-as"><code class=property>speak-as</code></a>’ property</h3> @@ -908,8 +925,8 @@ <tr> <td> <em>Value:</em> - <td>normal | spell-out | digits | literal-punctuation | no-punctuation | - inherit + <td>normal | spell-out || digits || [ literal-punctuation | + no-punctuation ] | inherit <tr> <td> <em>Initial:</em> @@ -944,13 +961,18 @@ <p>The ‘<a href="#speak-as"><code class=property>speak-as</code></a>’ property determines in what - manner text gets rendered aurally. + manner text gets rendered aurally, based upon a basic predefined list of + possible values. <p class=note> Note that the functionality provided by this property is related to the <a href="http://www.w3.org/TR/speech-synthesis/#edef_say-as"><code>say-as</code> element</a> from the SSML markup language <a href="#SPEECH-SYNTHESIS" - rel=biblioentry>[SPEECH-SYNTHESIS]<!--{{!SPEECH-SYNTHESIS}}--></a>. + rel=biblioentry>[SPEECH-SYNTHESIS]<!--{{!SPEECH-SYNTHESIS}}--></a>. Also + note that possible values are described in a <a + href="http://www.w3.org/TR/ssml-sayas">W3C note</a> separate from the SSML + specification, whereas the CSS Speech Module explicitly defines a list of + possible values. <dl> <dt> <strong>normal</strong> @@ -1607,13 +1629,7 @@ <dt> <strong>none</strong> - <dd>No auditory icon is specified.</dd> - <!-- dt><strong><non-negative number></strong></dt> - - <dd>An integer or floating point number in the range '0' to '100'. - '0' represents silence (the <em>minimum</em> level), and 100 - corresponds to the <em>maximum</em> level. 
The volume scale is - linear amplitude.</dd --> + <dd>Specifies that no auditory icon is used. <dt> <strong><percentage></strong> @@ -1640,19 +1656,14 @@ <dt> <strong>silent</strong> - <dd>No sound output.</dd> - <!-- dt><strong>silent</strong>, - <strong>x-soft</strong>, - <strong>soft</strong>, - <strong>medium</strong>, - <strong>loud</strong>, and - <strong>x-loud</strong></dt> - - <dd>A sequence of monotonically non-decreasing volume levels. - The value of 'silent' is mapped to '0' and 'x-loud' is mapped - to '100'. The mapping of other values to numerical volume levels - is implementation-dependent, but the intention is to match the - corresponding levels for 'voice-volume'.</dd--> + <dd> Specifies that the volume level results in no sound output at all. + <p class=note> Note that there is a difference between an audio cue whose + volume is set to ‘<code class=property>silent</code>’ and + one whose value is ‘<code class=property>none</code>’. In + the former case, the audio cue takes up the same time as if it had been + played, but no sound is generated. In the latter case, the there is no + manifestation of the audio cue at all (i.e. no time is allocated in the + aural dimension for the cue).</p> </dl> <div class=example> @@ -2096,10 +2107,10 @@ <p>The ‘<a href="#voice-rate"><code class=property>voice-rate</code></a>’ property manipulates the speed - of generated synthetic speech. The default rate for a given ‘<a - href="#voice-family"><code class=property>voice-family</code></a>’ - is processor-specific, and depends on the language, dialect and on the - "personality" of the voice. + of generated synthetic speech in terms of words per minute. The default + rate for a given ‘<a href="#voice-family"><code + class=property>voice-family</code></a>’ is processor-specific, and + depends on the language, dialect and on the "personality" of the voice. 
<p class=note> Note that the functionality provided by this property is related to the <a @@ -2125,6 +2136,8 @@ <dd>A sequence of monotonically non-decreasing speaking rates that are implementation and voice specific. + <p class=note>Note that typical values are (in words per minute) x-slow = + 80, slow = 120, medium = between 180 and 200, fast = 500.</p> </dl> <h3 id=voice-props-voice-pitch><span class=secno>8.3. </span>The ‘<a @@ -2737,9 +2750,17 @@ rules. This may be used for uncommon abbreviations or acronyms which are unlikely to be recognized by the synthesizer. The ‘<a href="#content-def"><code class=property>content</code></a>’ - property can be used to replace one string by another. In the following - example, the abbreviation is rendered using the content of the title - attribute instead of the element's content: + property can be used to replace one string by another. + + <p class=note> Note that the functionality provided by this property is + related to the <a + href="http://www.w3.org/TR/speech-synthesis/#edef_sub"><code>alias</code> + attribute of the <code>sub</code> element</a> from the SSML markup + language <a href="#SPEECH-SYNTHESIS" + rel=biblioentry>[SPEECH-SYNTHESIS]<!--{{!SPEECH-SYNTHESIS}}--></a>. + + <p> In the following example, the abbreviation is rendered using the + content of the title attribute instead of the element's content: <div class=example> <pre> @@ -3076,8 +3097,8 @@ <tr valign=baseline> <td><a class=property href="#speak-as">speak-as</a> - <td>normal | spell-out | digits | literal-punctuation | no-punctuation | - inherit + <td>normal | spell-out || digits || [ literal-punctuation | + no-punctuation ] | inherit <td>normal @@ -3203,10 +3224,11 @@ <tr valign=baseline> <td><a class=property href="#voice-volume">voice-volume</a> - <td>[<non-negative number> linear?] 
| <percentage> | silent - | x-soft | soft | medium | loud | x-loud | inherit + <td>silent | [[x-soft | soft | medium | loud | x-loud] && + linear] | [<non-negative number> && linear] | + <percentage> | inherit - <td>50 + <td>medium <td>all elements

Index: Overview.src.html
===================================================================
RCS file: /sources/public/csswg/css3-speech/Overview.src.html,v
retrieving revision 1.49
retrieving revision 1.50
diff -u -d -r1.49 -r1.50
--- Overview.src.html 11 May 2011 01:27:14 -0000 1.49
+++ Overview.src.html 11 May 2011 08:17:09 -0000 1.50
@@ -1,4 +1,4 @@ -<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> +<!-- !DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd" --> <html lang="en"> <head> <title>CSS Speech Module</title> @@ -272,14 +272,14 @@ <td> <em>Value:</em> </td> - <td>[<non-negative number> linear?] | <percentage> | silent | x-soft | soft | - medium | loud | x-loud | inherit</td> + <td>silent | [[x-soft | soft | medium | loud | x-loud] && linear] | + [<non-negative number> && linear] | <percentage> | inherit</td> </tr> <tr> <td> <em>Initial:</em> </td> - <td>50</td> + <td>medium</td> </tr> <tr> <td> @@ -321,25 +321,45 @@ the <code>prosody</code> element</a> from the SSML markup language [[!SPEECH-SYNTHESIS]]. </p> <dl> <dt> - <strong><non-negative number> (followed by the optional "linear" keyword)</strong> + <strong>silent</strong> + </dt> + <dd> Specifies that the volume level results in no sound output at all. </dd> + <dt> + <strong>linear</strong> + </dt> + <dd> When present, this keyword indicates that the associated value represents a point on a + linear volume amplitude scale, from '0' (silent) to '100' (full volume). 
Otherwise, the + scale corresponds to monotonically non-decreasing volume levels from '0' (minimum audible) + to '100' (maximum tolerable), with arbitrary intermediary values that depend on the user + environment (see the definition of <non-negative number> below). </dd> + <dt><strong>x-soft</strong>, <strong>soft</strong>, <strong>medium</strong>, + <strong>loud</strong>, and <strong>x-loud</strong></dt> + <dd> This sequence of values corresponds to monotonically non-decreasing volume levels. The + value 'x-soft' maps to 0, 'soft' maps to 25, 'medium' maps to 50, 'loud' maps to 75 and + 'x-loud' maps to 100. The interpretation of the corresponding numerical values depends on + whether the 'linear' keyword is used (see the definition of <non-negative number> + below).</dd> + <dt> + <strong><non-negative number></strong> </dt> <dd>An integer or floating point <a href="#non-negative-number-def">positive number</a> in the - range '0' to '100', followed by the optional space character and "linear" keyword. The - interpretation of the '0' to '100' scale depends on whether the "linear" keyword is used. - When not used: '0' represents the <em>minimum audible</em> level and '100' corresponds to - the <em>maximum tolerable</em> level. '50' corresponds to the user's <em>preferred</em> - volume level. As such, the numerical values are mapped to concrete volume levels that depend - on the listening context. This allows authors to write a single style sheet that should work - in a variety of situations. <p class="note">Note that actual volume levels depend on various - factors, such as the listening environment and personal user preferences. The effective - volume variation between '0' and '100' determines the dynamic range of the speech output, - which is typically compressed in a noisy environment (the volume corresponding to '0' is - nearer the value of '100'), whereas a noise-free context allows for the full range of - volume levels (the gap between '0' and '100' is wider). 
Conversely, there may be - situations whereby both '0' and '100' are set to low volume levels (for example when - listening discretely at night). </p> When the "linear" keyword is specified, '0' maps to - 'silent' and '100' maps to the maximum possible audio volume output. The values in between - are placed on a linear amplitude scale. </dd> + range '0' to '100'. The interpretation of the '0' to '100' scale depends on whether the + 'linear' keyword is used. <p> When the 'linear' keyword not used, '0' represents the + <em>minimum audible</em> level, '100' corresponds to the <em>maximum tolerable</em> + level, and '50' corresponds to the user's <em>preferred</em> volume level. All 3 values + are configured by the user, or at least predefined by the user-agent. The numerical values + on this scale are mapped to concrete volume levels that depend on the user context, so + this allows authors to write a single style sheet that works in a variety of listening + environments. </p> + <p> When the 'linear' keyword is specified, '0' maps to 'silent' and '100' maps to the + maximum possible audio volume output (which depends on the user agent implementation, + device capabilities, etc.). The values in between '0' and '100' are placed on a linear + amplitude scale that do not necessarily match the user's expectations, because it is + independent from the user-configured volume levels. For example, '50' may not correspond + to the user's <em>preferred</em> volume level, and it may actually result in louder or + softer audio output than desired. This feature is provided to maintain compatibility with + SSML (where 'x-soft' always means "silent", etc.). </p> + </dd> <dt> <strong><percentage></strong> </dt> @@ -349,24 +369,23 @@ example, +50% is equivalent to 50%, so the computed value equals the inherited value times 0.5 (divided by 2), then clipped to [0,100]. 
</p> </dd> - <dt> - <strong>silent</strong> - </dt> - <dd>No sound output.</dd> - <dt><strong>x-soft</strong>, <strong>soft</strong>, <strong>medium</strong>, - <strong>loud</strong>, and <strong>x-loud</strong></dt> - <dd>The value 'x-soft' maps to 0, 'soft' maps to 25, 'medium' maps to 50, 'loud' maps to 75 - and 'x-loud' maps to 100. When the numerical volume scale is linear, the sequence from - 'x-soft' to 'x-loud' corresponds to monotonically non-decreasing volume levels. </dd> </dl> <p class="note"> Note that there is a difference between an element whose 'voice-volume' - property has a value of 'silent' (or "0 linear"), and an element whose 'speak' property has - the value 'none'. The former takes up the same time as if it had been spoken, including any - pause before and after the element, but no sound is generated (although descendants can - override the 'voice-volume' value and may therefore generate audio output). The latter - requires no time and is not rendered in the aural dimension - <!-- (including its descendants, which cannot override the inherited 'none' value). --> - (although descendants can override the 'speak' value and may therefore generate audio output). </p> + property has a value of 'silent', and an element whose 'speak' property has the value 'none'. + The former takes up the same time as if it had been spoken, including any pause before and + after the element, but no sound is generated (descendants can override the 'voice-volume' + value and may therefore generate audio output). Conversely, the latter requires no time and is + not rendered in the aural dimension (descendants can override the 'speak' value and may + therefore generate audio output). </p> + <p> </p> + <p class="note"> Unless 'linear' is used, the actual volume levels resulting from the use of the + numerical or keyword values depend on various factors, such as the listening environment and + personal user preferences. 
The effective volume variation between '0' and '100' determines the + dynamic range of the speech output, which is typically compressed in a noisy environment (the + volume corresponding to '0' is nearer the value of '100'), whereas a noise-free context allows + for the full range of volume levels (the gap between '0' and '100' is wider). Conversely, + there may be situations whereby both '0' and '100' are set to low volume levels (for example + when listening discretely at night). </p> <!-- p> 'voice-volume' does not apply to <a href ="#cue-props">audio cues</a> for which there is a separate means @@ -542,7 +561,9 @@ <dd> Resolves to a computed value of 'none' when 'display' is 'none', otherwise resolves to a computed value of 'auto' which yields a used value of 'normal'. <p class="note"> Note that 'display' is the only property defined externally to this CSS3 module that affects - behavior within the aural "box" model. </p> + behavior within the aural "box" model. Also note that the 'none' value of the 'display' + property cannot be overridden by descendants of the selected element, but the 'auto' value + of 'speak' can however be overridden using either of 'none' or 'normal'. </p> </dd> <dt> <strong>none</strong> @@ -550,32 +571,20 @@ <dd> This value causes an element (including pauses, cues, rests and actual content) to not be rendered (i.e., the element has no effect in the aural dimension). <p class="note"> Note that any of the descendants of the affected element are allowed to override this value, so - they may actually take part in the aural rendering. However, the pauses, cues, and rests - of the ancestor element remain "deactivated" in the aural dimension, and therefore do not - contribute to the <a href="#collapsing">collapsing of pauses</a> or additive behavior of - adjoining rests. </p> - <!-- - Descendant elements do not get rendered either; - this behavior cannot be overridden by setting the 'speak' property on the descendants. 
- --> + descendants can actually take part in the aural rendering despite using 'none' at this + level. However, the pauses, cues, and rests of the ancestor element remain "deactivated" + in the aural dimension, and therefore do not contribute to the <a href="#collapsing" + >collapsing of pauses</a> or additive behavior of adjoining rests. </p> </dd> <dt> <strong>normal</strong> </dt> - <dd> The element is rendered aurally. </dd> + <dd> The element is rendered aurally (regardless of its 'display' value and the 'display' and + 'speak' values of its ancestors). <p class="note"> Note that using this value can result in + the element being rendered in the aural dimension even though it would not be rendered on + the visual canvas. </p> + </dd> </dl> - <p class="note"> Note that although the 'none' value of the 'display' property cannot be - overridden by descendants of the affected element, the 'auto' value of 'speak' can however be - overridden by descendants, using either of 'none' or 'normal'. In the case of 'normal', this - would result in descendants being rendered in the aural dimension even though they would not - be rendered on the visual canvas. - <!-- To ensure that an element <em>and its descendants</em> do not get rendered in the aural dimension, - use the 'none' value for the 'speak' property. --> - </p> - <!-- p class="note"> - Note that the value of the 'visibility' property - may affect the computed value of 'voice-volume', but do not affect the 'speak' property. 
- </p --> <h3 id="speaking-props-speak-as">The 'speak-as' property</h3> <table class="propdef" summary="name: syntax"> <tbody> @@ -589,7 +598,8 @@ <td> <em>Value:</em> </td> - <td>normal | spell-out | digits | literal-punctuation | no-punctuation | inherit</td> + <td>normal | spell-out || digits || [ literal-punctuation | no-punctuation ] | + inherit</td> </tr> <tr> <td> @@ -629,10 +639,13 @@ </tr> </tbody> </table> - <p>The 'speak-as' property determines in what manner text gets rendered aurally.</p> + <p>The 'speak-as' property determines in what manner text gets rendered aurally, based upon a + basic predefined list of possible values.</p> <p class="note"> Note that the functionality provided by this property is related to the <a href="http://www.w3.org/TR/speech-synthesis/#edef_say-as"><code>say-as</code> element</a> - from the SSML markup language [[!SPEECH-SYNTHESIS]]. </p> + from the SSML markup language [[!SPEECH-SYNTHESIS]]. Also note that possible values are + described in a <a href="http://www.w3.org/TR/ssml-sayas">W3C note</a> separate from the SSML + specification, whereas the CSS Speech Module explicitly defines a list of possible values. </p> <dl> <dt> <strong>normal</strong> @@ -1219,13 +1232,7 @@ <dt> <strong>none</strong> </dt> - <dd>No auditory icon is specified.</dd> - <!-- dt><strong><non-negative number></strong></dt> - - <dd>An integer or floating point number in the range '0' to '100'. - '0' represents silence (the <em>minimum</em> level), and 100 - corresponds to the <em>maximum</em> level. 
The volume scale is - linear amplitude.</dd --> + <dd>Specifies that no auditory icon is used.</dd> <dt> <strong><percentage></strong> </dt> @@ -1243,19 +1250,13 @@ <dt> <strong>silent</strong> </dt> - <dd>No sound output.</dd> - <!-- dt><strong>silent</strong>, - <strong>x-soft</strong>, - <strong>soft</strong>, - <strong>medium</strong>, - <strong>loud</strong>, and - <strong>x-loud</strong></dt> - - <dd>A sequence of monotonically non-decreasing volume levels. - The value of 'silent' is mapped to '0' and 'x-loud' is mapped - to '100'. The mapping of other values to numerical volume levels - is implementation-dependent, but the intention is to match the - corresponding levels for 'voice-volume'.</dd--> + <dd> Specifies that the volume level results in no sound output at all. <p class="note"> Note + that there is a difference between an audio cue whose volume is set to 'silent' and one + whose value is 'none'. In the former case, the audio cue takes up the same time as if it + had been played, but no sound is generated. In the latter case, the there is no + manifestation of the audio cue at all (i.e. no time is allocated in the aural dimension + for the cue). </p> + </dd> </dl> <div class="example"> <pre> @@ -1676,9 +1677,9 @@ </tr> </tbody> </table> - <p>The 'voice-rate' property manipulates the speed of generated synthetic speech. The default - rate for a given 'voice-family' is processor-specific, and depends on the language, dialect - and on the "personality" of the voice.</p> + <p>The 'voice-rate' property manipulates the speed of generated synthetic speech in terms of + words per minute. 
The default rate for a given 'voice-family' is processor-specific, and + depends on the language, dialect and on the "personality" of the voice.</p> <p class="note"> Note that the functionality provided by this property is related to the <a href="http://www.w3.org/TR/speech-synthesis/#edef_prosody"><code>rate</code> attribute of the <code>prosody</code> element</a> from the SSML markup language [[!SPEECH-SYNTHESIS]]. </p> @@ -1695,7 +1696,9 @@ <dt><strong>x-slow</strong>, <strong>slow</strong>, <strong>medium</strong>, <strong>fast</strong> and <strong>x-fast</strong></dt> <dd>A sequence of monotonically non-decreasing speaking rates that are implementation and - voice specific.</dd> + voice specific. <p class="note">Note that typical values are (in words per minute) x-slow = + 80, slow = 120, medium = between 180 and 200, fast = 500. </p> + </dd> </dl> <h3 id="voice-props-voice-pitch">The 'voice-pitch' property</h3> <table class="propdef" summary="name: syntax"> @@ -2261,9 +2264,12 @@ <p>Sometimes, authors will want to specify a mapping from the source text into another string prior to the application of the regular pronunciation rules. This may be used for uncommon abbreviations or acronyms which are unlikely to be recognized by the synthesizer. The - 'content' property can be used to replace one string by another. In the following example, the - abbreviation is rendered using the content of the title attribute instead of the element's - content:</p> + 'content' property can be used to replace one string by another. </p> + <p class="note"> Note that the functionality provided by this property is related to the <a + href="http://www.w3.org/TR/speech-synthesis/#edef_sub"><code>alias</code> attribute of the + <code>sub</code> element</a> from the SSML markup language [[!SPEECH-SYNTHESIS]]. 
</p> + <p> In the following example, the abbreviation is rendered using the content of the title + attribute instead of the element's content:</p> <div class="example"> <pre> /* This replaces the content of the selected element @@ -2633,13 +2639,14 @@ own list of changes</a>, which - for succinctness - is not repeated here. </p> <ul> <li>Removed the "phonemes" property (and its associated "@alphabet" at-rule).</li> - <li>Renamed 'speakability' to 'speak', and 'speak' to 'speak-as'.</li> + <li>Renamed 'speakability' to 'speak', and 'speak' to 'speak-as'. Reorganized the 'speak-as' + values to allow mixing different types.</li> <li>Added support for lists (item styles, numbering, etc.).</li> <li>Adjusted the [initial] value for shorthand properties, to be consistent with other CSS specifications (i.e. "see individual properties"), and removed the erroneous "inherit" value.</li> - <li>Fixed numerical volume scale (and the associated "named" values). Also added the [silent] - value to audio cues.</li> + <li>Fixed numerical volume scale (and the associated "named" values) by adding the 'linear' + keyword. Also added the 'silent' value to audio cues.</li> <li>Fixed the [initial] values for 'pause' and 'rest', which should be zero (were "implementation-dependent").</li> <li>Corrected the [initial] values for 'voice-pitch-range' and 'voice-pitch' to "medium".</li>
Received on Wednesday, 11 May 2011 08:17:14 UTC