
csswg/css3-speech Overview.html,1.35,1.36 Overview.src.html,1.36,1.37

From: Daniel Weck via cvs-syncmail <cvsmail@w3.org>
Date: Thu, 28 Apr 2011 17:09:14 +0000
To: public-css-commits@w3.org
Message-Id: <E1QFUiI-0003TU-1I@lionel-hutz.w3.org>
Update of /sources/public/csswg/css3-speech
In directory hutz:/tmp/cvs-serv13080

Modified Files:
	Overview.html Overview.src.html 
Log Message:
Fixed emphasis / stress: the property was missing an "auto" value representing the default, speech-synthesizer-generated emphasis, which is language-dependent.


Index: Overview.html
===================================================================
RCS file: /sources/public/csswg/css3-speech/Overview.html,v
retrieving revision 1.35
retrieving revision 1.36
diff -u -d -r1.35 -r1.36
--- Overview.html	28 Apr 2011 16:09:38 -0000	1.35
+++ Overview.html	28 Apr 2011 17:09:11 -0000	1.36
@@ -337,14 +337,14 @@
 p.peter { voice-balance: right; voice-family: male }
 p.goat  { voice-volume: soft }
 </pre>
-
-   <p>This will direct the speech synthesizer to speak headers in a voice (a
-    kind of "audio font") called "paul". Before speaking the headers, a sound
-    sample will be played from the given URL. Paragraphs with class "heidi"
-    will appear to come from the left (if the sound system is capable of
-    stereo), and paragraphs of class "peter" from the right. Paragraphs with
-    class "goat" will be played softly.</p>
   </div>
+
+  <p>This will direct the speech synthesizer to speak headers in a voice (a
+   kind of "audio font") called "paul". Before speaking the headers, a sound
+   sample will be played from the given URL. Paragraphs with class "heidi"
+   will appear to come from the left (if the sound system is capable of
+   stereo), and paragraphs of class "peter" from the right. Paragraphs with
+   class "goat" will be played softly.</p>
   <!-- p class="note">
 Note that the "aural" media type is deprecated, as defined in the informative CSS2.1 Aural appendix [[!CSS21]]).
 </p -->
@@ -1049,12 +1049,13 @@
 
   <h3 id=collapsing><span class=secno>6.1. </span>Collapsing pauses</h3>
 
-  <p>The pause defines the minimum distance of the aural "box" to the aural
+  <p> The pause defines the minimum distance of the aural "box" to the aural
    "boxes" before and after it. Adjoining pauses are merged by selecting the
-   strongest named break and the longest absolute time interval. Thus
-   "strong" is selected when comparing "strong" and "weak", "1s" is selected
-   when comparing "1s" and "250ms", and "strong" and "250ms" take effect
-   additively when comparing "strong" and "250ms".
+   strongest named break and the longest absolute time interval.
+
+  <p class=note> For example, "strong" is selected when comparing "strong"
+   and "weak", "1s" is selected when comparing "1s" and "250ms", and "strong"
+   and "250ms" take effect additively when comparing "strong" and "250ms".
 
   <p>The following pauses are adjoining:
 
@@ -2066,14 +2067,12 @@
      <td>specified value
   </table>
 
-  <p>Specifies variation in average pitch. The perceived pitch of a human
-   voice is determined by the fundamental frequency and typically has a value
-   of 120Hz for a male voice and 210Hz for a female voice. Human languages
-   are spoken with varying inflection and pitch; these variations convey
-   additional meaning and emphasis. Thus, a highly animated voice, i.e., one
-   that is heavily inflected, displays a high pitch range. This property
-   specifies the range over which these variations occur, i.e., how much the
-   fundamental frequency may deviate from the average pitch.
+  <p>Specifies variation in average pitch. Human languages are spoken with
+   varying inflection and pitch; these variations convey additional meaning
+   and emphasis. Thus, a highly animated voice, i.e., one that is heavily
+   inflected, displays a high pitch range. This property specifies the range
+   over which these variations occur, i.e., how much the fundamental
+   frequency may deviate from the average pitch.
 
   <p>Values have the following meanings:
 
@@ -2120,9 +2119,7 @@
   <p class=note> Note that a semitone is half of a tone (a half step) on the
    standard diatonic scale. A semitone doesn't correspond to a fixed value in
    Hertz: instead, the ratio between two consecutive frequencies separated by
-   exactly one semitone is approximately 1.05946 (the actual arithmetics
-   involved are beyond the scope of this specification, please refer to
-   existing literature on that subject).
+   exactly one semitone is approximately 1.05946 (the twelfth root of two).
 
   <table class=propdef summary="name: syntax">
    <tbody>
@@ -2134,12 +2131,12 @@
     <tr>
      <td><em>Value:</em>
 
-     <td>strong | moderate | none | reduced | inherit
+     <td>normal | strong | moderate | none | reduced | inherit
 
     <tr>
      <td><em>Initial:</em>
 
-     <td>moderate
+     <td>auto
 
     <tr>
      <td><em>Applies&nbsp;to:</em>
@@ -2167,27 +2164,57 @@
      <td>specified value
   </table>
 
-  <p>Indicates the strength of emphasis to be applied. Emphasis is indicated
+  <p>Indicates the strength of emphasis to be applied. Emphasis is applied
    using a combination of pitch change, timing changes, loudness and other
-   acoustic differences) that varies from one language to the next.
+   acoustic differences, and is dependent on the language being spoken.
 
   <p>Values have the following meanings:
 
   <dl>
-   <dt><strong>none</strong>, <strong>moderate</strong> and
-    <strong>strong</strong>
+   <dt><strong>auto</strong>
 
-   <dd>These are monotonically non-decreasing in strength, with the precise
-    meanings dependent on language being spoken. The value &lsquo;<code
-    class=property>none</code>&rsquo; inhibits the synthesizer from
-    emphasizing words it would normally emphasize.
+   <dd>Represents the default emphasis produced by the speech synthesizer.
+
+   <dt><strong>none</strong>
+
+   <dd>Inhibits the synthesizer from emphasizing words it would normally
+    emphasize.
+
+   <dt><strong>moderate</strong> and <strong>strong</strong>
+
+   <dd>These are monotonically non-decreasing in strength.
 
    <dt><strong>reduced</strong>
 
-   <dd>Effectively the opposite of emphasizing a word. For example, when the
-    phrase "going to" is reduced it may be spoken as "gonna".
+   <dd>Effectively the opposite of emphasizing a word.
   </dl>
 
+  <div class=example>
+   <p>Example:</p>
+
+   <pre>
+span.default-emphasis { voice-stress: auto; }
+span.lowered-emphasis { voice-stress: reduced; }
+span.removed-emphasis { voice-stress: none; }
+span.normal-emphasis { voice-stress: moderate; }
+span.huge-emphasis { voice-stress: strong; }
+
+...
+
+&lt;p&gt;This is a big car.&lt;/p&gt;
+&lt;!-- The speech output from the line above is identical to the line below: --&gt;
+&lt;p&gt;This is a &lt;span class="default-emphasis"&gt;big&lt;/span&gt; car.&lt;/p&gt;
+
+&lt;p&gt;This car is &lt;span class="lowered-emphasis"&gt;massive&lt;/span&gt;!&lt;/p&gt;
+&lt;!-- The "span" below is totally de-emphasized, whereas the emphasis in the line above is only reduced: --&gt;
+&lt;p&gt;This car is &lt;span class="removed-emphasis"&gt;massive&lt;/span&gt;!&lt;/p&gt;
+
+&lt;!-- The lines below demonstrate increasing levels of emphasis: --&gt;
+&lt;p&gt;This is a &lt;span class="normal-emphasis"&gt;big&lt;/span&gt; car!&lt;/p&gt;
+&lt;p&gt;This is a &lt;span class="huge-emphasis"&gt;big&lt;/span&gt; car!!!&lt;/p&gt;
+</pre>
+  </div>
+
   <h2 id=duration-props><span class=secno>10. </span>Duration property:
    &lsquo;<a href="#voice-duration"><code
    class=property>voice-duration</code></a>&rsquo;</h2>
@@ -2889,9 +2916,9 @@
     <tr valign=baseline>
      <td><a class=property href="#voice-stress">voice-stress</a>
 
-     <td>strong | moderate | none | reduced | inherit
+     <td>normal | strong | moderate | none | reduced | inherit
 
-     <td>moderate
+     <td>auto
 
      <td>all elements
 

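As a reading aid for the "Collapsing pauses" hunk above, the merge rule can be sketched in Python. This is an illustrative sketch only and is not part of the commit: the function names are hypothetical, and the full keyword ordering (none, x-weak, weak, medium, strong, x-strong) is an assumption, since the hunk itself only mentions "weak" and "strong".

```python
# Assumed ordering of named breaks, weakest first. Only "weak" and
# "strong" appear in the diff; the other keywords are an assumption.
STRENGTH_ORDER = ["none", "x-weak", "weak", "medium", "strong", "x-strong"]


def parse_time_ms(value):
    """Return milliseconds for a '<number>ms' or '<number>s' token, else None."""
    if value.endswith("ms"):
        return float(value[:-2])
    if value.endswith("s"):
        return float(value[:-1]) * 1000.0
    return None


def collapse_pauses(pauses):
    """Merge adjoining pauses.

    Returns (strongest_named_break, longest_time_ms). Either element is
    None when no value of that kind was present. When both are set, they
    take effect additively, as the note in the hunk describes.
    """
    strongest = None
    longest_ms = None
    for pause in pauses:
        if pause in STRENGTH_ORDER:
            # Keep the named break that ranks highest in the assumed order.
            if strongest is None or (
                STRENGTH_ORDER.index(pause) > STRENGTH_ORDER.index(strongest)
            ):
                strongest = pause
            continue
        ms = parse_time_ms(pause)
        if ms is None:
            raise ValueError(f"unrecognized pause value: {pause!r}")
        # Keep the longest absolute time interval.
        if longest_ms is None or ms > longest_ms:
            longest_ms = ms
    return strongest, longest_ms
```

With these assumptions, collapse_pauses(["strong", "weak"]) keeps "strong", collapse_pauses(["1s", "250ms"]) keeps the one-second interval, and collapse_pauses(["strong", "250ms"]) returns both values, which then apply together.
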
Index: Overview.src.html
===================================================================
RCS file: /sources/public/csswg/css3-speech/Overview.src.html,v
retrieving revision 1.36
retrieving revision 1.37
diff -u -d -r1.36 -r1.37
--- Overview.src.html	28 Apr 2011 16:09:38 -0000	1.36
+++ Overview.src.html	28 Apr 2011 17:09:11 -0000	1.37
@@ -183,6 +183,7 @@
 p.peter { voice-balance: right; voice-family: male }
 p.goat  { voice-volume: soft }
 </pre>
+</div>
 
 <p>This will direct the speech synthesizer to speak headers in a
 voice (a kind of "audio font") called "paul". Before speaking the
@@ -827,14 +828,20 @@
 
 <h3 id="collapsing">Collapsing pauses</h3>
 
-<p>The pause defines the minimum distance of the aural "box" to the
+<p>
+The pause defines the minimum distance of the aural "box" to the
 aural "boxes" before and after it.
 Adjoining pauses are merged
 by selecting the strongest named break and
-the longest absolute time interval. Thus "strong" is selected when
+the longest absolute time interval.
+</p>
+
+<p class="note">
+For example, "strong" is selected when
 comparing "strong" and "weak", "1s" is selected when comparing "1s"
 and "250ms", and "strong" and "250ms" take effect additively when
-comparing "strong" and "250ms".</p>
+comparing "strong" and "250ms".
+</p>
 
 <p>The following pauses are adjoining:</p>
 
@@ -1760,10 +1767,8 @@
 </tbody>
 </table>
 
-<p>Specifies variation in average pitch. The perceived pitch of a
-human voice is determined by the fundamental frequency and
-typically has a value of 120Hz for a male voice and 210Hz for a
-female voice. Human languages are spoken with varying inflection
+<p>Specifies variation in average pitch.
+Human languages are spoken with varying inflection
 and pitch; these variations convey additional meaning and emphasis.
 Thus, a highly animated voice, i.e., one that is heavily inflected,
 displays a high pitch range. This property specifies the range over
@@ -1830,8 +1835,7 @@
 Note that a semitone is half of a tone (a half step) on the standard diatonic scale.
 A semitone doesn't correspond to a fixed value in Hertz: instead,
 the ratio between two consecutive frequencies separated by exactly one semitone
-is approximately 1.05946 (the actual arithmetics involved are beyond the scope of this specification,
-please refer to existing literature on that subject).
+is approximately 1.05946 (the twelfth root of two).
 </p>
 
 <table class="propdef" summary="name: syntax">
@@ -1842,11 +1846,11 @@
 </tr>
 <tr>
 <td><em>Value:</em></td>
-<td>strong | moderate | none | reduced | inherit</td>
+<td>normal | strong | moderate | none | reduced | inherit</td>
 </tr>
 <tr>
 <td><em>Initial:</em></td>
-<td>moderate</td>
+<td>auto</td>
 </tr>
 <tr>
 <td><em>Applies&nbsp;to:</em></td>
@@ -1872,29 +1876,55 @@
 </table>
 
 <p>Indicates the strength of emphasis to be applied. Emphasis
-is indicated using a combination of pitch change, timing changes,
-loudness and other acoustic differences) that varies from one
-language to the next.</p>
+is applied using a combination of pitch change, timing changes,
+loudness and other acoustic differences, and is dependent on the language being spoken.
+</p>
 
 <p>Values have the following meanings:</p>
 
 <dl>
-<dt><strong>none</strong>,
-<strong>moderate</strong> and
-<strong>strong</strong></dt>
 
-<dd>These are monotonically non-decreasing in strength, with
-the precise meanings dependent on language being spoken.
-The value 'none' inhibits the synthesizer from emphasizing
+<dt><strong>auto</strong></dt>
+<dd>Represents the default emphasis produced by the speech synthesizer.</dd>
+
+<dt><strong>none</strong></dt>
+<dd>Inhibits the synthesizer from emphasizing
 words it would normally emphasize.</dd>
 
-<dt><strong>reduced</strong></dt>
+<dt><strong>moderate</strong> and
+<strong>strong</strong></dt>
+<dd>These are monotonically non-decreasing in strength.
+</dd>
 
-<dd>Effectively the opposite of emphasizing a word. For example,
-when the phrase "going to" is reduced it may be spoken as
-"gonna".</dd>
+<dt><strong>reduced</strong></dt>
+<dd>Effectively the opposite of emphasizing a word.</dd>
 </dl>
 
+<div class="example">
+<p>Example:</p>
+<pre>
+span.default-emphasis { voice-stress: auto; }
+span.lowered-emphasis { voice-stress: reduced; }
+span.removed-emphasis { voice-stress: none; }
+span.normal-emphasis { voice-stress: moderate; }
+span.huge-emphasis { voice-stress: strong; }
+
+...
+
+&lt;p&gt;This is a big car.&lt;/p&gt;
+&lt;!-- The speech output from the line above is identical to the line below: --&gt;
+&lt;p&gt;This is a &lt;span class="default-emphasis"&gt;big&lt;/span&gt; car.&lt;/p&gt;
+
+&lt;p&gt;This car is &lt;span class="lowered-emphasis"&gt;massive&lt;/span&gt;!&lt;/p&gt;
+&lt;!-- The "span" below is totally de-emphasized, whereas the emphasis in the line above is only reduced: --&gt;
+&lt;p&gt;This car is &lt;span class="removed-emphasis"&gt;massive&lt;/span&gt;!&lt;/p&gt;
+
+&lt;!-- The lines below demonstrate increasing levels of emphasis: --&gt;
+&lt;p&gt;This is a &lt;span class="normal-emphasis"&gt;big&lt;/span&gt; car!&lt;/p&gt;
+&lt;p&gt;This is a &lt;span class="huge-emphasis"&gt;big&lt;/span&gt; car!!!&lt;/p&gt;
+</pre>
+</div>
+
 <h2 id="duration-props"><span class="secno">10. </span>Duration property:
 'voice-duration'</h2>
 
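The semitone note changed above now cites the twelfth root of two. A minimal Python sketch of that arithmetic follows, assuming twelve equal semitones per octave; the helper name and the 220 Hz sample frequency are hypothetical and chosen only for illustration.

```python
# Ratio between two frequencies exactly one semitone apart,
# assuming twelve equal semitones per octave.
SEMITONE_RATIO = 2 ** (1 / 12)


def shift_by_semitones(frequency_hz, semitones):
    """Return frequency_hz moved up (or down, if negative) by a number of semitones."""
    return frequency_hz * 2 ** (semitones / 12)


if __name__ == "__main__":
    print(round(SEMITONE_RATIO, 5))        # 1.05946
    print(shift_by_semitones(220.0, 12))   # twelve semitones double the frequency
```

Because the step is a ratio rather than a fixed offset, the same semitone shift corresponds to a larger change in Hertz for a higher starting frequency, which is the point the note makes.
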
Received on Thursday, 28 April 2011 17:09:15 GMT
