Re: [css3-speech] Editorial Comments from Daniel Weck on 2011-07-06 (www-style@w3.org from July 2011)

From: Daniel Weck <daniel.weck@gmail.com>
Date: Wed, 6 Jul 2011 17:13:14 +0100
To: "www-style@w3.org style" <www-style@w3.org>, fantasai <fantasai.lists@inkedblade.net>
Message-Id: <53FC3CC2-959F-4EC4-8953-292C3E0CD080@gmail.com>
On 30 Jun 2011, at 03:35, fantasai wrote:
> 1.2. Relationship with CSS2.1
>
>  # Content creators can conditionally include [...]
>
> I don't think this paragraph quite belongs under this section.
> Maybe in the previous section, or in its own section, or after
> the 1.3 CSS Speech Example section, I'm not sure. But it's not
> really about the relationship of CSS Speech and CSS2.1. :)

Yep, shifted into previous heading. :)

> 2. The aural "box" model
>
> Shift the <dfn> from the heading to the end of the first sentence.

That's better formatting indeed.

> Also, I would move the sentence from the intro that defines the
> aural canvas to this section (and throw in another <dfn> for that).
> This way the entire introduction is informative, and this definition,
> which is important for understanding the aural rendering model, is
> together with the rest of the definition of the model.

Yes, the prose reads better now.

> And then rename this section to "The aural formatting model", to
> parallel the "visual formatting model" in CSS2.1.

Done.

> 3.1. The ‘voice-volume’ property
>
> # <decibel>
> #
> #    An integer or floating point number immediately followed by
> #    "dB" (decibel unit).
>
> s/An integer or floating point number/A <number>/

I corrected this already, as a result of your previous email :)

> (We should get this moved into the CSS3 Values module, but since it's
> not there yet, it's fine to leave here.)

Right.
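
For illustration only, the "<number> immediately followed by dB" wording could be checked with a small Python sketch like the one below. The helper name parse_decibel is made up here, the <number> pattern is assumed to follow the CSS2.1 grammar (optional sign, digits, optional fraction), and matching the unit case-insensitively is also an assumption.

```python
import re

# Hypothetical helper, not spec text: a CSS2.1-style <number> immediately
# followed by "dB". Assumption: the unit is matched case-insensitively.
_DECIBEL = re.compile(r"([+-]?(?:\d*\.\d+|\d+))dB", re.IGNORECASE)


def parse_decibel(token: str) -> float:
    """Return the numeric part of a <decibel> token, or raise ValueError."""
    match = _DECIBEL.fullmatch(token)
    if match is None:
        raise ValueError(f"not a <decibel> value: {token!r}")
    return float(match.group(1))
```

With this sketch "+6dB" and "-4.5dB" are accepted, while "6 dB" is rejected because white space separates the number from the unit.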

> 4.1. The ‘speak’ property
>
> # Note that ‘display’ is the only property defined externally to this
> # CSS3 module that affects behavior within the aural "box" model.
>
> This isn't actually true anymore, as list-style-type also has an
> effect. I'd remove this sentence

Well-spotted.

> and instead add a reference to [[!CSS21]], which defines the  
> 'display' property, to the normative definition.

I already have a reference for it, but it somehow didn't get
generated. Fixed.

> 4.2. The ‘speak-as’ property
>
>  # Note that the functionality provided by this property is related
>  # to the say-as element from the SSML markup language [SSML]. Also
>  # note that possible values are described in a W3C Note ([SSML-SAYAS])
>  # separate from the SSML specification, whereas the CSS Speech module
>  # explicitly defines a list of possible values.
>
> I think we can collapse this note to just
>
>  | Note that the functionality provided by this property is related
>  | to the say-as element from the SSML markup language [SSML], whose
>  | values are described in [SSML-SAYAS].

Nice and sweet. Fixed.

>  # Uses language-dependent pronunciation rules for rendering an
>  # element and its children.
>
> It doesn't actually control the children, since they're controlled by
> their own 'speak-as' property, so this should be
>
>  | Uses language-dependent pronunciation rules for rendering the
>  | element's content.

That's correct :)

>  # literal-punctuation
>  #    Similar to ‘normal’ value, but punctuation such as semicolons,
>  #    braces, and so on are to be spoken literally.
>  # no-punctuation
>  #    Similar to ‘normal’ value but punctuation is not to be spoken
>  #    nor rendered as various pauses.
>
> Since these values can be combined with 'spell-out' and 'digits', which
> would not be the same as 'normal', I suggest recasting the definitions
> as
>
>  | literal-punctuation
>  |    Punctuation such as semicolons, braces, and so on is named aloud
>  |    rather than rendered naturally as appropriate pauses.
>  | no-punctuation
>  |    Punctuation is not rendered: neither spoken nor rendered as pauses.

Argh, I missed that! Thanks a lot.

> 5.1. The ‘pause-before’ and ‘pause-after’ properties
>
>  # <time>
>  #    Expresses the pause in absolute time units (seconds and milliseconds,
>  #    e.g. "+3s", "250ms") as per the syntax of time values defined in
>  #    [CSS3VAL].
>
> Drop the "as per... [CSS3VAL]" portion of the sentence. Instead copy
>  http://dev.w3.org/csswg/css-module/#values
> into the spec, replacing
>  CSS Level 2 Revision 1 [CSS21]
> with
>  CSS Value and Units Level 3 [CSS3VAL]
> if needed. That'll define all the value definitions across the spec
> in one place.


Good idea. I added the section from the module template, and I  
corrected the "as per the syntax blabla" prose in the whole document.


> 5.3. Collapsing pauses
>
>  # For example, "strong" is selected
>
> Examples are class="example". :) Since it's just one sentence and marked
> with the phrase "For example", you can also just leave it inline in the
> spec per
>  http://dev.w3.org/csswg/css-module/#conventions

I was already using this convention, but missed this line. Thanks a  
bunch!

> 6.1. The ‘rest-before’ and ‘rest-after’ properties
>
> Same comment wrt <time> and [CSS3VAL] as for 'pause-before' and
> 'pause-after'.

Yep, fixed everywhere now.

>  # This value can be used to inhibit a prosodic break which the processor
>  # would otherwise produce.
>
> I think this sentence should be dropped. "none" should mean that there
> is no rest, not that, e.g. the comma in
>
>  This, <span>phrase</span>
>
> is ignored.

Correct, it doesn't apply outside of the element (I removed the extra  
sentence to avoid confusion).

> 8.1. The ‘voice-family’ property
>
>  # Note that as a result, most punctuation characters, or digits at the
>  # start of each token, must be escaped in unquoted voice names. For
>  # example, the following declarations are invalid: [...]
>
> I suggest moving the invalid example somewhere further down, since it is
> useful but breaks the flow of trying to understand the property.
>
> Also, use
>  <div class="example">
>    <p>...
>    <pre></pre>
>  </div>
> instead of
>  <p class="note">
>  <div class="example"><pre>...</pre></div>


Okay.

>  # Note that to avoid mistakes in escaping, it is recommended to quote
>  # voice names that contain white space, digits, or punctuation characters
>  # other than hyphens. For example: [...]
>
> Again, I'd use
>
>  <div class="example">
>    <p>...
>    <pre>...</pre>
>  </div>
>
>  # voice-family: "john doe", "Henry the-8th";

Both examples are now below the property definition.

> Given both of those are valid if unquoted, how about:
>
>  voice-family: "Edward O'Connor", "Henry the 8th";
>
> which are not? :)

Because the whole point of the example is to demonstrate valid voice  
names for which quoting is not strictly necessary, but useful for  
reading clarity. Conversely, the previous example lists a number of  
annotated invalid declarations.
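
To make the distinction in this exchange concrete, here is a rough Python sketch that checks whether a voice name would be valid without quotes, i.e. whether each white-space-separated token is a CSS identifier. The function name is_valid_unquoted is hypothetical, the identifier pattern is a simplification of the CSS2.1 grammar, and backslash escapes are deliberately not handled.

```python
import re

# Simplified CSS 2.1 identifier: optional leading hyphen, then a letter,
# underscore or non-ASCII character, then letters, digits, underscores,
# hyphens or non-ASCII characters.
# Assumption: backslash escapes are not handled in this sketch.
_IDENT = re.compile(
    r"-?[A-Za-z_\u0080-\U0010ffff][A-Za-z0-9_\-\u0080-\U0010ffff]*"
)


def is_valid_unquoted(name: str) -> bool:
    """True if every white-space-separated token of `name` is an identifier."""
    tokens = name.split()
    return bool(tokens) and all(_IDENT.fullmatch(t) for t in tokens)
```

Under these assumptions "john doe" and "Henry the-8th" pass (quoting is optional), whereas "Henry the 8th" fails on the token starting with a digit and "Edward O'Connor" fails on the apostrophe.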

> I'll send a separate email on other voice-family issues...

I will answer separately then ;)

> 8.1.1. Voice selection, content language
>
> I'd rename the anchor "voice-selection", which avoids so many
> abbreviations...

Right.

> Item #4 in the list doesn't really belong in the list and should be
> a paragraph after it.

Fair point. Corrected.

> 8.2. The ‘voice-rate’ property
>
>  # Note that a leading "+" sign does not denote an increment, for
>  # example +50% is equivalent to 50%
>
> I don't think we need this note. This is standard behavior in CSS. :)

I have seen authors making this kind of mistake / false assumption
many times, so I feel compelled to leave a note. However the
application of percentages is not ambiguous in this specification,
unlike frequencies (which can be absolute or relative)... so I agree to
remove the note(s). :)

>  # Note that typical values are (in words per minute) x-slow = 80,
>  # slow = 120, medium = between 180 and 200, fast = 500.
>
> I assume this is for English? Might want to mention that. I imagine
> the value would be different for, e.g. Chinese vs. Hawaiian.

Right! I also changed the prose to an inline "For example".

> 8.3. The ‘voice-pitch’ property
>
>  # <frequency>
>  # Specifies the average pitch of the speaking voice using an absolute
>  # value in frequency units (Hertz and kiloHertz, e.g. "100Hz",
>  # "+2kHz") as per the syntax of frequency values defined in [CSS3VAL].
>
> Same comment as for 'pause-before' wrt "as per ... [CSS3VAL]".

Yep.

>  # Note that a leading "+" sign does not denote an increment.
>
> If you really need this note about the plus sign, move it into the
> comment about the pitch attribute in SSML, since in CSS this is
> standard behavior, and the confusion only arises if you're expecting
> SSML syntax.

I actually removed the note(s), as in fact my main concern was about
absolute versus relative frequencies. Percentages are pretty much
non-ambiguous here.

>  # For example, +50% is equivalent to 50%, so the computed value
>  # equals the inherited value times 0.5 (i.e. divided by 2), which
>  # is half the inherited average pitch of the voice.
>
> Now that we've covered the leading plus elsewhere, just convert this
> into a sentence tacked onto the definition:
>  | ... Computed values are calculated relative to the inherited
>  | value. For example, 50% equals the inherited value times 0.5,
>  | which is half the inherited average pitch of the voice.

Of course :)
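
As a worked illustration of the percentage rule quoted above, a minimal Python sketch follows. The function names are invented for this example, and the parsing leans on Python's float(), which is more lenient than a real CSS parser; it only serves to show that a leading "+" changes nothing.

```python
def parse_percentage(token: str) -> float:
    """Turn "50%" or "+50%" into the multiplier 0.5."""
    if not token.endswith("%"):
        raise ValueError(f"not a percentage: {token!r}")
    # float() accepts an optional leading sign, so "+50" and "50" agree.
    return float(token[:-1]) / 100.0


def computed_pitch(inherited_hz: float, token: str) -> float:
    """Computed value is the inherited pitch scaled by the percentage."""
    return inherited_hz * parse_percentage(token)
```

For an inherited average pitch of 200Hz, both "50%" and "+50%" give 100Hz, which is half the inherited value rather than an increment of it.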

>  # <relative-value>
>  #   Specifies a relative change (decrement or increment) to the
>  #   inherited value. The syntax of allowed values is a <number>,
>  #   followed immediately by either of "Hz" (for Hertz) or "kHz"
>  #   (for kiloHertz) or "st" (for semitones).
>  # relative
>  #   This keyword specifies that the provided value is expressed
>  #   relatively to another base value. This is in order to
>  #   disambiguate from absolute <frequency> values.
>
> I would drop the Hz definition from <relative-value> and only use
> semitones, and have the definition of the 'relative' keyword carry
> the relativeness:
>
>  voice-pitch: <frequency> && relative? | <relative-value> | <percentage>
>
>  # relative
>  #   This keyword specifies that the provided <frequency> is expressed
>  #   as a relative change from the inherited value.

Right, I already applied this change following your previous email.

> 8.4. The ‘voice-pitch-range’ property
>
> Same comment as above for 'voice-pitch'.

Done.

>  # Note that a semitone is half of a tone (a half step) on the standard
>  # diatonic scale. A semitone doesn't correspond to a fixed value in
>  # Hertz: instead, the ratio between two consecutive frequencies separated
>  # by exactly one semitone is approximately 1.05946 (the twelfth root of
>  # two).
>
> This shouldn't be a note. It should be a definition somewhere. Maybe
> your spec should have a Units section where it can define decibels
> and semitones and anything else it needs that's not in 2.1.

That's right. Definitely not a note.

> Also, unless "the twelfth root of two" is an approximation, change
>  approximately 1.05946 (the twelfth root of two)
> to
>  the twelfth root of two (approximately 1.05946)

Yep :)
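
The arithmetic behind the semitone definition can be sketched in a few lines of Python. This is only an illustration, assuming equal temperament as implied by "the twelfth root of two"; the names SEMITONE_RATIO and shift_by_semitones are made up for the example.

```python
# Ratio between two frequencies exactly one semitone apart.
SEMITONE_RATIO = 2 ** (1 / 12)  # the twelfth root of two


def shift_by_semitones(frequency_hz: float, semitones: float) -> float:
    """Shift a frequency up (positive) or down (negative) by semitones."""
    return frequency_hz * 2 ** (semitones / 12)
```

Twelve semitones double the frequency (one octave), so the same "+1st" shift corresponds to a different number of Hertz depending on the starting pitch.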

> 9.1. The ‘voice-duration’ property
>
> Same comment as 'pause' wrt "as per ... [CSS3VAL]".

Done.

> 10. List items and counters styles
>
>  # the ‘list-style-type’ is used (if present).
>
> Drop "(if present)". A value for 'list-style-type' is always present.
>  http://www.w3.org/TR/CSS21/cascade.html#value-stages

Indeed!

>  # Note that the working draft of the CSS Lists module [CSS3LIST]
>  # contains new features which are not yet supported in this version
>  # of the CSS Speech module. Support for these features will be added
>  # later, when the CSS Lists draft stabilizes.
>
> This is a very time-based note. Just say that the speech rendering of
> new features from the CSS Lists and Counters Module Level 3 is not
> covered in this level of CSS Speech, but may be defined in a future
> specification. (Or remove the note.)

Yeah...much cleaner than my initial blurb.

> 11. Pronunciation, phonemes
>
>  # The W3C PLS (Pronunciation Lexicon Specification) recommendation
>  # ([PRONUNCIATION-LEXICON]) is one potential format to use with the
>  # "pronunciation" rel value, which allows importing pronunciation
>  # lexicons in HTML documents using the link element (similarly to
>  # how CSS stylesheets can be included).
>
> I think this should be split into two sentences, maybe something like
> this:
>
>  | The "pronunciation" rel value allows importing pronunciation lexicons
>  | in HTML documents using the link element (similar to how CSS stylesheets
>  | can be included). The W3C PLS (Pronunciation Lexicon Specification)
>  | [PRONUNCIATION-LEXICON] is one format that can be used to describe such
>  | a lexicon.

Again, cleaner prose. Thanks!

> Also, since this section's purpose is to explain a design decision, I'd
> shift this section after 12. Inserted and replaced content so that it
> can be removed at a future date without triggering a renumbering of other
> sections.

Okay.

> 12. Inserted and replaced content
>
> This entire section should be marked non-normative, except for one thing:
> the location of ::before and ::after wrt content and 'rest' needs to be
> normative -- so put it in the section defining the aural box model.

Good point.

Many thanks again!
Dan
Received on Wednesday, 6 July 2011 16:13:54 UTC