Re #2: [css3-speech] Editorial Comments

On 28 Apr 2011, at 08:49, fantasai wrote:
> 4. voice-balance
>
> +100 does not need to be called out separately from 100. This
> is all handled at the syntactic level; you don't need to address
> it here. (If you want to discuss 100 vs +100, then you might as
> well also discuss 100 vs 0100, the comparison of which operates
> at the same level.)

I agree. I reviewed the uses and fixed the non-uses of <time>,
<frequency>, <number> and <non-negative number>, but I failed to spot
this one. Thanks for pointing it out!

> # Many speech synthesizers only support a single channel. The
> # ‘voice-balance’ property can then be treated as part of a
> # post synthesis mixing step. This is where speech is mixed
> # with other audio sources.
>
> I think this point could use some clarification, or maybe an
> example.

What about this?

"
Note that many speech synthesizers only generate mono sound, and
therefore do not intrinsically support the 'voice-balance' property.
The distribution of audio signals between the left and right channels
would consequently occur at a post-synthesis stage, for example when a
speech-enabled user agent mixes the various audio sources that may be
authored within the document.
"

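To make the post-synthesis mixing step concrete, here is a minimal Python
sketch, not something the draft mandates. It assumes that 'voice-balance'
resolves to a number from -100 (left only) to 100 (right only) and that the
mixer uses a simple linear gain law. The function name apply_balance and the
gain law are hypothetical choices made for illustration only.

```python
def apply_balance(mono_samples, balance):
    """Spread mono samples over (left, right) stereo pairs.

    balance: -100 = left channel only, 0 = centered, 100 = right only.
    The linear gain law below is an assumption of this sketch.
    """
    # Clamp out-of-range values to the nearest bound.
    b = max(-100.0, min(100.0, float(balance)))
    right_gain = (b + 100.0) / 200.0
    left_gain = 1.0 - right_gain
    return [(s * left_gain, s * right_gain) for s in mono_samples]
```

With a balance of 0, each channel receives half the amplitude in this sketch.
A real mixer might prefer a constant-power pan law instead.
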
> 6. Pause
>
> # The synthesis processor may insert a rest as part of its
> # implementation of the prosodic break.
>
> This sentence seems weird and potentially confusing. The sentence
> before it is poorly worded as well. I suggest replacing with
>
> | Expresses the pause by the strength of the prosodic break in speech
> | output. The exact time is implementation-dependent.

That's better indeed.

> Probably 'none' should be called out in a separate definition and  
> defined as equal to 0ms.

Good suggestion!

> # and can be used to inhibit a prosodic break which the processor
> # would otherwise produce
>
> I suggest removing this phrase since it implies that prosodic
> breaks introduced by punctuation here might also be removed.
> I don't think that's the intention.

It _is_ the intent:

http://www.w3.org/TR/speech-synthesis/#S3.2.3

> What might be useful is some discussion of the UA style sheet
> and how the author can override, e.g. the breaks between
> paragraphs by specifying
>  p { pause: none; }

OK, I'll take a stab at it.
=> TODO

> # "x-weak" and "x-strong" are mnemonics for "extra weak" and
> # "extra strong", respectively.
>
> If this note needs to be kept, it should be in a class="note".
> (I don't think it's really necessary to mention, though.)

Sure. Fixed.

> # The stronger boundaries are typically accompanied by pauses.
> # The breaks between paragraphs are typically much stronger than
> # the breaks between words within a sentence.
>
> This is UA stylesheet advice, and does not belong in the definition
> of the values.

Agreed. What about this note:

<p class="note">
Note that stronger content boundaries are typically accompanied by  
pauses. For example, the breaks between paragraphs are typically much  
more substantial than the breaks between words within a sentence.
</p>

I think this would help the reader understand the purpose of the  
property.

> 6.1 collapsing pauses
>
> s/Adjacent/Adjoining/ to be consistent with the collapsing  
> terminology.
>
> s/should be merged/are merged/ (this is not merely a recommendation)

Done.

> The "combination of a named break and time duration" sentence is
> placed awkwardly... Maybe merge it in like this:
>
> | Adjoining pauses are merged by selecting the strongest named break
> | and the longest absolute time interval. Thus "strong" is selected when
> | comparing "strong" and "weak", "1s" is selected when comparing "1s"
> | and "250ms", and "strong" and "250ms" take effect additively when
> | comparing "strong" and "250ms".

That's good.
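
The merging rule in the suggested wording can be sketched in Python as
follows. This is only an illustration under stated assumptions: pauses are
given as plain strings, 'none' is treated as 0ms (as suggested earlier in
this message), and the names STRENGTHS, parse_ms and collapse_pauses are
hypothetical. How a processor renders the resulting named break plus time
is left out.

```python
# Named breaks, ordered from weakest to strongest.
STRENGTHS = ["x-weak", "weak", "medium", "strong", "x-strong"]

def parse_ms(value):
    """Convert '250ms' or '1s' to milliseconds; 'none' counts as 0ms."""
    if value == "none":
        return 0.0
    if value.endswith("ms"):
        return float(value[:-2])
    if value.endswith("s"):
        return float(value[:-1]) * 1000.0
    raise ValueError("not a pause value: " + value)

def collapse_pauses(pauses):
    """Merge adjoining pauses into (strongest named break, longest time).

    Either element is None when no pause of that kind was present.
    A named break and a time do not replace each other: both are kept,
    because they take effect additively.
    """
    strongest = None
    longest = None
    for p in pauses:
        if p in STRENGTHS:
            if strongest is None or STRENGTHS.index(p) > STRENGTHS.index(strongest):
                strongest = p
        else:
            ms = parse_ms(p)
            if longest is None or ms > longest:
                longest = ms
    return strongest, longest
```

For example, collapse_pauses(["strong", "250ms"]) returns ("strong", 250.0):
both components survive, since they combine additively.
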

> s/collapse:/are adjoining:/ seems like a good idea...

It makes more sense indeed.

> Also toss in
>
> | A collapsed pause is considered adjoining to another pause if any
> | of its component pauses is adjoining to that pause.
>
> (Taken from CSS2.1 8.3.1 Collapsing margins.)

Well spotted!

> # if the "box" has a ‘voice-duration’ of "0ms" ... and no content.
>
> I think what's intended here is a voice-duration of 0ms *or* no
> content. No?

Correct. To clarify, I added "and" and "or" terms.

> Also, s/no content/no rendered content/, since it may have content
> hidden by display: none.

Or by 'speakability' / 'speak' ;)

>
> The sentences about pauses being adjoining seem redundant with the
> sentences about pauses collapsing. Probably the latter should be
> removed?

Oops, some copy/paste oversight.

> 7. Rests
>
> See comments for 6. Pauses

Yes, I fixed that at the same time as the 'pause' issues.

> s/additively/additively and do not collapse/ (just to be extra clear)

Sure.

More to come in part #3.

Received on Thursday, 28 April 2011 15:59:47 UTC