- From: Daniel Weck <daniel.weck@gmail.com>
- Date: Tue, 24 May 2011 12:04:02 +0100
- To: www-style list <www-style@w3.org>, timeless <timeless@gmail.com>
Thank you for your review! Reply inline:

On 18 May 2011, at 11:34, timeless wrote:

> http://dev.w3.org/csswg/css3-speech/
>
>> (e.g. TTS voice, pitch, rate, volume levels, etc.)
>
> drop 'etc.' it's incompatible w/ 'e.g.' (and add 'and' before
> 'volume levels')

Fixed.

>> These style sheet properties can be used together with visual
>> properties (mixed media), or as a complete aural alternative to
>> visual presentation.
>
> perhaps 'to a/the visual presentation'?

Good suggestion.

>> This Module describes the CSS properties that apply to the "speech"
>> media type, and defines a new "box" model specifically for the
>> aural dimension.
>
> s/Module/module/

Done.

>> Note that content creators can conditionally include CSS properties
>> dedicated to user-agents with text to speech synthesis
>
> should this be in <p class=note> ? as is, for some reason you don't
> seem to have margins between <p>'s which makes it look like you just
> have a <br>

Local CSS stylesheet improved.

>> When doing so, the styles authored within the scope of such
>> conditional statements are ignored by user-agents that do not
>> support speech synthesis.
>
> s/speech synthesis/css3-speech/ (or "this Module")

Used "this module".

>> linear
>> When present, this keyword indicates that the associated value
>> represents a point on a linear volume amplitude scale, from
>> ‘0’ (silent) to ‘100’ (full volume).
>
>> x-soft
>> The value ‘x-soft’ maps to 0
>> The interpretation of the corresponding numerical values depends on
>> whether the ‘linear’ keyword is used
>
> That x-soft might map to silent seems odd.
>
> Initially I wrote:
> I understand the goal is an even distribution, but it seems that a
> value that might represent silent shouldn't be labeled as 'soft', i
> think 10, 30, 50, 70, 90 would be better, either including 'none' and
> 'loudest' for 0/100 or just leaving those values to be written out by
> hand.
>
> I think that you should probably include the explanation you included
> in <non-negative number> about designing for compatibility with SSML.
>> <non-negative number>
>> An integer or floating point positive number in the range ‘0’ to
>> ‘100’.
>
> It seems better to call this a <something-percentage>. I don't think
> defining non negative to be bounded above by 100 makes sense.
>
> Of note, you use 'non-negative' here.
>
>> When the ‘linear’ keyword not used
>
> s/not/is not/

The whole volume level issue has been reworked based on SSML 1.1 (this
is one of the breaking changes since v1.0). I had made some editorial
mistakes by combining aspects of SSML 1.0 and details from the CSS 2.1
Aural Stylesheets appendix.

> Could you please do something to the style so that two normal <p>'s
> when placed adjacent to each-other have margins? your primary audience
> might be css3-speech users, but..

Already fixed, as per your comment earlier in this email. :)

>> All 3 values are configured by the user
> s/configured/potentially configurable/
>> so this allows authors to write a single style sheet that works in
>> a variety of listening environments.
> s/so//
>> because it is independent from the user-configured volume levels.
> ? s/from/of/
> -- I'm not sure on this point, my suggestion is because to me you're
> saying that while they could be mathematically related, they aren't
> (thus "of").
> I think "not directly related to" is probably a better solution
>
>> (where ‘x-soft’ always means "silent", etc.).
>
> drop ", etc." ?

Thanks (to all 4 points above).

>> <percentage>
>> Only positive percentage values are allowed.
>
> I think you want 'non-negative' not 'positive', as '0' is allowed.

Thanks, I checked the entire document for this error.

>> so the computed value equals the inherited value times 0.5 (divided
>> by 2),
>
> s/divided/i.e. divided/

Ok.

>> (the volume corresponding to ‘0’ is nearer the value of ‘100’)
>> (the gap between ‘0’ and ‘100’ is wider).
>
> i don't think 'nearer' / 'wider' are good choices for this description

"closer to" ? This prose/specification has been completely revamped
anyway.

>> normal
>> Punctuation is not to be spoken, but instead rendered naturally as
>> various pauses.
>
> shouldn't punctuation also affect tone, volume, stress, etc.?

Proposed replacement: "For example, punctuation is not spoken as-is,
but instead rendered naturally as appropriate pauses."

>> <time>
>> Only positive values are allowed.
>
> s/positive/non-negative/ ?

Fixed everywhere.

>> none
>> Equivalent to 0ms (no prosodic break in the speech output).
>
>> The ‘cue-before’ and ‘cue-after’ properties specify auditory icons
>> (i.e. prerecorded audio clips) to be played before (or after) the
>> selected element within the audio "box" model. When a user agent is
>> not able to render the specified auditory icon, it is recommended
>> to produce an alternative cue (e.g., popping up a warning, emitting
>> a warning sound, etc.)
>
> You're missing a period at the end of this paragraph

Yes, and removed "etc." too :)

>> The URI must designate an auditory icon resource. If the URI
>> resolves to something other than an audio file, such as an image,
>> the resource is ignored and the property treated as if it had the
>> value ‘none’.
>
> must sounds like an rfc term, which is probably not proper in this
> context.

I see.

>> The loudness of prerecorded audio cues can be adjusted relatively
>> to the volume level of synthetic speech.
>
> s/relatively/relative/

Yep.

> synthetic or synthesized?
> (possibly "speech synthesis")

This looks alright:
http://www.google.co.uk/search?q=synthetic+speech

>> Only positive percentage values are allowed.
>
> non-negative?

Fixed everywhere.

>> The ‘voice-family’ property specifies a comma-separated,
>> prioritized list of values that designate speech synthesis voices.
>
> s/voices./voices/ -- otherwise you have a random stray period after
> the parenthetical:

Well spotted.
>> (analog to ‘font-family’ in visual style sheets).
>
> s/analog/analogous/

Done.

>> <name>
>> For compatibility with SSML, whitespace characters are not
>> permitted within voice names.
>
> This should probably be listed earlier in the paragraph. And it's
> probably better as "voice names must not contain whitespace
> characters".

Good suggestion.

>> <age>
>> Possible values are ‘child’, ‘young’ and ‘old’.
>
> to me, 'age' is numeric, i'd suggest you use some other thing to
> describe the textual concepts. you're also missing something for
> 'normal'.

To be honest, I am not aware of the historical motivations for using a
keyword enumeration rather than a non-negative number like in SSML:
http://www.w3.org/TR/speech-synthesis11/#edef_voice

So far I haven't seen any implementation of ‘child’, ‘young’ and ‘old’,
so I am totally in favor of aligning with SSML. Latest editor's draft
updated accordingly.

>> Possible values are positive numbers restricted to integers, and
>> excluding zero (i.e. starting from 1).
>
> This is rather convoluted. You defined Positive numbers to include 0
> reference that definition and then actively exclude zero.

Actually we refer to
http://www.w3.org/TR/css3-values/#non-negative
I fixed the erroneous prose that was pointing to "positive" numbers
when it should have referred to "non-negative" numbers.

>> (e.g. name, gender, age, etc.).
>
> drop "etc."

Yep, as per your earlier recommendation.

>> in order to cater for dialectic variants): .
>
> s/for/to/

Several dictionaries (e.g. the Collins) allow "for" after "cater" to
express the following meaning: "take into account, consider, bear in
mind, make allowance for, to supply what is needed, etc."

> s/: ./:/

Keyboard slippage :)

>> If no voice is available for the language of the selected content,
>> user-agent should raise a warning to let the user know about the
>> lack of appropriate TTS voice.
>
> While this is a should instead of a must, I'm not certain it's a
> wonderful suggestion.
> UI design via specification especially in the
> area of warnings is generally poor. I'd suggest 'may'.

Reworded.

>> The speech synthesizer voice must be re-evaluated (i.e. the
>> selection process must take place once again) whenever either of
>> the CSS voice characteristics change within the content flow.
>
> s/either/any/

Right.

> I'm concerned by 're-evaluated' + 'when*' -- This document talks about
> a single directed flow, and I'd want UAs to have the option of
> applying the selection process at "layout" instead of at "rendering".
> Otherwise you risk asking a UA to compute something while it's
> reading, creating an unexpectedly long pause between potential voice
> transitions.

I added a note to clarify this point.

>> The voice must also be re-calculated whenever the content language
>> changes, unless the ‘preserve’ keyword is used
>
> It'd be nice if a css selector based example was provided instead of a
> forced rule on the node.

I added another "span" in the example.

>> The french text below will be spoken with an english voice:
>
> s/french/French/; s/english/English speaker's/

Fixed.

>> 8.3. The ‘voice-pitch’ property
>> Value: <frequency> | <percentage> | <relative-change> | x-low |
>> low | medium | high | x-high | inherit
>> <relative-change>
>> Specifies a relative change (decrement or increment) to the
>> inherited value. The syntax of allowed values is a <number> (the
>> "+" sign is optional for positive numbers), followed by either of
>> "Hz" (for Hertz) or "kHz" (for kiloHertz) or "st" (for semitones),
>> and followed by a space character and the "relative" keyword.
>
> It seems like:
>
> | <relative-value> relative |
>
> would be much easier to understand than an extra sentence hidden at
> the end of the text.

Yes, this was already on my todo list. Actually:

| <relative-value> && relative |

Many thanks !!
Dan
Received on Tuesday, 24 May 2011 11:04:36 UTC