- From: Daniel Weck <daniel.weck@gmail.com>
- Date: Wed, 11 May 2011 11:21:53 +0100
- To: fantasai <fantasai.lists@inkedblade.net>
- Cc: "www-style@w3.org" <www-style@w3.org>
All good points, thanks! I added the 'preserve' keyword value, a dedicated section to describe the language-dependent voice selection mechanism, and a short example:

http://dev.w3.org/csswg/css3-speech/#voice-props-voice-family

On 2 May 2011, at 19:53, fantasai wrote:

> The SSML spec gives an algorithm for selecting voice families:
> http://www.w3.org/TR/speech-synthesis/#edef_voice
>
> This algorithm is roughly approximated in the CSS3 Speech spec for
> 'voice-family':
> http://dev.w3.org/csswg/css3-speech/#voice-family
>
> # The ‘voice-family’ property is used to guide the selection of the voice to be
> # used for speech synthesis. The overriding priority is to match the language
> # specified by the xml:lang attribute as per the XML 1.0 specification [XML10],
> # and as inherited by nested elements until overridden by a further xml:lang
> # attribute.
> #
> # If there is no voice available for the requested value of xml:lang, the
> # processor should select a voice that is closest to the requested language
> # (e.g. a variant or dialect of the same language). If there are multiple
> # such voices available, the processor should use a voice that best matches
> # the values provided with the ‘voice-family’ property. It is an error if
> # there are no such matches.
>
> Firstly, the prose here needs some tightening up. Copying the list structure
> from SSML is probably a good idea.
>
> Second, CSS doesn't use xml:lang directly, since CSS (unlike SSML) is not an
> XML language. Looking up "the language of the element" is an abstract
> operation; the closest thing we have to a definition is in Selectors Level 3:
> http://www.w3.org/TR/css3-selectors/#lang-pseudo
>
> Third, the SSML algorithm is somewhat imprecise about what "best matches"
> means. We either need a definition here, or we need a note that this is
> undefined.
>
> Lastly, we need to figure out, for CSS, when the voice family is recalculated.
> In SSML, it's recalculated on every element, which means that if an element
> has a different language value than its parent, the voice family changes. The
> SSML spec notes that this is not always desirable (e.g. a French phrase
> embedded in an English sentence) and in such cases suggests that the xml:lang
> attribute not indicate the language of the foreign phrase, thus avoiding the
> recalculation.
>
> This isn't particularly practical in CSS. We don't actually want to discourage
> people from marking up their documents correctly, even if many don't bother,
> and messing with the markup to change the speech rendering interferes with the
> separation of content and style.
>
> Probably the simplest solution would be to add a 'match-parent' keyword to
> 'voice-family'. This would add the 'match-parent' keyword to the inherited
> value for the computed value, and would prevent the voice selection from
> being recalculated.
>
> We could also consider something similar to the CSS3 Font's
> 'font-language-override' property, e.g.
>
>   voice-language: auto | <language-code> | inherit;
>   inherited: yes
>   computed value: as specified
>
>   auto -
>     The used value is taken from the language of the element, or some
>     UA-chosen value if unknown. (The computed value is the keyword
>     'auto'.)
>
> I'm somewhat less in favor of this option, as
>   a) 'match-parent' seems easier to use (imho)
>   b) 'match-parent' is just a keyword instead of an additional property
>   c) you can do more intelligent things with 'match-parent' if you have the
>      ability. E.g., use French phonics to map the embedded phrase to the
>      closest English phonemes, so "à propos" could be rendered as
>      "ah pro-POE" instead of "a PROP-uss".
> But it's something to consider.
>
> ~fantasai

Daniel Weck
daniel.weck@gmail.com
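For illustration only, here is a rough Python sketch of the selection order discussed in the quoted text: match the element's language first, fall back to a variant of the same language, then use the 'voice-family' list to choose among the remaining candidates. Everything named here (the Voice type, select_voice, the voice names) is hypothetical and not taken from either spec. Since "best matches" is not defined, the sketch assumes it means "first family name, in author order, that is available". The 'preserve' flag is modelled simply as "keep the parent's voice without recalculating", which is an assumption about the intended behaviour.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Voice:
    name: str  # hypothetical voice name, e.g. "claire"
    lang: str  # language tag of the voice, e.g. "fr-CA"


def primary(tag: str) -> str:
    """Return the lowercased primary language subtag ("fr-CA" -> "fr")."""
    return tag.split("-")[0].lower()


def select_voice(
    voices: Sequence[Voice],
    lang: str,
    families: Sequence[str],
    parent: Optional[Voice] = None,
    preserve: bool = False,
) -> Optional[Voice]:
    """Pick a voice for an element whose content language is `lang`."""
    # Assumption: 'preserve' (or 'match-parent') skips recalculation
    # and reuses whatever voice the parent element resolved to.
    if preserve and parent is not None:
        return parent

    # 1. Overriding priority: voices whose language matches exactly.
    candidates = [v for v in voices if v.lang.lower() == lang.lower()]

    # 2. Otherwise, the "closest" language: same primary subtag
    #    (a variant or dialect of the requested language).
    if not candidates:
        candidates = [v for v in voices if primary(v.lang) == primary(lang)]

    # 3. No match at all: the quoted text treats this as an error;
    #    the sketch just reports it as None.
    if not candidates:
        return None

    # 4. Assumption: "best matches" = first listed family that is available.
    for family in families:
        for voice in candidates:
            if voice.name == family:
                return voice
    return candidates[0]
```

With installed voices "paul" (en-US), "claire" (fr-FR) and "luc" (fr-CA), a fr-BE phrase with the family list ["claire"] would resolve to "claire" via the dialect fallback, while the same phrase with preserve set inside English content would keep the parent's "paul" voice.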
Received on Wednesday, 11 May 2011 10:22:24 UTC