- From: Daniel Weck <daniel.weck@gmail.com>
- Date: Wed, 11 May 2011 17:48:52 +0100
- To: fantasai <fantasai.lists@inkedblade.net>
- Cc: "www-style@w3.org" <www-style@w3.org>
Hi Fantasai,

I forgot that SSML 1.1 changed the language handling algorithm ("voice" now has a "languages" attribute) [1]. I still believe that the prose in the current CSS Speech editor's draft is suitable [2], because I think that the language should *only* be specified in the content layer (SSML encapsulates both text content and aural presentation).

Thoughts ?

[1] http://www.w3.org/TR/speech-synthesis11/#edef_voice
[2] http://dev.w3.org/csswg/css3-speech/#voice-props-voice-family

On Wed, May 11, 2011 at 11:21 AM, Daniel Weck <daniel.weck@gmail.com> wrote:
> All good points, thanks!
>
> I added the 'preserve' keyword value, a dedicated section to describe the
> language-dependent voice selection mechanism, and a short example:
>
> http://dev.w3.org/csswg/css3-speech/#voice-props-voice-family
>
> On 2 May 2011, at 19:53, fantasai wrote:
>
>> The SSML spec gives an algorithm for selecting voice families:
>> http://www.w3.org/TR/speech-synthesis/#edef_voice
>>
>> This algorithm is roughly approximated in the CSS3 Speech spec for
>> 'voice-family':
>> http://dev.w3.org/csswg/css3-speech/#voice-family
>>
>> # The ‘voice-family’ property is used to guide the selection of the
>> # voice to be used for speech synthesis. The overriding priority is to
>> # match the language specified by the xml:lang attribute as per the
>> # XML 1.0 specification [XML10], and as inherited by nested elements
>> # until overridden by a further xml:lang attribute.
>> #
>> # If there is no voice available for the requested value of xml:lang,
>> # the processor should select a voice that is closest to the requested
>> # language (e.g. a variant or dialect of the same language). If there
>> # are multiple such voices available, the processor should use a voice
>> # that best matches the values provided with the ‘voice-volume’
>> # property. It is an error if there are no such matches.
>>
>> Firstly, the prose here needs some tightening up. Copying the list
>> structure from SSML is probably a good idea.
>>
>> Second, CSS doesn't use xml:lang directly, since CSS (unlike SSML) is
>> not an XML language. Looking up "the language of the element" is an
>> abstract operation; the closest thing we have to a definition is in
>> Selectors Level 3:
>> http://www.w3.org/TR/css3-selectors/#lang-pseudo
>>
>> Third, the SSML algorithm is somewhat imprecise about what "best
>> matches" means. We either need a definition here, or we need a note
>> that this is undefined.
>>
>> Lastly, we need to figure out, for CSS, when the voice family is
>> recalculated. In SSML, it's recalculated on every element, which means
>> that if an element has a different language value than its parent, the
>> voice family changes. The SSML spec notes that this is not always
>> desirable (e.g. a French phrase embedded in an English sentence) and in
>> such cases suggests that the xml:lang attribute not indicate the
>> language of the foreign phrase, thus avoiding the recalculation.
>>
>> This isn't particularly practical in CSS. We don't actually want to
>> discourage people from marking up their documents correctly, even if
>> many don't bother, and messing with the markup to change the speech
>> rendering interferes with the separation of content and style.
>>
>> Probably the simplest solution would be to add a 'match-parent' keyword
>> to 'voice-family'. This would add the 'match-parent' keyword to the
>> inherited value for the computed value, and would prevent the voice
>> selection from being recalculated.
>>
>> We could also consider something similar to the CSS3 Font's
>> 'font-language-override' property, e.g.
>>
>>   voice-language: auto | <language-code> | inherit;
>>   inherited: yes
>>   computed value: as specified
>>
>>   auto -
>>     The used value is taken from the language of the element, or some
>>     UA-chosen value if unknown. (The computed value is the keyword
>>     'auto'.)
>>
>> I'm somewhat less in favor of this option, as
>>   a) 'match-parent' seems easier to use (imho)
>>   b) 'match-parent' is just a keyword instead of an additional property
>>   c) you can do more intelligent things with 'match-parent' if you have
>>      the ability. E.g., use French phonics to map the embedded phrase to
>>      the closest English phonemes, so "à propos" could be rendered as
>>      "ah pro-POE" instead of "a PROP-uss".
>> But it's something to consider.
>>
>> ~fantasai
>>
>
> Daniel Weck
> daniel.weck@gmail.com
>
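The language-dependent voice selection discussed in this thread can be illustrated with a rough Python sketch. It is not the algorithm from SSML or from the CSS Speech draft. It assumes that "closest to the requested language" means "shares the same primary language subtag", and it models the effect of a keyword such as the proposed 'match-parent' with a hypothetical `keep_parent_voice` flag. All function names, voice names, and the `VOICES` list are invented for the example.

```python
from typing import Optional, Sequence, Tuple

Voice = Tuple[str, str]  # (voice name, language tag); hypothetical shape


def primary_subtag(lang: str) -> str:
    """Return the lowercase primary subtag of a language tag ("fr-CA" -> "fr")."""
    return lang.split("-", 1)[0].lower()


def select_voice(lang: str, voices: Sequence[Voice]) -> str:
    """Pick a voice name for `lang`.

    An exact (case-insensitive) tag match wins. Otherwise the first voice
    sharing the primary subtag is used; this is an assumed reading of
    "closest language". Having no match at all is treated as an error.
    """
    wanted = lang.lower()
    for name, voice_lang in voices:
        if voice_lang.lower() == wanted:
            return name
    for name, voice_lang in voices:
        if primary_subtag(voice_lang) == primary_subtag(wanted):
            return name
    raise LookupError(f"no voice available for language {lang!r}")


def voice_for_element(
    lang: str,
    parent_voice: Optional[str],
    voices: Sequence[Voice],
    keep_parent_voice: bool = False,
) -> str:
    """Resolve the voice for an element whose content language is `lang`.

    With keep_parent_voice set (the assumed effect of a 'match-parent'-style
    keyword), the parent's voice is reused and selection is not recalculated.
    """
    if keep_parent_voice and parent_voice is not None:
        return parent_voice
    return select_voice(lang, voices)


VOICES = [("alice", "en-US"), ("claire", "fr-FR")]

# English sentence containing a French phrase.
outer = voice_for_element("en-GB", None, VOICES)
inner = voice_for_element("fr", outer, VOICES)
kept = voice_for_element("fr", outer, VOICES, keep_parent_voice=True)
print(outer, inner, kept)
```

Without the flag, the embedded French phrase switches voices mid-sentence. With it, the English voice is kept, which is the behaviour the 'match-parent' proposal is meant to allow without altering the language markup.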
Received on Wednesday, 11 May 2011 16:49:20 UTC