Re: [css3-speech] voice-family from fantasai on 2011-08-01 (www-style@w3.org from August 2011)

From: fantasai <fantasai.lists@inkedblade.net>
Date: Mon, 01 Aug 2011 10:20:48 -0700
To: Daniel Weck <daniel.weck@gmail.com>
CC: www style <www-style@w3.org>
Message-ID: <4E36E070.1020803@inkedblade.net>

On 08/01/2011 09:40 AM, Daniel Weck wrote:
>
> On 20 Jul 2011, at 23:00, fantasai wrote:
>
>> On 07/06/2011 12:54 PM, Daniel Weck wrote:
>>> Please have a look at the updated prose:
>>>
>>> http://dev.w3.org/csswg/css3-speech/#voice-props-voice-family
>>
>> I think my concern here is that using numerical ages gives a level of
>> precision in specifying that is nowhere near the level of precision
>> in voice matching. For example, at what numerical age does a male voice
>> break?
>>
>> I think for this level it might make sense to revert back to keywords
>> (which we can define as a specific numeric age for mapping to SSML),
>> and introduce more fine-grained control later when the voice-matching
>> algorithm is precise enough to support that.
>
> I agree that we should avoid using prose that appears to claim a level
> of precision that we are effectively unable to provide. I propose the
> following prose instead:
>
> ---
> Possible values are 'child', 'young' and 'old', indicating the preferred
> age category to match during voice selection. The mapping with SSML ages
> is defined as follows: 'child' = up to 15 y/o, 'young' = between 16 and
> 45 y/o, 'old' = 46 y/o onwards.
> NOTE: The interpretation of the relationship between a person's age and
> a recognizable type of voice cannot realistically be defined in a universal
> manner, as it effectively depends on numerous cultural and linguistic
> variations. The values provided by this specification therefore represent
> a simplified model that can be reasonably applied to a great variety of
> speech locales, albeit at the cost of a certain degree of approximation.
> Future versions of this specification may refine the level of precision
> of the voice-matching algorithm, as speech processor implementations
> become more standardized.
> ---

How about just mapping the keywords to specific numbers and letting the
voice-matching algorithm take up the slack? For example:
   'child' = 6 years old
   'young' = 24 years old
   'old'   = 75 years old
or something along those lines.
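
A rough sketch of how that could work, in Python and purely for
illustration: each keyword maps to one nominal age, and the matcher picks
whichever available voice is closest to it. The `pick_voice` name, the
(name, age) shape of the voice list, and the sample voices are all
assumptions made up for this example; none of them come from css3-speech
or SSML.

```python
# Keyword-to-age table using the example numbers suggested above.
KEYWORD_AGE = {"child": 6, "young": 24, "old": 75}


def pick_voice(keyword, voices):
    """Return the (name, age) pair whose age is nearest the keyword's age.

    `voices` is a list of (name, age) tuples. This shape is hypothetical,
    chosen only to keep the sketch short.
    """
    target = KEYWORD_AGE[keyword]
    # Nearest-age match: the voice matcher absorbs the imprecision,
    # so the keyword itself never needs an explicit age range.
    return min(voices, key=lambda voice: abs(voice[1] - target))


# Hypothetical installed voices.
voices = [("anna", 8), ("ben", 30), ("carl", 68)]
print(pick_voice("young", voices))  # ('ben', 30)
```

With a single number per keyword there are no range boundaries to argue
about; a 30-year-old voice still matches 'young' simply because nothing
closer to 24 is available.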

~fantasai

Received on Monday, 1 August 2011 17:21:22 UTC