RE: Question about using TTS via the prompt element in VXML from Brian Wyld on 2003-12-04 (www-voice@w3.org from October to December 2003)

From: Brian Wyld <brian.wyld@eloquant.com>
Date: Thu, 4 Dec 2003 10:02:14 +0100
To: "Tracy Boehrer" <tboehrer@calltower.com>, "Lutz Birkhahn" <lbirkhahn@adomo.com>, "Roopa Trivedi" <rotrived@cisco.com>, <www-voice@w3.org>
Message-ID: <OCENLGFFCDHPOEENHEPFMELAELAA.brian.wyld@eloquant.com>

Hi,

I think Lutz's is the correct point of view: if a voice is named, then use it; otherwise use the 'characteristics' of the voice to select the best match.

One implementation's point of view (:-):
 - find the TTS engine by language first (per service we allow only one engine type per locale, potentially on multiple servers) -> this makes sure that the selected voice at least has the right accent.
 - if a voice name is given, and it exists, use it.
 - else, from the list of all possible voices for the language, select by the ordered list { gender, age, variant }

Hence the selection algo uses the list in the order given in the spec.....
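
The steps above can be sketched roughly as follows in Python. This is only an illustration under stated assumptions, not the actual implementation: the Voice record, the sample catalogue and its vendor-qualified names, and the select_voice function are all hypothetical. How each attribute narrows the candidate list is also an assumption (gender by exact match with fallback, age by nearest value, variant as a 1-based index), since the message only gives the order.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Voice:
    """Hypothetical voice record; field names are illustrative only."""
    name: str    # vendor-qualified name, e.g. "VendorA-Jim"
    lang: str    # locale served by the engine, e.g. "en-US"
    gender: str  # "male", "female" or "neutral"
    age: int     # nominal speaker age in years


def select_voice(voices: Sequence[Voice], lang: str,
                 name: Optional[str] = None, gender: Optional[str] = None,
                 age: Optional[int] = None,
                 variant: Optional[int] = None) -> Optional[Voice]:
    # Step 1: language first, so the result always has the right accent.
    candidates = [v for v in voices if v.lang == lang]
    if not candidates:
        return None

    # Step 2: a named voice wins if it exists for this language.
    if name is not None:
        for v in candidates:
            if v.name == name:
                return v

    # Step 3: narrow by gender, then age, then variant, in that order.
    # Assumption: an attribute that matches nothing is ignored rather
    # than leaving the caller with no voice at all.
    if gender is not None:
        matching = [v for v in candidates if v.gender == gender]
        if matching:
            candidates = matching

    # Assumption: "best match" for age means the nearest nominal age.
    if age is not None:
        best = min(abs(v.age - age) for v in candidates)
        candidates = [v for v in candidates if abs(v.age - age) == best]

    # Assumption: variant is a 1-based index into what is left, clamped
    # to the number of remaining candidates.
    index = 0
    if variant is not None:
        index = min(max(variant, 1), len(candidates)) - 1
    return candidates[index]


# Hypothetical catalogue used only to exercise the sketch.
VOICES = [
    Voice("VendorA-Jim", "en-US", "male", 40),
    Voice("VendorA-Rebecca", "en-US", "female", 30),
    Voice("VendorA-Anna", "en-US", "female", 8),
    Voice("VendorA-Claire", "fr-FR", "female", 30),
]

chosen = select_voice(VOICES, "en-US", name="Nobody", gender="female", age=10)
print(chosen.name if chosen else "no voice for this language")
```

In this sketch an unknown name is not an error: the request simply falls through to the attribute-based match, which mirrors the "if voice name given, and exists, use it, else ..." wording above.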
 
However it is not possible at the SSML level to select a specific vendor's engine.....

Brian

[Brian Wyld] [brian.wyld@eloquant.com]
[Directeur General R&D]
[Eloquant SA] [+33 476 77 46 92] [www.eloquant.com]
[advanced solutions for telecoms and IT services] 

> -----Original Message-----
> From: www-voice-request@w3.org [mailto:www-voice-request@w3.org] On Behalf Of Tracy Boehrer
> Sent: Wednesday, December 03, 2003 23:51
> To: Lutz Birkhahn; Roopa Trivedi; www-voice@w3.org
> Subject: RE: Question about using TTS via the prompt element in VXML
> 
> 
> The SSML spec pretty much says that (section 2.2.1).  It even 
> states what attributes to use.  But I don't think it's clear 
> enough about the algorithm. In fact, it says "the voice selection 
> algorithm may be processor specific."
>  
> I would think it needs to be ordered.  For example: language, 
> gender, age, variant, name.  Or something like that.  Or, is the 
> order the spec lists it the right way: language, name, variant, 
> gender and age?
> 
> I don't know...
> 
> 
>  -----Original Message----- 
>  From: Lutz Birkhahn [mailto:lbirkhahn@adomo.com] 
>  Sent: Wed 12/3/2003 2:14 PM 
>  To: Roopa Trivedi; Tracy Boehrer; www-voice@w3.org 
>  Cc: 
>  Subject: Re: Question about using TTS via the prompt element in VXML
>  
>  
> 
>  Roopa Trivedi wrote:
>  > What about the scenario where 2 vendors support the same gender, age,
>  > language etc and the script writer specifically wants to use vendor #1?
>  > Is the "name" of the voice recommended to be used as the distinguishing
>  > factor? Can we assume that this "name" will be unique across all
>  > vendors?
>  
>  I think voices in VXML are what fonts are in HTML (or TeX, or
>  PostScript)... Maybe that's a good analogy when discussing this question.
>  From that point of view, "name" of a voice is a possible way to select
>  a voice, but that name should include the vendor ("foundry") to make
>  it unique ("Scansoft-Jim", "Viavoice-Rebecca"). One difference is that
>  font names are usually protected, so when I ask for a Times font, I can
>  expect to get something which looks at least somewhat similar to any
>  other Times font. I don't think the same holds for a voice called "Jim".
>  ScanSoft's Jim might sound very different from IBM's Jim (although
>  both are probably male American voices).
>  
>  In general it would be nice to select a voice depending on any mixture
>  of attributes, e.g. by specifying *some* of the attributes gender,
>  age(?), language, country, name, vendor, etc., much the same as e.g.
>  in X Window System I can specify a font similar to "*-Times-bold-*"
>  when I don't care where the Times font comes from, or I can specify
>  "Adobe-Times-bold-*" when I insist on having the Adobe voice err font.
>  
>  Just my $.02,
>  
>  /lutz
>  
>  --
>  Lutz Birkhahn                                       System Software Engineer
>  Adomo Inc. --  10001 N. De Anza Boulevard Suite 220  --  Cupertino, CA 95014
>  
> 
>

Received on Thursday, 4 December 2003 04:03:15 UTC