
Re: Voice Recognition Profiles

From: Baggia Paolo <paolo.baggia@loquendo.com>
Date: Fri, 11 Nov 2005 11:52:43 +0100
Message-ID: <ED49E2ECA930E14FAB27B52D23033ABF012116D0@PTPEVS009BA020.idc.cww.telecomitalia.it>
To: "B.K. DeLong" <bkdelong@pobox.com>
Cc: "Baggia Paolo" <paolo.baggia@loquendo.com>, <www-voice@w3.org>
Dear ..,

I'd like to give you some more information on the background
of your proposals. 

There are at least two broad classes of ASR:
- telephony ASR
- dictation ASR

The former does not require any kind of training, because it is
designed to be used by all possible speakers of a given language,
so the ASR uses a general acoustic model trained on a large
population of speakers.

Conversely, the latter is for personal use, so training is used
to improve performance for a given speaker. Even in this field,
which once required a very long training session (reading predefined
sentences), current dictation ASR systems use general acoustic
models as a baseline, so the training needed is reduced.

For telephony ASR there are approaches that adapt the acoustic
models online to improve performance for the actual speaker. This is
done during the course of the speech interaction, without the need
for an explicit training phase.
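
To make the idea of online adaptation concrete, here is a minimal
sketch in Python. It assumes one common textbook technique, MAP-style
adaptation of a single Gaussian mean, and is not a description of any
particular product. The class name AdaptiveMean and the parameter tau
are hypothetical names chosen for this illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AdaptiveMean:
    """One Gaussian mean that drifts from the general model toward the speaker.

    Illustrative assumption: a real acoustic model holds many such means;
    this sketch adapts only one.
    """
    prior_mean: List[float]   # mean taken from the speaker-independent model
    tau: float = 10.0         # prior weight: larger values adapt more slowly
    count: int = 0            # feature frames observed from this speaker
    frame_sum: List[float] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Start with an empty running sum of the same dimension as the prior.
        self.frame_sum = [0.0] * len(self.prior_mean)

    def observe(self, frame: List[float]) -> None:
        """Fold one feature frame from the live interaction into the statistics."""
        if len(frame) != len(self.prior_mean):
            raise ValueError("frame dimension does not match the model")
        self.count += 1
        self.frame_sum = [s + x for s, x in zip(self.frame_sum, frame)]

    @property
    def mean(self) -> List[float]:
        """Adapted mean: (tau * prior + sum of frames) / (tau + frame count)."""
        denom = self.tau + self.count
        return [(self.tau * p + s) / denom
                for p, s in zip(self.prior_mean, self.frame_sum)]
```

With no frames observed, the adapted mean equals the general model's
mean. As more speech arrives during the call, the speaker's own data
gradually outweighs the prior, with no separate enrollment step.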

A second aspect is that it is very premature to speak of a
Voice Recognition Profile today. The technologies all differ,
so it is almost impossible to have a standard profile, but your
idea is in principle good.

This is my personal opinion,
Paolo Baggia, Loquendo.

====================================================================
Voice Recognition Profiles

From: B.K. DeLong <bkdelong@pobox.com> 
Date: Fri, 28 Oct 2005 08:26:32 -0400
Message-Id: <6.2.3.4.2.20051028081816.077769c8@mail.brain-stream.net> 
To: www-voice@w3.org 


I'm not sure if this is the right place to discuss this - I looked 
through the archives of this list and several TRs from the Voice 
activity and didn't really find anything to answer my question.

Have any efforts been made to make a standard for voice recognition 
training profiles? Is "training" even necessary any more for voice 
recognition systems?

So when I load up a voice recognition program, I am told to read 
several lines or paragraphs of text so it can match the text content 
with my voice. For every program I try, I have to retrain it all over 
again. In theory, if I move from my computer to my car and try to 
activate my GPS system by voice, it needs to be trained. If I go to 
an ATM or drive-thru where one can automatically order by voice, I 
need to spend several minutes correcting the system until I'm 
connected with a human operator.

Why not create a standard profile for voice recognition that all 
voice-recognition applications can use? That way, when I come to a 
new system I need to "train", I just type in my SSN or some other UID 
which tells the system to pull my VRP (Voice Recognition Profile), 
out of a centralized directory service, allowing me to immediately 
use the system.

In theory, each time I access a new service, whatever actions I take 
and corrections I make in the process, would be noted in the file for 
the next time I access a service - a live, constantly-growing, 
learning profile.
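
As a rough illustration of the idea only, the lookup and the growing
profile could be sketched as below. Everything here is hypothetical:
no such standard or directory service is known to exist, and the names
VoiceProfile, ProfileDirectory, fetch and record_session are invented
for this sketch. A real profile would also need acoustic data, which
differs between recognizers.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class VoiceProfile:
    """Hypothetical per-user profile shared across applications."""
    user_id: str
    # Maps what a recognizer heard to what the user actually meant.
    corrections: Dict[str, str] = field(default_factory=dict)
    sessions: int = 0  # number of interactions folded into the profile


class ProfileDirectory:
    """In-memory stand-in for the centralized directory service."""

    def __init__(self) -> None:
        self._profiles: Dict[str, VoiceProfile] = {}

    def fetch(self, user_id: str) -> VoiceProfile:
        """Return the stored profile, creating an empty one for a new user."""
        if user_id not in self._profiles:
            self._profiles[user_id] = VoiceProfile(user_id)
        return self._profiles[user_id]

    def record_session(self, user_id: str,
                       corrections: Dict[str, str]) -> VoiceProfile:
        """Merge the corrections made during one session into the profile."""
        profile = self.fetch(user_id)
        profile.corrections.update(corrections)
        profile.sessions += 1
        return profile
```

Each service would call fetch when the user identifies themselves and
record_session when the interaction ends, so the profile keeps growing
across services.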

Does such a standard or technology effort exist?

--
B.K. DeLong
bkdelong@pobox.com
+1.617.797.8471 (Note new number)

http://www.brain-stream.com               Play.
http://www.bostonredcross.org            Volunteer.
http://www.the-leaky-cauldron.org        Potter.
http://www.hackerfoundation.org          Future.
http://www.wkdelong.org           Son.


PGP Fingerprint:
38D4 D4D4 5819 8667 DFD5  A62D AF61 15FF 297D 67FE

FOAF:
http://foaf.brain-stream.org 


Received on Friday, 11 November 2005 10:53:06 GMT
