- From: Juan E. Gilbert <gilbert@eng.auburn.edu>
- Date: Wed, 24 Nov 2004 17:54:29 -0600
- To: www-voice@w3.org, www-multimodal@w3.org
To Whom It May Concern:

I am writing to offer a recommendation for the voice and multimodal standards. I have run into a small problem when using speech recognition in multimodal applications. As you know, when speech recognition occurs, each word is assigned a confidence level. This is stored in an object, typically an XML object. It would be very useful if time information were stored as well.

Time information falls into 2 categories:

1. For each recognized word, there should be a time stamp of when the confidence score was assigned or when the word was recognized. This time stamp could be obtained from the clock on the speech recognizer.

2. For each recognized word, there should be an elapsed time stamp. Elapsed time is the time captured from a stopwatch. For example, when the recognizer is started, a stopwatch begins. When a word is recognized, it is assigned a time stamp in milliseconds. Each successive word/recognition would have an increasing value in milliseconds.

I think this information is critical across all input modes for voice and multimodal processing. Speech recognition, gestures, etc. could all benefit from using both of these time stamp values. In fact, this is very easy to implement because all of this information is being used anyway. For example, in SALT, babbletimeout and silence are using a stopwatch. This would require 2 new object attributes that would exist next to the confidence level.

I think this needs to be incorporated in all recognition and input standards. Basically, any place there is a confidence score, there should be these 2 time stamps. These time stamps will allow developers to process multimodal events with respect to time.

Thanks,

--
Juan E. Gilbert, Ph.D.
Auburn University
Human Centered Computing Lab - http://interact.cse.eng.auburn.edu/
Department of Computer Science and Software Engineering
107 Dunstan Hall
Auburn, AL 36849-5347 U.S.A.
(334) 844-6316 (O)
(334) 844-6329 (F)
gilbert@eng.auburn.edu
http://www.eng.auburn.edu/~gilbert/
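
As a rough illustration of the proposal in this message, the two time stamps could sit beside the confidence score roughly as sketched below. This is only a sketch under stated assumptions: the `InputEvent` type, its field names, the `nearest_event` helper, the 200 ms window, and all sample values are hypothetical and do not come from SALT or any existing standard. It uses the clock time stamp to pair events across input modes and the elapsed value to order events within one recognizer.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class InputEvent:
    """One recognized unit from any input mode (hypothetical structure)."""
    mode: str          # e.g. "speech" or "gesture" (illustrative labels)
    value: str         # recognized word or gesture name
    confidence: float  # recognizer confidence, 0.0 to 1.0
    timestamp: float   # clock time of recognition, seconds since the epoch
    elapsed_ms: int    # milliseconds since this recognizer was started


def nearest_event(target: InputEvent, candidates: List[InputEvent],
                  window_ms: float) -> Optional[InputEvent]:
    """Return the candidate closest in clock time to target, or None
    if no candidate falls within window_ms of it."""
    best, best_gap = None, None
    for candidate in candidates:
        gap_ms = abs(candidate.timestamp - target.timestamp) * 1000.0
        if gap_ms <= window_ms and (best_gap is None or gap_ms < best_gap):
            best, best_gap = candidate, gap_ms
    return best


# Invented sample data: three spoken words and two pointing gestures.
words = [
    InputEvent("speech", "move", 0.91, 1101340469.120, 1120),
    InputEvent("speech", "that", 0.84, 1101340469.480, 1480),
    InputEvent("speech", "there", 0.88, 1101340470.050, 2050),
]
gestures = [
    InputEvent("gesture", "point:lamp", 0.95, 1101340469.500, 300),
    InputEvent("gesture", "point:table", 0.93, 1101340470.100, 900),
]

# Pair each word with the gesture recognized within 200 ms of it, if any.
pairs = []
for word in words:
    gesture = nearest_event(word, gestures, window_ms=200)
    if gesture is not None:
        pairs.append((word.value, gesture.value))

print(pairs)
```

The elapsed values differ between the two recognizers because each stopwatch starts with its own recognizer, which is why the sketch relies on the clock time stamp for cross-mode pairing.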
Received on Thursday, 25 November 2004 05:57:21 UTC