[wiki] Multimodal User Input

Collaborative Software Community Group,

On the topics of multimodality and document services, a Document Services API could process multimodal user input: audio input, document object model elements produced by speech-to-XML components (e.g. SSML), or data from speech recognition components (https://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html, https://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html#speechreco-resultlist).
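
As one sketch of how such recognition data could flow into a document service, a SpeechRecognitionResultList might be mapped onto SSML sentence elements for downstream processing. The interfaces below are transcribed from the draft specification linked above; the mapping function itself is hypothetical:

    // Minimal typings for the Web Speech API result interfaces,
    // transcribed from the draft at
    // https://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html
    interface SpeechRecognitionAlternative {
      readonly transcript: string;
      readonly confidence: number;
    }

    interface SpeechRecognitionResult {
      readonly length: number;
      readonly isFinal: boolean;
      item(index: number): SpeechRecognitionAlternative;
    }

    interface SpeechRecognitionResultList {
      readonly length: number;
      item(index: number): SpeechRecognitionResult;
    }

    // Hypothetical mapping: convert final recognition results into SSML
    // sentence (<s>) elements that a document service could consume.
    // (A real implementation would escape XML special characters in the
    // transcript text.)
    function resultListToSsml(results: SpeechRecognitionResultList): string {
      const sentences: string[] = [];
      for (let i = 0; i < results.length; i++) {
        const result = results.item(i);
        if (result.isFinal && result.length > 0) {
          // Take the top-ranked alternative for each final result.
          sentences.push("<s>" + result.item(0).transcript + "</s>");
        }
      }
      return '<speak version="1.1" xmlns="http://www.w3.org/2001/10/synthesis">'
        + sentences.join("") + "</speak>";
    }
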
Document services could also refine speech recognition outputs (http://en.wikipedia.org/wiki/Outline_of_natural_language_processing#Component_processes_of_natural_language_understanding) and provide feedback on multimodal data, e.g. the rate, projection, movement, vocal variety, and prosody of spoken language. Use cases include enhanced speech recognition and facilitating public speaking exercises (see also: http://www.mooc-list.com/tags/public-speaking).
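
As a minimal sketch of one such feedback metric, assuming the hosting application supplies a final transcript and a measured utterance duration (the threshold constants are illustrative assumptions, not taken from any specification), speaking rate in words per minute could be estimated as follows:

    // Illustrative bounds for public speaking exercises; assumed values,
    // not normative.
    const MIN_TARGET_WPM = 120;
    const MAX_TARGET_WPM = 160;

    // Hypothetical feedback function: estimate speaking rate from a
    // final transcript and the measured duration of the utterance.
    function speakingRateFeedback(transcript: string,
                                  durationSeconds: number): string {
      const words = transcript.trim().split(/\s+/).filter(w => w.length > 0);
      if (durationSeconds <= 0 || words.length === 0) {
        return "Not enough data to estimate speaking rate.";
      }
      const wpm = (words.length / durationSeconds) * 60;
      if (wpm < MIN_TARGET_WPM) {
        return "Pace: " + wpm.toFixed(0) + " wpm; consider speaking a little faster.";
      }
      if (wpm > MAX_TARGET_WPM) {
        return "Pace: " + wpm.toFixed(0) + " wpm; consider slowing down.";
      }
      return "Pace: " + wpm.toFixed(0) + " wpm; within the target range.";
    }

Per-word timing is not exposed by the recognition interfaces above, so the utterance duration would have to come from the application's own audio capture.
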
Kind regards,

Adam Sobieski

Received on Saturday, 12 April 2014 19:19:11 UTC