- From: Dan Burnett <dburnett@voxeo.com>
- Date: Thu, 21 Oct 2010 20:05:26 -0400
- To: public-xg-htmlspeech@w3.org
Group,

I've spent some time looking over the results from the survey. There are several different options for discussion order that we could use. For the purposes of determining consensus the most useful ordering is one that reflects the likely amount of consensus. The requirements are ordered below by the standard deviation of the scores. The means are also given for reference. These values include the inputs from Olli.

Rank  STDEV  MEAN  Requirement
   1   2.89  4.00  R29. Web application may only listen in response to user action.
   2   2.88  4.67  R31. End users, not web application authors, should be the ones to select speech recognition resources.
   3   2.51  5.57  R33. User agents need a way to enable end users to grant permission to an application to listen to them.
   4   2.51  4.50  R1. Web author needs full control over specification of speech resources.*
   5   2.43  4.25  R18. User perceived latency of synthesis must be minimized.*
   6   2.38  5.00  R17. User perceived latency of recognition must be minimized.*
   7   2.27  4.14  R30. End users should not be forced to store anything about their speech recognition environment in the cloud.
   8   2.27  3.14  R28. Web application must not be allowed access to raw audio.
   9   2.19  4.75  R16. Web application authors must not be excluded from running their own speech service.*
  10   2.15  3.57  R26. There should exist a high quality default speech recognition visual user interface.*
  11   2.14  5.71  R23. Speech as an input on any application should be able to be optional.*
  12   2.14  4.71  R15. Web application authors must not need to run their own speech service.*
  13   2.04  4.14  R22. Web application author wants to provide a consistent user experience across all modalities.*
  14   1.99  4.57  R24. End user should be able to use speech in a hands-free mode.*
  15   1.89  4.71  R2. Application change from directed input to free form input.*
  16   1.81  4.88  R10. Web application authors need to be able to use full SSML features.*
  17   1.81  4.13  R11. Web application author must integrate input from multiple modalities.*
  18   1.80  3.71  R13. Web application author should have ability to customize speech recognition graphical user interface.*
  19   1.77  4.14  R9. Web application author provided synthesis feedback.*
  20   1.70  2.29  R19. End user extensions should be available both on desktop and in cloud.
  21   1.68  5.86  R32. End users need a clear indication whenever microphone is listening to the user.
  22   1.62  5.57  R34. A trust relation is needed between end user and whatever is doing recognition.
  23   1.60  2.83  R12. Web application author must be able to specify a domain specific statistical language model.*
  24   1.60  3.38  R20. Web author selected TTS service should be available both on device and in the cloud.*
  25   1.55  5.00  R25. It should be easy to extend the standard without affecting existing speech applications.*
  26   1.51  5.50  R14. Web application authors need a way to specify and effectively create barge-in (interrupt audio and synthesis).*
  27   1.13  6.43  R5. Web application must be notified when speech recognition errors and other non-matches occur.*
  28   1.11  6.29  R7. Web application must be able to specify domain specific custom grammars.*
  29   1.07  6.14  R8. Web application must be able to specify language of recognition.*
  30   1.00  6.00  R6. Web application must be provided with full context of recognition.*
  31   0.98  1.57  R21. Any public interface for creating extensions should be speakable.
  32   0.98  6.43  R4. Web application must be notified when recognition occurs.*
  33   0.95  6.29  R3. Ability to bind results to specific input fields.*
  34   0.74  6.63  R27. Grammars, TTS, media composition, and recognition results should all use standard formats.*

I recommend that we address these simultaneously from the top and the bottom.
The top ones reflect scores that were not concentrated around one value (potentially reflecting strong disagreement about importance), while the ones at the bottom have scores that were concentrated around one value (potentially reflecting general agreement about importance). With any luck the ones at the bottom will be quick to address and prioritize, while the ones at the top will likely need the most discussion.

I will shortly send follow-on emails for the first two requirements using this approach: R27 and R29. Please reply to each email with any questions or comments you have and any points you would like to make. After a week we'll see how close we are to consensus on each. The ideal outcome is one of the following two priorities: 1) Should Address, 2) For Future Consideration. Of course, if we need to revise, split, or otherwise modify a requirement we'll do so.

-- dan
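
For anyone who wants to reproduce this kind of ordering, here is a minimal sketch of ranking requirements by the spread of their scores. The scores and the labels RA, RB, RC are invented for illustration and are not the survey data. The 1-7 scale and the use of the sample standard deviation (as a spreadsheet STDEV function computes it) are also assumptions, since the message does not say how the figures were calculated.

```python
from statistics import mean, stdev

# Hypothetical responses on an assumed 1-7 importance scale.
# These are NOT the actual survey data; the labels are placeholders.
SCORES = {
    "RA": [1, 7, 1, 7, 4, 4],
    "RB": [6, 7, 6, 7, 6, 6],
    "RC": [3, 5, 4, 6, 2, 4],
}


def rank_by_spread(scores):
    """Return (requirement, stdev, mean) rows, widest spread first."""
    # statistics.stdev is the sample standard deviation (n - 1 divisor);
    # that choice is an assumption about how the original was computed.
    rows = [(req, stdev(vals), mean(vals)) for req, vals in scores.items()]
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


def format_rows(rows):
    """Render the ranked rows as a plain-text table."""
    lines = ["Rank  STDEV  MEAN  Requirement"]
    for rank, (req, sd, avg) in enumerate(rows, start=1):
        lines.append(f"{rank:>4}  {sd:5.2f}  {avg:4.2f}  {req}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_rows(rank_by_spread(SCORES)))
```

With these made-up numbers, the requirement with split scores (RA) sorts to the top and the one with tightly clustered scores (RB) sorts to the bottom, which is the property the discussion order relies on.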
Received on Friday, 22 October 2010 00:06:00 UTC