HTML Speech Incubator Group Agenda

Lyon TPAC, 2 and 4 November 2010

Tuesday, 2 November 2010

Room: Saint Clair 3A (Level 2 -- Saint Clair)

1400-1530 Session 1 -- minutes by ??

Requirements

1530-1600 Break

1600-1800 Session 2 -- minutes by ??


Thursday, 4 November 2010

Room: Saint Clair 2 (Level 2 -- Saint Clair)

0830-1030 Session 1 -- minutes by ??

Requirements (continued)

1030-1100 Break

1100-1230 Session 2 -- minutes by ??


Requirements

The requirements and their rankings were taken directly from the earlier group email. The list is sorted by descending standard deviation (STDEV) of the scores, so the requirements with the widest disagreement come first; the mean score (MEAN) is shown alongside. A sketch of the ranking computation follows the list.
Rank  STDEV  MEAN  Requirement
   1   2.89  4.00  R29. Web application may only listen in response to user action.
   2   2.88  4.67  R31. End users, not web application authors, should be the ones to select speech recognition resources.
   3   2.51  5.57  R33. User agents need a way to enable end users to grant permission to an application to listen to them.
   4   2.51  4.50  R1. Web author needs full control over specification of speech resources.*
   5   2.43  4.25  R18. User-perceived latency of synthesis must be minimized.*
   6   2.38  5.00  R17. User-perceived latency of recognition must be minimized.*
   7   2.27  4.14  R30. End users should not be forced to store anything about their speech recognition environment in the cloud.
   8   2.27  3.14  R28. Web application must not be allowed access to raw audio.
   9   2.19  4.75  R16. Web application authors must not be excluded from running their own speech service.*
  10   2.15  3.57  R26. There should exist a high-quality default speech recognition visual user interface.*
  11   2.14  5.71  R23. Speech as an input to any application should be optional.*
  12   2.14  4.71  R15. Web application authors must not need to run their own speech service.*
  13   2.04  4.14  R22. Web application author wants to provide a consistent user experience across all modalities.*
  14   1.99  4.57  R24. End user should be able to use speech in a hands-free mode.*
  15   1.89  4.71  R2. Application change from directed input to free-form input.*
  16   1.81  4.88  R10. Web application authors need to be able to use full SSML features.*
  17   1.81  4.13  R11. Web application author must integrate input from multiple modalities.*
  18   1.80  3.71  R13. Web application author should have the ability to customize the speech recognition graphical user interface.*
  19   1.77  4.14  R9. Web application author-provided synthesis feedback.*
  20   1.70  2.29  R19. End user extensions should be available both on the desktop and in the cloud.
  21   1.68  5.86  R32. End users need a clear indication whenever the microphone is listening to the user.
  22   1.62  5.57  R34. A trust relationship is needed between the end user and whatever is doing the recognition.
  23   1.60  2.83  R12. Web application author must be able to specify a domain-specific statistical language model.*
  24   1.60  3.38  R20. Web-author-selected TTS service should be available both on the device and in the cloud.*
  25   1.55  5.00  R25. It should be easy to extend the standard without affecting existing speech applications.*
  26   1.51  5.50  R14. Web application authors need a way to specify and effectively create barge-in (interrupt audio and synthesis).*
  27   1.13  6.43  R5. Web application must be notified when speech recognition errors and other non-matches occur.*
  28   1.11  6.29  R7. Web application must be able to specify domain-specific custom grammars.*
  29   1.07  6.14  R8. Web application must be able to specify the language of recognition.*
  30   1.00  6.00  R6. Web application must be provided with the full context of recognition.*
  31   0.98  1.57  R21. Any public interface for creating extensions should be speakable.
  32   0.98  6.43  R4. Web application must be notified when recognition occurs.*
  33   0.95  6.29  R3. Ability to bind results to specific input fields.*
  34   0.74  6.63  R27. Grammars, TTS, media composition, and recognition results should all use standard formats.*
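
For reference, a minimal sketch of how a ranking like the one above can be computed. The per-member scores below are hypothetical placeholders (the actual votes were circulated in the group email and are not reproduced here); a 1-7 scoring scale and sample standard deviation are assumptions, not confirmed by the agenda.

    from statistics import mean, stdev

    # Hypothetical per-member scores on an assumed 1-7 scale; the real votes
    # come from the earlier group email. Sample standard deviation
    # (statistics.stdev) is assumed here.
    scores = {
        "R29": [1, 1, 3, 4, 6, 6, 7],
        "R5":  [5, 6, 6, 7, 7, 7, 7],
        "R27": [6, 6, 6, 7, 7, 7, 7, 7],
    }

    # Sort by descending standard deviation, matching the list above, so the
    # requirements with the widest disagreement come first.
    ranked = sorted(scores, key=lambda r: stdev(scores[r]), reverse=True)

    # Print rank, STDEV, MEAN, and requirement ID in the same column order
    # as the table above.
    for rank, req in enumerate(ranked, start=1):
        print(f"{rank:>4}  {stdev(scores[req]):5.2f}  {mean(scores[req]):5.2f}  {req}")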

Last modified: Sat Oct 30 21:46:08 EDT 2010