Re: questionnaire results and my recommendations (v2) from Dan Burnett on 2011-01-27 (public-xg-htmlspeech@w3.org from January 2011)

From: Dan Burnett <dburnett@voxeo.com>
Date: Thu, 27 Jan 2011 13:46:16 -0500
To: "Young, Milan" <Milan.Young@nuance.com>
Cc: "Michael Bodell" <mbodell@microsoft.com>, <public-xg-htmlspeech@w3.org>
Message-Id: <1D1B8EE5-B1BA-4092-94D8-3149D89B864A@voxeo.com>
Milan, please re-read my actual text.  I believe I said "Proposals  
that support more of the 'moderate interest' requirements are more  
likely to gain consensus.  Thus, it would be wise for proposals to  
support as many of these as possible."  Please note that at no point  
did I say that proposals should not or must not address any of the  
requirements.  I did also say that we can change the category names if  
there is concern with them; your email is the first to suggest  
possible concern.

I can tell you from experience that 50-70% interest in no way  
guarantees that the group as a whole will reach consensus on an item,  
although proposals that do not support such an item are not likely to  
gain consensus.  And consensus is what W3C and broadly-supported  
standards are about.

Along those lines, there is no plan to "vote" on the proposals.  W3C  
is a consensus-based organization, where the goal is to work together  
to produce something on which we can have consensus.  If you'll  
notice, I did not call the recent questionnaire a "vote", or require  
majority decision, or any such thing.  Instead, I very carefully  
described what we as group members can reasonably expect of our final  
proposals based upon the input from the prioritization survey.

Now, how are we going to reach consensus?  I am not describing how in  
advance.  That is not because I have a secret agenda on how to screw  
everyone over.  Rather, because I have been doing this for a very long  
time, I watch how the group operates, notice where we tend to have  
agreement and where not, and then suggest ways to proceed that will  
allow us, as a group, to reach consensus.  If you have concerns with  
my (and/or Michael's) ability to fairly and smoothly assist a group to  
reach consensus, please contact us individually.

-- dan

On Jan 27, 2011, at 12:27 PM, Young, Milan wrote:

> I was waiting for today’s call to clarify these results.  :)
>
> In particular, I’m unsure about the low/medium/high priorities Dan  
> assigned.  It seems harsh to assign a “moderate” rating to features  
> that 50-70% of us thought would be a necessary part of v1.
>
> Perhaps you could clarify how the voting on the proposals will  
> proceed.  Are we going to require a majority decision to approve?
>
> Thanks
> From: public-xg-htmlspeech-request@w3.org
> [mailto:public-xg-htmlspeech-request@w3.org] On Behalf Of Michael Bodell
> Sent: Thursday, January 27, 2011 7:25 AM
> To: Dan Burnett; public-xg-htmlspeech@w3.org
> Subject: RE: questionnaire results and my recommendations (v2)
>
> It sounds like there is no disagreement with this point so I will  
> update the requirements document to reflect these results.  Based on  
> our schedule from the face to face this concludes our prioritized  
> requirements which was to be done by the end of January (on track).   
> And with no one raising email objections about the recommendations  
> this means we do not need a conference call today to discuss this,  
> so the conference call is canceled.
>
> The next stage of our work is to collect proposals.  We have until  
> the end of February for this stage.  However, sooner is better as it  
> gives the group more time to read and digest the proposals and  
> possibly to discuss them over a call.  I'd suggest that all of the  
> proposals should evaluate themselves against the prioritized  
> requirements.  For each requirement I'd suggest categorizing it as  
> one of "The proposal clearly meets this requirement in a  
> straightforward way", "The proposal can meet the requirement, but it  
> may not be obvious or straightforward how this is done", "The  
> proposal meets most of the requirement", "The proposal meets a  
> little of the requirement", "The proposal doesn't meet any of the  
> requirement".  That categorization can then help drive some of our  
> discussions of the proposals.
>
>
> From: public-xg-htmlspeech-request@w3.org
> [public-xg-htmlspeech-request@w3.org] on behalf of Dan Burnett [dburnett@voxeo.com]
> Sent: Tuesday, January 25, 2011 5:38 AM
> To: public-xg-htmlspeech@w3.org
> Subject: questionnaire results and my recommendations (v2)
>
> Group,
>
> The questionnaire is now closed.  Looking at the results [1] and  
> then sorting by number of votes I see the following counts:
>
> 10 votes for 10 requirements
> 9 votes for 20 requirements
> 8 votes for 11 requirements
> 7 votes for 6 requirements
> 6 votes for 5 requirements
> 5 votes for 3 requirements
> 4 votes for 1 requirement
> 3 votes for 3 requirements
> 2 votes for 2 requirements
>
> Based on natural breakpoints and the fact that 5 votes is the  
> halfway point out of 10, I would suggest 8-10 votes represent  
> "strong interest" in the requirement, 5-7 votes represent "moderate  
> interest" in the requirement, and 0-4 votes represent "mild  
> interest" in the requirement.  The requirements are listed in these  
> categories at the end of this email.
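
As a rough check of the arithmetic above, the suggested breakpoints can be applied to the reported vote counts. This is only an illustrative sketch: the names VOTE_COUNTS, interest_level, and requirements_per_level are hypothetical and do not come from the questionnaire tooling, and the thresholds are simply the ones proposed in the message (8-10 strong, 5-7 moderate, 0-4 mild).

```python
# Vote tallies as reported in the message:
# votes received -> number of requirements with that many votes.
VOTE_COUNTS = {10: 10, 9: 20, 8: 11, 7: 6, 6: 5, 5: 3, 4: 1, 3: 3, 2: 2}


def interest_level(votes: int) -> str:
    """Map a vote count (0-10) to the category proposed in the message."""
    if not 0 <= votes <= 10:
        raise ValueError(f"vote count out of range: {votes}")
    if votes >= 8:
        return "strong"
    if votes >= 5:
        return "moderate"
    return "mild"


def requirements_per_level(vote_counts: dict) -> dict:
    """Sum the number of requirements that fall into each category."""
    totals = {"strong": 0, "moderate": 0, "mild": 0}
    for votes, num_requirements in vote_counts.items():
        totals[interest_level(votes)] += num_requirements
    return totals


print(requirements_per_level(VOTE_COUNTS))
# {'strong': 41, 'moderate': 14, 'mild': 6}
```

These totals agree with the lengths of the three requirement lists at the end of the message (41, 14, and 6 entries).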
>
>
> We can discuss and debate names for these different levels the next  
> time we have a call, but it seems to me that we can at least  
> conclude the following:
>
> 1) Proposals that fail to support the "strong interest" requirements  
> are unlikely to gain consensus.  Practically, then, as a group we  
> are likely to require any proposal to support these requirements.
>
> 2) Proposals that support more of the "moderate interest"  
> requirements are more likely to gain consensus.  Thus, it would be  
> wise for proposals to support as many of these as possible.
>
> 3) Gaining consensus to support any of the "mild interest"  
> requirements will be difficult at best.
>
>
> I recommend that we add a section to the requirements document that  
> references the questionnaire results and lists the requirements  
> grouped into these three different categories.  If there is  
> disagreement on the names of the categories we can discuss that on a  
> call.
>
> At this point I believe we are ready to consider proposals.  If you  
> disagree, please send email to the list and we can discuss.
>
>
> "Strong Interest" Requirements
> - FPR40. Web applications must be able to use barge-in (interrupting  
> audio and TTS output when the user starts speaking).
> - FPR4. It should be possible for the web application to get the  
> recognition results in a standard format such as EMMA.
> - FPR24. The web app should be notified when recognition results are  
> available.
> - FPR50. Web applications must not be prevented from integrating  
> input from multiple modalities.
> - FPR59. While capture is happening, there must be a way for the web  
> application to abort the capture and recognition process.
> - FPR52. The web app should be notified when TTS playback finishes.
> - FPR60. Web application must be able to programmatically abort TTS  
> output.
> - FPR38. Web application must be able to specify language of  
> recognition.
> - FPR45. Applications should be able to specify the grammars (or  
> lack thereof) separately for each recognition.
> - FPR1. Web applications must not capture audio without the user's  
> consent.
> - FPR19. User-initiated speech input should be possible.
> - FPR21. The web app should be notified that capture starts.
> - FPR22. The web app should be notified that speech is considered to  
> have started for the purposes of recognition.
> - FPR23. The web app should be notified that speech is considered to  
> have ended for the purposes of recognition.
> - FPR25. Implementations should be allowed to start processing  
> captured audio before the capture completes.
> - FPR26. The API to do recognition should not introduce unneeded  
> latency.
> - FPR34. Web application must be able to specify domain specific  
> custom grammars.
> - FPR35. Web application must be notified when speech recognition  
> errors or non-matches occur.
> - FPR42. It should be possible for user agents to allow hands-free  
> speech input.
> - FPR48. Web application author must be able to specify a domain  
> specific statistical language model.
> - FPR54. Web apps should be able to customize all aspects of the  
> user interface for speech recognition, except where such  
> customizations conflict with security and privacy requirements in  
> this document, or where they cause other security or privacy problems.
> - FPR51. The web app should be notified when TTS playback starts.
> - FPR53. The web app should be notified when the audio corresponding  
> to a TTS <mark> element is played back.
> - FPR5. It should be easy for the web apps to get access to the  
> most common pieces of recognition results such as utterance,  
> confidence, and nbests.
> - FPR39. Web application must be able to be notified when the  
> selected language is not available.
> - FPR13. It should be easy to assign recognition results to a single  
> input field.
> - FPR14. It should not be required to fill an input field every time  
> there is a recognition result.
> - FPR15. It should be possible to use recognition results to  
> multiple input fields.
> - FPR16. User consent should be informed consent.
> - FPR18. It must be possible for the user to revoke consent.
> - FPR11. If the web apps specify speech services, it should be  
> possible to specify parameters.
> - FPR12. Speech services that can be specified by web apps must  
> include network speech services.
> - FPR2. Implementations must support the XML format of SRGS and must  
> support SISR.
> - FPR27. Speech recognition implementations should be allowed to add  
> implementation specific information to speech recognition results.
> - FPR3. Implementation must support SSML.
> - FPR46. Web apps should be able to specify which voice is used for  
> TTS.
> - FPR7. Web apps should be able to request speech service different  
> from default.
> - FPR9. If browser refuses to use the web application requested  
> speech service, it must inform the web app.
> - FPR17. While capture is happening, there must be an obvious way  
> for the user to abort the capture and recognition process.
> - FPR37. Web application should be given captured audio access only  
> after explicit consent from the user.
> - FPR49. End users need a clear indication whenever microphone is  
> listening to the user.
>
> "Moderate Interest" Requirements
> - FPR33. There should be at least one mandatory-to-support codec  
> that isn't encumbered with IP issues and has sufficient fidelity &  
> low bandwidth requirements.
> - FPR28. Speech recognition implementations should be allowed to  
> fire implementation specific events.
> - FPR41. It should be easy to extend the standard without affecting  
> existing speech applications.
> - FPR36. User agents must provide a default interface to control  
> speech recognition.
> - FPR44. Recognition without specifying a grammar should be possible.
> - FPR61. Aborting the TTS output should be efficient.
> - FPR32. Speech services that can be specified by web apps must  
> include local speech services.
> - FPR47. When speech input is used to provide input to a web app, it  
> should be possible for the user to select alternative input methods.
> - FPR56. Web applications must be able to request NL interpretation  
> based only on text input (no audio sent).
> - FPR30. Web applications must be allowed at least one form of  
> communication with a particular speech service that is supported in  
> all UAs.
> - FPR55. Web application must be able to encrypt communications to  
> remote speech service.
> - FPR58. Web application and speech services must have a means of  
> binding session information to communications.
> - FPR6. Browser must provide default speech resource.
> - FPR20. The spec should not unnecessarily restrict the UA's choice  
> in privacy policy.
>
> "Mild Interest" Requirements
> - FPR29. Speech synthesis implementations should be allowed to fire  
> implementation specific events.
> - FPR31. User agents and speech services may agree to use alternate  
> protocols for communication.
> - FPR43. User agents should not be required to allow hands-free  
> speech input.
> - FPR10. If browser uses speech services other than the default one,  
> it must inform the user which one(s) it is using.
> - FPR8. User agent (browser) can refuse to use requested speech  
> service.
> - FPR57. Web applications must be able to request recognition based  
> on previously sent audio.
>
>
>
> [1] http://www.w3.org/2002/09/wbs/45260/ReqPri02/results
Received on Thursday, 27 January 2011 18:46:54 UTC