- From: Glen Shires <gshires@google.com>
- Date: Mon, 23 Apr 2012 16:25:24 -0700
- To: "Young, Milan" <Milan.Young@nuance.com>
- Cc: Hans Wennborg <hwennborg@google.com>, "public-speech-api@w3.org" <public-speech-api@w3.org>, Satish S <satish@google.com>
- Message-ID: <CAEE5bcis8GA5Qa8n+hWJnn8dOpD+=mju4kg2Ago+29Fnsk00+g@mail.gmail.com>
We're still thinking about the proposed refactoring of the SpeechRecognition result. I'll get back to you soon.

On Mon, Apr 23, 2012 at 3:35 PM, Young, Milan <Milan.Young@nuance.com> wrote:

> We’ve heard from Google and Nuance so far. Does anybody else have an
> opinion on the following parameters? Would explicitly including them in
> the API (as opposed to pushing them to a custom-parameter bin) generally
> speed or slow adoption?
>
>     float confidenceThreshold;
>     integer completeTimeout;
>     integer incompleteTimeout;
>     integer maxSpeechTimeout;
>
> Glen, glad to know we agree on the most important points. Did you also
> want to comment on my proposal for refactoring the SpeechRecognition result,
> or should I take silence to mean agreement?
>
> Thanks
>
> *From:* Glen Shires [mailto:gshires@google.com]
> *Sent:* Monday, April 23, 2012 12:34 PM
> *To:* Young, Milan
> *Cc:* Hans Wennborg; public-speech-api@w3.org; Satish S
> *Subject:* Re: Speech API: first editor's draft posted
>
> Milan,
>
> We're not planning on having regular conference calls, so I'd like to ask
> the group to discuss this and other issues via email.
>
> Also, I apologize for not responding earlier, so I'll respond now...
>
> - SpeechParameterList parameters;
> - void setCustomParameter(in DOMString name, in DOMString value);
>
> I agree we need an attribute and method such as these for both
> SpeechRecognition and TTS.
>
> I also suggest another method...
>
> - enumerateCustomParameter(DOMString name)
>   - It returns a list of valid DOMString values
>     for setCustomParameter(name, value)
>   - We may also want to specify some way that a
>     numeric range could be returned
>   - (I'm open to suggestions for a better name,
>     I don't particularly like "enumerate")
>   - I presume that SpeechParameterList contains
>     all the valid names, so we don't need a method
>     to enumerate them.
>
> One example where enumerateCustomParameter would be very useful would be
> to select a TTS voice. For example, the following could return a list of
> DOMStrings of all available voices:
>
>     TTS.enumerateCustomParameter("voice")
>
> - maxNBest for SpeechRecognition
>   I agree.
>
> - serviceUri for SpeechRecognition and TTS
>   I agree.
>
> For this initial specification, we believe that a simplified API will
> accelerate implementation, interoperability testing, standardization and
> ultimately developer adoption. For this reason, we believe that timeout
> parameters and confidenceThreshold should not be added to this initial spec
> because:
>
> - They are not required for the majority of use cases.
>
> - They can be confusing for web developers, particularly those with little
>   speech experience. Often it's best for developers to rely on the default
>   values set by the speech recognition service.
>
> - Their definition and implementation may vary between different speech
>   service implementations.
>
> - Confidence is returned in the recognition results, so sophisticated
>   developers can compare and process relative confidence levels, which is
>   often more useful than a threshold, particularly because confidence value
>   definitions vary by speech services.
>
> - setCustomParameter can be used to set these parameters.
>
> /Glen Shires
>
> On Mon, Apr 23, 2012 at 10:56 AM, Young, Milan <Milan.Young@nuance.com>
> wrote:
>
> Being new to Community Groups, I'm not clear on the plan for resolving
> issues like that which I have posted below. Do I need to submit a concrete
> counter-proposal? Should we have regular conference calls to discuss
> this and the other issues that have come up? Perhaps a F2F to get us
> started?
>
> Thanks
>
> -----Original Message-----
> From: Young, Milan [mailto:Milan.Young@nuance.com]
> Sent: Friday, April 13, 2012 2:06 PM
> To: Hans Wennborg; public-speech-api@w3.org
> Cc: Satish S; Glen Shires
> Subject: RE: Speech API: first editor's draft posted
>
> Thank you for the draft, this looks like an excellent start. A few
> comments/suggestions on the following:
>
> SpeechRecognition
> - In addition to the three parameters you have listed, I see the
>   following as necessary:
>     integer maxNBest;
>     float confidenceThreshold;
>     integer completeTimeout;
>     integer incompleteTimeout;
>     integer maxSpeechTimeout;
>     attribute DOMString serviceURI;
>
> - We'll also need an interface for setting non-standard parameters. This
>   will be critical to avoid rat-holing into a complete list of parameters.
>     SpeechParameterList parameters;
>     void setCustomParameter(in DOMString name, in DOMString value);
>
>     interface SpeechParameter {
>         attribute DOMString name;
>         attribute DOMString value;
>     };
>
>     interface SpeechParameterList {
>         readonly attribute unsigned long length;
>         getter SpeechParameter item(in unsigned long index);
>     };
>
> - I prefer a flatter structure for SpeechRecognition. Part of doing that
>   would involve splitting the error path out to its own event. I suggest the
>   following:
>
>     // A full response, which could be interim or final, part of a
>     // continuous response or not
>     interface SpeechRecognitionResult : RecognitionEvent {
>         readonly attribute unsigned long length;
>         getter SpeechRecognitionAlternative item(in unsigned long index);
>         readonly attribute boolean final;
>         readonly attribute short resultIndex;
>         readonly attribute SpeechRecognitionResultList resultHistory;
>     };
>
>     interface SpeechRecognitionError : RecognitionEvent {
>         // As before
>     };
>
> TTS
> - At a minimum, we'll need the same serviceURI parameter and generic
>   parameter interface as in SpeechRecognition.
> - I'd also like to hear some discussion on the importance of "marking"
>   the stream. I personally feel this is common enough that it should be part
>   of a v1.
>
> Thanks
>
> -----Original Message-----
> From: Hans Wennborg [mailto:hwennborg@google.com]
> Sent: Thursday, April 12, 2012 7:36 AM
> To: public-speech-api@w3.org
> Cc: Satish S; Glen Shires
> Subject: Speech API: first editor's draft posted
>
> In December, Google proposed [1] to public-webapps a Speech JavaScript API
> subset that supports the majority of the use-cases in the Speech Incubator
> Group's Final Report. This proposal provides a programmatic API that
> enables web-pages to synthesize speech output and to use speech recognition
> as an input for forms, continuous dictation and control.
>
> We have now posted a slightly updated proposal [2] in the Speech-API
> Community Group's repository. The differences include:
>
> - Document is now self-contained, rather than having multiple references
>   to the XG Final Report.
> - Renamed SpeechReco interface to SpeechRecognition
> - Renamed interfaces and attributes beginning SpeechInput* to
>   SpeechRecognition*
> - Moved EventTarget to constructor of SpeechRecognition
> - Clarified that grammars and lang are attributes of SpeechRecognition
> - Clarified that if index is greater than or equal to length, returns null
>
> We welcome discussion and feedback on this editor's draft. Please send
> your comments to the public-speech-api@w3.org mailing list.
>
> Glen Shires
> Hans Wennborg
>
> [1] http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/1696.html
> [2] http://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html
>
> --
> Thanks!
> Glen Shires

--
Thanks!
Glen Shires
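
For readers following the custom-parameter discussion in this thread, here is a minimal TypeScript sketch of how the proposed SpeechParameterList, setCustomParameter and enumerateCustomParameter might fit together. It is an illustration only, not part of any draft or browser. The MockSpeechService class, its constructor argument, the upsert helper, the voice names, and the choice to return an empty list for unknown parameter names are all assumptions made for the example.

```typescript
// Hypothetical sketch: these classes model the IDL proposed in the thread.
// They are not a real browser API.
interface SpeechParameter {
  name: string;
  value: string;
}

class SpeechParameterList {
  private readonly items: SpeechParameter[] = [];

  get length(): number {
    return this.items.length;
  }

  // Follows the draft's stated rule: an out-of-range index returns null.
  item(index: number): SpeechParameter | null {
    return this.items[index] ?? null;
  }

  // Assumed internal helper (not in the proposal): insert or overwrite by name.
  upsert(name: string, value: string): void {
    const existing = this.items.find((p) => p.name === name);
    if (existing) {
      existing.value = value;
    } else {
      this.items.push({ name, value });
    }
  }
}

class MockSpeechService {
  readonly parameters = new SpeechParameterList();
  private readonly supported: Record<string, string[]>;

  // Assumption: the service knows which values it accepts for each name.
  constructor(supported: Record<string, string[]>) {
    this.supported = supported;
  }

  // Returns the valid values for a parameter name (empty if unknown).
  enumerateCustomParameter(name: string): string[] {
    return [...(this.supported[name] ?? [])];
  }

  // Values are strings, as in the proposed IDL signature.
  setCustomParameter(name: string, value: string): void {
    this.parameters.upsert(name, value);
  }
}

// Usage: pick the first advertised TTS voice, then pass a threshold through
// the custom-parameter bin instead of a dedicated attribute.
const tts = new MockSpeechService({ voice: ["voice-a", "voice-b"] });
const [firstVoice] = tts.enumerateCustomParameter("voice");
if (firstVoice !== undefined) {
  tts.setCustomParameter("voice", firstVoice);
}
tts.setCustomParameter("confidenceThreshold", "0.5");
```

Because every value travels as a string, numeric settings such as confidenceThreshold would need parsing and validation on the service side, which is one of the trade-offs of the custom-parameter approach debated above.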
Received on Monday, 23 April 2012 23:26:34 UTC