- From: Glen Shires <gshires@google.com>
- Date: Mon, 17 Sep 2012 22:25:42 -0700
- To: Nagesh Kharidi <nagesh@openstream.com>
- Cc: Jim Barnett <Jim.Barnett@genesyslab.com>, Dominic Mazzoni <dmazzoni@google.com>, Hans Wennborg <hwennborg@google.com>, olli@pettay.fi, public-speech-api@w3.org
- Message-ID: <CAEE5bcig0+Na_h0ZSYs6Xvj1j4Y1ezqoFfOh0qr8ieuvyaCJaQ@mail.gmail.com>
I've updated the spec with the above SpeechSynthesis and SpeechSynthesisUtterance IDL and definitions:
https://dvcs.w3.org/hg/speech-api/rev/b036c78e9445

As always, the current draft spec is at:
http://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html

On Sat, Sep 15, 2012 at 1:31 PM, Glen Shires <gshires@google.com> wrote:

> Nagesh,
> I agree that cancelAll() is useful and can make code simpler because it
> doesn't affect the paused state. In fact, I propose that we add
> cancelAll() and remove stop() -- because the stop function is probably less
> common and can easily be accomplished with two calls: cancelAll() and
> pause().
>
> Also, since canceling a specific utterance is not very useful, and questionable
> as Jerry states, I propose eliminating cancel(utterance). If we do that,
> then we could rename cancelAll() more simply as cancel().
>
> Thus, I propose this IDL:
>
> interface SpeechSynthesis {
>   static readonly attribute boolean pending;
>   static readonly attribute boolean speaking;
>   static readonly attribute boolean paused;
>
>   static void speak(SpeechSynthesisUtterance utterance);
>   static void cancel();
>   static void pause();
>   static void continue();
> }
>
> and I propose this new definition of cancel:
>
> The cancel method
> This method removes all utterances from the queue. If an utterance is
> being spoken, speaking ceases immediately. This method does not change the
> paused state of the SpeechSynthesis object.
>
> /Glen Shires
>
>
> On Sat, Sep 15, 2012 at 3:25 AM, Nagesh Kharidi <nagesh@openstream.com> wrote:
>
>> Please see inline.
>>
>> Regards,
>> Nagesh
>>
>> On Fri, 14 Sep 2012 12:59:41 -0700
>> Glen Shires <gshires@google.com> wrote:
>> >> Provide the ability to cancel all currently queued utterances.
>> >
>> >The stop() method cancels all queued utterances. (Dominic proposed that
>> >this message be named stopAndFlushQueue(), would that name be more clear?)
>> >> In addition to canceling all queued utterances, the stop() method also >> pauses the SpeechSynthesis object. A separate cancelAll() method would >> be useful, without which, if a new utterance is to be spoken >> immediately, we would have to do : >> speechSynthesis.stop(); >> speechSynthesis.continue(); >> speechSynthesis.speak(utterance); >> >> With a cancelAll() method, this would be: >> speechSynthesis.cancelAll(); >> speechSynthesis.speak(utterance); >> >> Since this would be such a common usage, we could make it even easier >> for developers by either: >> - providing a speakImmediate(utterance) method that cancels all queued >> utterances and then starts speaking the new utterance >> or >> - adding a second parameter as follows to the speak() method: >> speechSynthesis.speak(utterance, speakImmediately); >> If speakImmediately is true, all currently queued utterances will be >> canceled and the new utterance will be spoken. >> >> > >> >Also, what is the use case for the current cancel(utterance) method? >> > In >> >all the use cases I envision, you'd want to cancel all queued >> >utterances. >> >Can we eliminate cancel() ? >> >> I also agree that canceling a specific utterance is not very useful. >> Canceling all queued utterances would be more common than canceling a >> specific utterance. >> >> > >> > >> >> New speakNext SpeechSynthesis method - append the utterance to the >> >beginning of the queue >> > >> >I'd like more discussion on this. What are the use cases? What are the >> >edge >> >cases (e.g. If there's a race-condition, the current utterance may >> >finish >> >and the second in the queue may begin speaking before this new >> >utterance is >> >inserted). >> >> Use case for speakNext() method: Consider a news application that plays >> the latest news items. It queues all news items to be played. 
Now if >> there is a new "breaking news" item that comes in, the speakNext() >> method can be used to play it as soon as possible without canceling the >> already queued items. >> >> >> > >> > >> >> Question: Can a cancelled utterance be re-queued? >> > >> >Good question, and also, what is the lifetime of a >> >SpeechSynthesisUtterance >> >object and who owns it. There's at least 3 possibilities: >> > >> >1. The speak() method takes ownership when it adds it to the queue, >> >then it >> >would presumably be destroyed upon cancel or onend. >> > (This raises the questions: what usefulness is >> >the SpeechSynthesisUtterance object attribute "ended", since the >> >object >> >will be destroyed when it turns true. It also makes it messy to use >> >the >> >other readonly attributes because the object may be deleted suddenly. >> >Also, what if the author deletes the SpeechSynthesisUtterance object >> >prior >> >to it being spoken. One easy way to accidentally create this bug is >> >to >> >define the SpeechSynthesisUtterance object in a method that goes out >> >of >> >scope.) >> > >> >2. The speak() method does not take ownership when it adds it directly >> >to >> >queue. >> > (This raises the question: what if the author deletes the >> >SpeechSynthesisUtterance object prior to it being spoken. One easy >> >way to >> >accidentally create this bug is to define the SpeechSynthesisUtterance >> >object in a method that goes out of scope.) >> > >> >3. The speak() method does not take ownership, it makes a copy of it >> >when >> >it adds it to queue . >> > (This raises the question: how can the author's >> >original SpeechSynthesisUtterance object readonly attributes >> >(speaking, >> >paused, ended) reflect the state of the copy on the queue.) 
>> > >> > >> >To resolve these issues, I propose the following, because I think it's >> >the >> >cleanest solution and easiest for authors, since they can create and >> >destroy objects, and go out of scope, without worrying about the >> >speaking >> >queue timing: >> > >> >The speak() method does not take ownership of the >> >SpeechSynthesisUtterance >> >object, it makes a copy of it when it adds it to queue. We eliminate >> >the SpeechSynthesisUtterance readonly attributes, relying instead on >> >events >> >that indicate change in state, including new events for: onpause, >> >onresume. >> > >> >Because it's a copy of the object, this clarifies that: >> >- changes to the original SpeechSynthesisUtterance object after >> >calling >> >speak() do not affect the copy on the queue. >> >- the same SpeechSynthesisUtterance object can be used to call speak() >> >multiple times, (even after a copy of which was spoken or cancelled). >> > >> >The new IDL would be: >> > >> > interface SpeechSynthesisUtterance { >> > attribute DOMString text; >> > attribute DOMString lang; >> > attribute DOMString serviceURI; >> > >> > attribute Function onstart; >> > attribute Function onend; >> >* attribute Function onpause;* >> >* attribute Function onresume;* >> > } >> > >> > >> >And the new definition: >> > >> >The speak method >> >This method appends *a copy of* the utterance to the end of the queue >> >for >> >this SpeechSynthesis object. It does not change the paused state of >> >the >> >SpeechSynthesis object. If the SpeechSynthesis object is paused, it >> >remains paused. If it is not paused, then this utterance is spoken if >> >no >> >other utterances are in the queue, else this utterance is queued to >> >begin >> >speaking after the other utterances in the queue have been spoken. 
>> > >> > >> >/Glen Shires >> > >> > >> >On Fri, Sep 14, 2012 at 6:05 AM, Jim Barnett >> ><Jim.Barnett@genesyslab.com>wrote: >> > >> >> I would think that cancelling all utterances would be the more >> >common use >> >> case (so we ought to make it easy). Question: Can a cancelled >> >utterance >> >> be re-queued? >> >> >> >> - Jim >> >> >> >> -----Original Message----- >> >> From: Nagesh Kharidi [mailto:nagesh@openstream.com] >> >> Sent: Friday, September 14, 2012 8:58 AM >> >> To: Glen Shires; Dominic Mazzoni >> >> Cc: Hans Wennborg; olli@pettay.fi; public-speech-api@w3.org >> >> Subject: Re: TTS proposal to split Utterance into its own interface >> >> >> >> I would like to propose the following: >> >> 1. Provide the ability to cancel all currently queued utterances. A >> >new >> >> cancelAll method could be added. Alternately, invoking the cancel >> >method >> >> without the utterance parameter could imply cancel all utterances. >> >> >> >> 2. New speakNext SpeechSynthesis method >> >> This method will append the utterance to the beginning of the queue. >> >> >> >> 3. New oncancel SpeechSynthesisUtterance event Fired when the >> >utterance is >> >> canceled. >> >> >> >> 4. New canceled SpeechSynthesisUtterance attribute true if the >> >utterance >> >> is canceled. >> >> >> >> >> >> I also had a question regarding the stop method: Is "flushes the >> >queue" >> >> equivalent to calling cancel on all utterances in the queue? If so, >> >I >> >> would like to suggest changing "flushes the queue" to "cancels all >> >> utterances in the queue". >> >> >> >> Regards, >> >> Nagesh >> >> >> >> On Thu, 13 Sep 2012 14:13:56 -0700 >> >> Glen Shires <gshires@google.com> wrote: >> >> >Yes, I like the way you've defined the "speak" method to not change >> >the >> >> >play/pause state. Also, I didn't particularly like the word >> >"playback", >> >> >so thanks for the alternative "spoken". Here's updated definitions >> >> >with your suggestions incorporated. 
If there's no disagreement, >> >I'll >> >> >add them to the spec on Monday. >> >> > >> >> > >> >> >SpeechSynthesis Attributes >> >> > >> >> >pending attribute: >> >> >This attribute is true if the queue for this SpeechSynthesis object >> >> >contains any utterances which have not started speaking. >> >> > >> >> >speaking attribute: >> >> >This attribute is true if an utterance is being spoken. >> >Specifically if >> >> >an utterance has begun being spoken and has not completed being >> >spoken, >> >> >and is independent of whether this SpeechSynthesis object is in the >> >> >paused state. >> >> > >> >> >paused attribute: >> >> >The attribute is true when this SpeechSynthesis object is in the >> >paused >> >> >state. This state is independent of whether anything is in the >> >queue. >> >> >The >> >> >default state of a new SpeechSynthesis object is the non-paused >> >state. >> >> > >> >> > >> >> >SpeechSynthesis Methods >> >> > >> >> >The speak method >> >> >This method appends the utterance to the end of the queue for this >> >> >SpeechSynthesis object. It does not change the paused state of the >> >> >SpeechSynthesis object. If the SpeechSynthesis object is paused, >> >it >> >> >remains paused. If it is not paused, then this utterance is spoken >> >if >> >> >no other utterances are in the queue, else this utterance is queued >> >to >> >> >begin speaking after the other utterances in the queue have been >> >> >spoken. >> >> > >> >> >The cancel method >> >> >This method removes the specified utterance from the queue. If it >> >is >> >> >not in the queue, no changes are made. If the utterance removed is >> >> >being spoken, speaking ceases for that utterance and the next >> >utterance >> >> >in the queue (if >> >> >any) begins to be spoken. This method does not change the paused >> >state >> >> >of the SpeechSynthesis object. >> >> > >> >> >The pause method >> >> >This method puts the SpeechSynthesis object into the paused state. 
>> >If >> >> >an utterance was being spoken, it pauses mid-utterance. (If called >> >when >> >> >the SpeechSynthesis object was already in the paused state, it does >> >> >nothing.) >> >> > >> >> >The continue method >> >> >This method puts the SpeechSynthesis object into the non-paused >> >state. >> >> >If >> >> >an utterance was speaking (that is, its speaking attribute is >> >true), it >> >> >continues speaking the utterance at the point at which it was >> >paused, >> >> >else it begins speaking the next utterance in the queue (if any). >> >(If >> >> >called when the SpeechSynthesis object was already in the >> >non-paused >> >> >state, it does nothing.) >> >> > >> >> >The stop method. >> >> >This method puts the SpeechSynthesis object into the paused state >> >and >> >> >flushes the queue. It sets the speaking attribute to false and the >> >> >paused attribute to true. >> >> > >> >> > >> >> >SpeechSynthesisUtterance attributes >> >> > >> >> > >> >> >[[Note, I used SHOULD here because there may be some race-condition >> >> >edge-cases where it might not be ignored.]] >> >> > >> >> >text attribute: >> >> >The text to be synthesized for this utterance. Changes to this >> >> >attribute after the utterance has been added to the queue (by >> >calling >> >> >the speak >> >> >method) SHOULD be ignored. >> >> > >> >> >lang attribute: >> >> >[no change except to append the following] Changes to this >> >attribute >> >> >after the utterance has been added to the queue (by calling the >> >speak >> >> >method) >> >> >SHOULD be ignored. >> >> > >> >> >serviceURI attribute: >> >> >[no change except to append the following] Changes to this >> >attribute >> >> >after the utterance has been added to the queue (by calling the >> >speak >> >> >method) >> >> >SHOULD be ignored. >> >> > >> >> >speaking attribute: >> >> >This attribute is true if this specific utterance is currently >> >being >> >> >spoken. 
Specifically if this utterance has begun being spoken and >> >has >> >> >not completed being spoken. This is independent of whether the >> >> >SpeechSynthesis object is in a paused state. >> >> > >> >> >paused attribute: >> >> >This attribute is true if this specific utterance has begun to be >> >> >spoken, but has not completed and the SpeechSynthesis object is in >> >the >> >> >paused state. >> >> > >> >> >ended attribute: >> >> >This attribute is true if this specific utterance has completed >> >being >> >> >spoken. >> >> > >> >> >SpeechSynthesisUtterance events >> >> > >> >> >onstart event: >> >> >Fired when this utterance has begun to be spoken. >> >> > >> >> >onend event: >> >> >Fired when this utterance has completed being spoken. >> >> > >> >> > >> >> > >> >> >On Thu, Sep 13, 2012 at 10:25 AM, Dominic Mazzoni >> >> ><dmazzoni@google.com>wrote: >> >> > >> >> >> Thanks for proposing definitions. >> >> >> >> >> >> On Tue, Sep 11, 2012 at 3:02 AM, Glen Shires <gshires@google.com> >> >> >wrote: >> >> >> > I propose the following definitions for the SpeechSynthesis >> >IDL: >> >> >> > >> >> >> > SpeechSynthesis Attributes >> >> >> > >> >> >> > pending attribute: >> >> >> > This attribute is true if the queue contains any utterances >> >which >> >> >have >> >> >> not >> >> >> > completed playback. >> >> >> >> >> >> I was imagining: This attribute is true if the queue contains any >> >> >> utterances which have not *started* speaking. >> >> >> >> >> >> > speaking attribute: >> >> >> > This attribute is true if playback is in progress. >> >> >> >> >> >> I don't like the word "playback", it doesn't fit when the speech >> >is >> >> >> generated dynamically. How about: This attribute is true if an >> >> >> utterance is being spoken. >> >> >> >> >> >> > paused attribute: >> >> >> > **** How is this different than (pending && !speaking) ? 
**** >> >> >> >> >> >> This is true if the speech synthesis system is in a paused state, >> >> >> independent of whether anything is speaking or queued. >> >> >> >> >> >> paused && speaking -> it was paused in the middle of an utterance >> >> >> paused && !speaking -> no utterance is speaking, but if you call >> >> >> speak(), nothing will happen because it's in a paused state. >> >> >> >> >> >> > >> >> >> > SpeechSynthesis Methods >> >> >> > >> >> >> > The speak method >> >> >> > This method appends the utterance to the end of a playback >> >queue. >> >> >If >> >> >> > playback is not in progress, it also begins playback of the >> >next >> >> >item in >> >> >> the >> >> >> > queue. >> >> >> >> >> >> What do you think about rewriting to not use "playback"? >> >> >> >> >> >> Also, my idea was that it would not begin playback if the system >> >is >> >> >in >> >> >> a paused state. >> >> >> >> >> >> > The cancel method >> >> >> > This method removes the first matching utterance (if any) from >> >the >> >> >> playback >> >> >> > queue. If playback is in progress and the utterance removed is >> >> >being >> >> >> played, >> >> >> > playback ceases for the utterance and the next utterance in the >> >> >queue (if >> >> >> > any) begins playing. >> >> >> >> >> >> Do we need to say "first matching"? Each utterance should be a >> >> >> specific object, it should be either in the queue or not. >> >> >> >> >> >> > The pause method >> >> >> > This method pauses the playback mid-utterance. If playback is >> >not >> >> >in >> >> >> > progress, it does nothing. >> >> >> >> >> >> I was assuming that calling it would set the system into a paused >> >> >> state, so that even a subsequent call to speak() would not do >> >> >anything >> >> >> other than enqueue. >> >> >> >> >> >> > The continue method >> >> >> > This method continues the playback at the point in the >> >utterance >> >> >and >> >> >> queue >> >> >> > in which it was paused. 
If playback is in progress, it does >> >> >nothing. >> >> >> > >> >> >> > The stop method. >> >> >> > This method stops playback mid-utterance and flushes the queue. >> >> >> > >> >> >> > >> >> >> > SpeechSynthesisUtterance attributes >> >> >> > >> >> >> > text attribute: >> >> >> > The text to be synthesized for this utterance. This attribute >> >must >> >> >not be >> >> >> > changed after onstart fires. >> >> >> >> >> >> I'd say: changes to this attribute after the utterance has been >> >> >added >> >> >> to the queue (by calling "speak") will be ignored. OR, we should >> >> >make >> >> >> it a DOM exception to modify it when it's in the speech queue. >> >> >> >> >> >> > paused attribute: >> >> >> > This attribute is true if this specific utterance is in the >> >queue >> >> >and has >> >> >> > not completed playback. >> >> >> >> >> >> I think this should only be true if it has begin speaking but not >> >> >> completed. >> >> >> >> >> >> - Dominic >> >> >> >> >> >> >> -- >> >> NOTICE TO RECIPIENT: >> >> THIS E-MAIL IS MEANT FOR ONLY THE INTENDED RECIPIENT OF THE >> >TRANSMISSION, >> >> AND MAY BE A COMMUNICATION PRIVILEGED BY LAW. IF YOU RECEIVED THIS >> >E-MAIL >> >> IN ERROR, ANY REVIEW, USE, DISSEMINATION, DISTRIBUTION, OR COPYING >> >OF THIS >> >> E-MAIL IS STRICTLY PROHIBITED. PLEASE NOTIFY US IMMEDIATELY OF THE >> >ERROR >> >> BY RETURN E-MAIL AND PLEASE DELETE THIS MESSAGE FROM YOUR SYSTEM. >> >THANK YOU >> >> IN ADVANCE FOR YOUR COOPERATION. >> >> Reply to : legal@openstream.com >> >> >> >> >> >> >> >> >> >> -- >> NOTICE TO RECIPIENT: >> THIS E-MAIL IS MEANT FOR ONLY THE INTENDED RECIPIENT OF THE >> TRANSMISSION, AND MAY BE A COMMUNICATION PRIVILEGED BY LAW. IF YOU >> RECEIVED THIS E-MAIL IN ERROR, ANY REVIEW, USE, DISSEMINATION, >> DISTRIBUTION, OR COPYING OF THIS E-MAIL IS STRICTLY PROHIBITED. PLEASE >> NOTIFY US IMMEDIATELY OF THE ERROR BY RETURN E-MAIL AND PLEASE DELETE THIS >> MESSAGE FROM YOUR SYSTEM. THANK YOU IN ADVANCE FOR YOUR COOPERATION. 
>> Reply to : legal@openstream.com >> >> >
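For readers following the thread, the queue semantics proposed in it can be sketched in TypeScript roughly as follows. This is only a minimal model under stated assumptions, not the spec text and not any browser's implementation: it produces no audio, and the names SynthesisQueueModel, Utterance, resume() (standing in for the proposed continue()), finishCurrent() and currentText() are hypothetical helpers introduced for illustration. It follows the proposal that speak() appends a copy of the utterance and that cancel() empties the queue without changing the paused state.

```typescript
// Hypothetical model of the queue semantics discussed in this thread.
// Not the browser API: no audio is produced and all names are illustrative.
interface Utterance {
  text: string;
  lang?: string;
}

class SynthesisQueueModel {
  // Utterances that have not started speaking yet.
  private queue: Utterance[] = [];
  // The utterance that has begun being spoken and has not completed.
  private current: Utterance | null = null;
  private pausedState = false;

  get pending(): boolean { return this.queue.length > 0; }
  get speaking(): boolean { return this.current !== null; }
  get paused(): boolean { return this.pausedState; }

  // Appends a copy, so later changes to the caller's object are not seen.
  speak(utterance: Utterance): void {
    this.queue.push({ ...utterance });
    this.startNextIfIdle();
  }

  // Removes every utterance, including the one being spoken.
  // The paused state is left untouched.
  cancel(): void {
    this.queue = [];
    this.current = null;
  }

  pause(): void { this.pausedState = true; }

  // Models the proposed continue(): leave the paused state and, if nothing
  // is mid-utterance, start the next queued utterance.
  resume(): void {
    this.pausedState = false;
    this.startNextIfIdle();
  }

  // Assumption for illustration: simulates the engine finishing an utterance.
  finishCurrent(): void {
    this.current = null;
    this.startNextIfIdle();
  }

  currentText(): string | null {
    return this.current ? this.current.text : null;
  }

  private startNextIfIdle(): void {
    if (this.pausedState || this.current !== null) return;
    const next = this.queue.shift();
    if (next !== undefined) this.current = next;
  }
}

// "Speak this now" with the simplified API: cancel(), then speak().
const synth = new SynthesisQueueModel();
synth.speak({ text: "first" });
synth.speak({ text: "second" });
synth.cancel();
synth.speak({ text: "urgent" });
```

In this model the old stop() behaviour is simply cancel() followed by pause(), which is the argument made in the thread for dropping stop().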
Received on Tuesday, 18 September 2012 05:26:52 UTC