- From: Glen Shires <gshires@google.com>
- Date: Tue, 18 Sep 2012 12:47:02 -0700
- To: Dominic Mazzoni <dmazzoni@google.com>
- Cc: Nagesh Kharidi <nagesh@openstream.com>, Jim Barnett <Jim.Barnett@genesyslab.com>, Hans Wennborg <hwennborg@google.com>, olli@pettay.fi, public-speech-api@w3.org
- Message-ID: <CAEE5bcguuY3-oRDtWC=WW9sH8DsW=np2fmn=fV3=OnCwKWugNQ@mail.gmail.com>
I've updated the spec with the above clarification: references to
"SpeechSynthesis object" are now "global SpeechSynthesis instance".
https://dvcs.w3.org/hg/speech-api/rev/bf779b363c93

As always, the current draft spec is at:
http://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html

/Glen Shires

On Tue, Sep 18, 2012 at 12:05 AM, Dominic Mazzoni <dmazzoni@google.com> wrote:
> Looking good. Just one suggestion: how about replacing "this
> SpeechSynthesis object" with "the global SpeechSynthesis instance" or
> something that indicates there's just a single global SpeechSynthesis.
>
> - Dominic
>
> On Mon, Sep 17, 2012 at 10:25 PM, Glen Shires <gshires@google.com> wrote:
> > I've updated the spec with the above SpeechSynthesis and
> > SpeechSynthesisUtterance IDL and definitions:
> > https://dvcs.w3.org/hg/speech-api/rev/b036c78e9445
> >
> > As always, the current draft spec is at:
> > http://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html
> >
> > On Sat, Sep 15, 2012 at 1:31 PM, Glen Shires <gshires@google.com> wrote:
> >>
> >> Nagesh,
> >> I agree that cancelAll() is useful and can make code simpler because it
> >> doesn't affect the paused state. In fact, I propose that we add cancelAll()
> >> and remove stop() -- because the stop function is probably less common and
> >> can easily be accomplished with two calls: cancelAll() and pause().
> >>
> >> Also, since canceling a specific utterance is not very useful, and
> >> questionable as Jerry states, I propose eliminating cancel(utterance). If we
> >> do that, then we could rename cancelAll() more simply as cancel().
> >>
> >> Thus, I propose this IDL:
> >>
> >>     interface SpeechSynthesis {
> >>       static readonly attribute boolean pending;
> >>       static readonly attribute boolean speaking;
> >>       static readonly attribute boolean paused;
> >>
> >>       static void speak(SpeechSynthesisUtterance utterance);
> >>       static void cancel();
> >>       static void pause();
> >>       static void continue();
> >>     }
> >>
> >> and I propose this new definition of cancel:
> >>
> >> The cancel method
> >> This method removes all utterances from the queue. If an utterance is
> >> being spoken, speaking ceases immediately. This method does not change the
> >> paused state of the SpeechSynthesis object.
> >>
> >> /Glen Shires
> >>
> >> On Sat, Sep 15, 2012 at 3:25 AM, Nagesh Kharidi <nagesh@openstream.com>
> >> wrote:
> >>>
> >>> Please see inline.
> >>>
> >>> Regards,
> >>> Nagesh
> >>>
> >>> On Fri, 14 Sep 2012 12:59:41 -0700
> >>> Glen Shires <gshires@google.com> wrote:
> >>> >> Provide the ability to cancel all currently queued utterances.
> >>> >
> >>> >The stop() method cancels all queued utterances. (Dominic proposed that
> >>> >this message be named stopAndFlushQueue(), would that name be more clear?)
> >>>
> >>> In addition to canceling all queued utterances, the stop() method also
> >>> pauses the SpeechSynthesis object.
> >>> A separate cancelAll() method would
> >>> be useful, without which, if a new utterance is to be spoken
> >>> immediately, we would have to do:
> >>>     speechSynthesis.stop();
> >>>     speechSynthesis.continue();
> >>>     speechSynthesis.speak(utterance);
> >>>
> >>> With a cancelAll() method, this would be:
> >>>     speechSynthesis.cancelAll();
> >>>     speechSynthesis.speak(utterance);
> >>>
> >>> Since this would be such a common usage, we could make it even easier
> >>> for developers by either:
> >>> - providing a speakImmediate(utterance) method that cancels all queued
> >>>   utterances and then starts speaking the new utterance
> >>> or
> >>> - adding a second parameter as follows to the speak() method:
> >>>     speechSynthesis.speak(utterance, speakImmediately);
> >>>   If speakImmediately is true, all currently queued utterances will be
> >>>   canceled and the new utterance will be spoken.
> >>>
> >>> >
> >>> >Also, what is the use case for the current cancel(utterance) method? In
> >>> >all the use cases I envision, you'd want to cancel all queued utterances.
> >>> >Can we eliminate cancel() ?
> >>>
> >>> I also agree that canceling a specific utterance is not very useful.
> >>> Canceling all queued utterances would be more common than canceling a
> >>> specific utterance.
> >>>
> >>> >
> >>> >> New speakNext SpeechSynthesis method - append the utterance to the
> >>> >> beginning of the queue
> >>> >
> >>> >I'd like more discussion on this. What are the use cases? What are the edge
> >>> >cases (e.g. If there's a race-condition, the current utterance may finish
> >>> >and the second in the queue may begin speaking before this new utterance is
> >>> >inserted).
> >>>
> >>> Use case for speakNext() method: Consider a news application that plays
> >>> the latest news items. It queues all news items to be played.
Now if > >>> there is a new "breaking news" item that comes in, the speakNext() > >>> method can be used to play it as soon as possible without canceling the > >>> already queued items. > >>> > >>> > >>> > > >>> > > >>> >> Question: Can a cancelled utterance be re-queued? > >>> > > >>> >Good question, and also, what is the lifetime of a > >>> >SpeechSynthesisUtterance > >>> >object and who owns it. There's at least 3 possibilities: > >>> > > >>> >1. The speak() method takes ownership when it adds it to the queue, > >>> >then it > >>> >would presumably be destroyed upon cancel or onend. > >>> > (This raises the questions: what usefulness is > >>> >the SpeechSynthesisUtterance object attribute "ended", since the > >>> >object > >>> >will be destroyed when it turns true. It also makes it messy to use > >>> >the > >>> >other readonly attributes because the object may be deleted suddenly. > >>> >Also, what if the author deletes the SpeechSynthesisUtterance object > >>> >prior > >>> >to it being spoken. One easy way to accidentally create this bug is > >>> >to > >>> >define the SpeechSynthesisUtterance object in a method that goes out > >>> >of > >>> >scope.) > >>> > > >>> >2. The speak() method does not take ownership when it adds it directly > >>> >to > >>> >queue. > >>> > (This raises the question: what if the author deletes the > >>> >SpeechSynthesisUtterance object prior to it being spoken. One easy > >>> >way to > >>> >accidentally create this bug is to define the SpeechSynthesisUtterance > >>> >object in a method that goes out of scope.) > >>> > > >>> >3. The speak() method does not take ownership, it makes a copy of it > >>> >when > >>> >it adds it to queue . > >>> > (This raises the question: how can the author's > >>> >original SpeechSynthesisUtterance object readonly attributes > >>> >(speaking, > >>> >paused, ended) reflect the state of the copy on the queue.) 
> >>> >
> >>> >
> >>> >To resolve these issues, I propose the following, because I think it's the
> >>> >cleanest solution and easiest for authors, since they can create and
> >>> >destroy objects, and go out of scope, without worrying about the speaking
> >>> >queue timing:
> >>> >
> >>> >The speak() method does not take ownership of the SpeechSynthesisUtterance
> >>> >object, it makes a copy of it when it adds it to queue. We eliminate
> >>> >the SpeechSynthesisUtterance readonly attributes, relying instead on events
> >>> >that indicate change in state, including new events for: onpause, onresume.
> >>> >
> >>> >Because it's a copy of the object, this clarifies that:
> >>> >- changes to the original SpeechSynthesisUtterance object after calling
> >>> >  speak() do not affect the copy on the queue.
> >>> >- the same SpeechSynthesisUtterance object can be used to call speak()
> >>> >  multiple times, (even after a copy of which was spoken or cancelled).
> >>> >
> >>> >The new IDL would be:
> >>> >
> >>> >    interface SpeechSynthesisUtterance {
> >>> >      attribute DOMString text;
> >>> >      attribute DOMString lang;
> >>> >      attribute DOMString serviceURI;
> >>> >
> >>> >      attribute Function onstart;
> >>> >      attribute Function onend;
> >>> >*     attribute Function onpause;*
> >>> >*     attribute Function onresume;*
> >>> >    }
> >>> >
> >>> >And the new definition:
> >>> >
> >>> >The speak method
> >>> >This method appends *a copy of* the utterance to the end of the queue for
> >>> >this SpeechSynthesis object. It does not change the paused state of the
> >>> >SpeechSynthesis object. If the SpeechSynthesis object is paused, it
> >>> >remains paused. If it is not paused, then this utterance is spoken if no
> >>> >other utterances are in the queue, else this utterance is queued to begin
> >>> >speaking after the other utterances in the queue have been spoken.
> >>> > > >>> > > >>> >/Glen Shires > >>> > > >>> > > >>> >On Fri, Sep 14, 2012 at 6:05 AM, Jim Barnett > >>> ><Jim.Barnett@genesyslab.com>wrote: > >>> > > >>> >> I would think that cancelling all utterances would be the more > >>> >common use > >>> >> case (so we ought to make it easy). Question: Can a cancelled > >>> >utterance > >>> >> be re-queued? > >>> >> > >>> >> - Jim > >>> >> > >>> >> -----Original Message----- > >>> >> From: Nagesh Kharidi [mailto:nagesh@openstream.com] > >>> >> Sent: Friday, September 14, 2012 8:58 AM > >>> >> To: Glen Shires; Dominic Mazzoni > >>> >> Cc: Hans Wennborg; olli@pettay.fi; public-speech-api@w3.org > >>> >> Subject: Re: TTS proposal to split Utterance into its own interface > >>> >> > >>> >> I would like to propose the following: > >>> >> 1. Provide the ability to cancel all currently queued utterances. A > >>> >new > >>> >> cancelAll method could be added. Alternately, invoking the cancel > >>> >method > >>> >> without the utterance parameter could imply cancel all utterances. > >>> >> > >>> >> 2. New speakNext SpeechSynthesis method > >>> >> This method will append the utterance to the beginning of the queue. > >>> >> > >>> >> 3. New oncancel SpeechSynthesisUtterance event Fired when the > >>> >utterance is > >>> >> canceled. > >>> >> > >>> >> 4. New canceled SpeechSynthesisUtterance attribute true if the > >>> >utterance > >>> >> is canceled. > >>> >> > >>> >> > >>> >> I also had a question regarding the stop method: Is "flushes the > >>> >queue" > >>> >> equivalent to calling cancel on all utterances in the queue? If so, > >>> >I > >>> >> would like to suggest changing "flushes the queue" to "cancels all > >>> >> utterances in the queue". > >>> >> > >>> >> Regards, > >>> >> Nagesh > >>> >> > >>> >> On Thu, 13 Sep 2012 14:13:56 -0700 > >>> >> Glen Shires <gshires@google.com> wrote: > >>> >> >Yes, I like the way you've defined the "speak" method to not change > >>> >the > >>> >> >play/pause state. 
Also, I didn't particularly like the word > >>> >"playback", > >>> >> >so thanks for the alternative "spoken". Here's updated definitions > >>> >> >with your suggestions incorporated. If there's no disagreement, > >>> >I'll > >>> >> >add them to the spec on Monday. > >>> >> > > >>> >> > > >>> >> >SpeechSynthesis Attributes > >>> >> > > >>> >> >pending attribute: > >>> >> >This attribute is true if the queue for this SpeechSynthesis object > >>> >> >contains any utterances which have not started speaking. > >>> >> > > >>> >> >speaking attribute: > >>> >> >This attribute is true if an utterance is being spoken. > >>> >Specifically if > >>> >> >an utterance has begun being spoken and has not completed being > >>> >spoken, > >>> >> >and is independent of whether this SpeechSynthesis object is in the > >>> >> >paused state. > >>> >> > > >>> >> >paused attribute: > >>> >> >The attribute is true when this SpeechSynthesis object is in the > >>> >paused > >>> >> >state. This state is independent of whether anything is in the > >>> >queue. > >>> >> >The > >>> >> >default state of a new SpeechSynthesis object is the non-paused > >>> >state. > >>> >> > > >>> >> > > >>> >> >SpeechSynthesis Methods > >>> >> > > >>> >> >The speak method > >>> >> >This method appends the utterance to the end of the queue for this > >>> >> >SpeechSynthesis object. It does not change the paused state of the > >>> >> >SpeechSynthesis object. If the SpeechSynthesis object is paused, > >>> >it > >>> >> >remains paused. If it is not paused, then this utterance is spoken > >>> >if > >>> >> >no other utterances are in the queue, else this utterance is queued > >>> >to > >>> >> >begin speaking after the other utterances in the queue have been > >>> >> >spoken. > >>> >> > > >>> >> >The cancel method > >>> >> >This method removes the specified utterance from the queue. If it > >>> >is > >>> >> >not in the queue, no changes are made. 
If the utterance removed is > >>> >> >being spoken, speaking ceases for that utterance and the next > >>> >utterance > >>> >> >in the queue (if > >>> >> >any) begins to be spoken. This method does not change the paused > >>> >state > >>> >> >of the SpeechSynthesis object. > >>> >> > > >>> >> >The pause method > >>> >> >This method puts the SpeechSynthesis object into the paused state. > >>> >If > >>> >> >an utterance was being spoken, it pauses mid-utterance. (If called > >>> >when > >>> >> >the SpeechSynthesis object was already in the paused state, it does > >>> >> >nothing.) > >>> >> > > >>> >> >The continue method > >>> >> >This method puts the SpeechSynthesis object into the non-paused > >>> >state. > >>> >> >If > >>> >> >an utterance was speaking (that is, its speaking attribute is > >>> >true), it > >>> >> >continues speaking the utterance at the point at which it was > >>> >paused, > >>> >> >else it begins speaking the next utterance in the queue (if any). > >>> >(If > >>> >> >called when the SpeechSynthesis object was already in the > >>> >non-paused > >>> >> >state, it does nothing.) > >>> >> > > >>> >> >The stop method. > >>> >> >This method puts the SpeechSynthesis object into the paused state > >>> >and > >>> >> >flushes the queue. It sets the speaking attribute to false and the > >>> >> >paused attribute to true. > >>> >> > > >>> >> > > >>> >> >SpeechSynthesisUtterance attributes > >>> >> > > >>> >> > > >>> >> >[[Note, I used SHOULD here because there may be some race-condition > >>> >> >edge-cases where it might not be ignored.]] > >>> >> > > >>> >> >text attribute: > >>> >> >The text to be synthesized for this utterance. Changes to this > >>> >> >attribute after the utterance has been added to the queue (by > >>> >calling > >>> >> >the speak > >>> >> >method) SHOULD be ignored. 
> >>> >> > > >>> >> >lang attribute: > >>> >> >[no change except to append the following] Changes to this > >>> >attribute > >>> >> >after the utterance has been added to the queue (by calling the > >>> >speak > >>> >> >method) > >>> >> >SHOULD be ignored. > >>> >> > > >>> >> >serviceURI attribute: > >>> >> >[no change except to append the following] Changes to this > >>> >attribute > >>> >> >after the utterance has been added to the queue (by calling the > >>> >speak > >>> >> >method) > >>> >> >SHOULD be ignored. > >>> >> > > >>> >> >speaking attribute: > >>> >> >This attribute is true if this specific utterance is currently > >>> >being > >>> >> >spoken. Specifically if this utterance has begun being spoken and > >>> >has > >>> >> >not completed being spoken. This is independent of whether the > >>> >> >SpeechSynthesis object is in a paused state. > >>> >> > > >>> >> >paused attribute: > >>> >> >This attribute is true if this specific utterance has begun to be > >>> >> >spoken, but has not completed and the SpeechSynthesis object is in > >>> >the > >>> >> >paused state. > >>> >> > > >>> >> >ended attribute: > >>> >> >This attribute is true if this specific utterance has completed > >>> >being > >>> >> >spoken. > >>> >> > > >>> >> >SpeechSynthesisUtterance events > >>> >> > > >>> >> >onstart event: > >>> >> >Fired when this utterance has begun to be spoken. > >>> >> > > >>> >> >onend event: > >>> >> >Fired when this utterance has completed being spoken. > >>> >> > > >>> >> > > >>> >> > > >>> >> >On Thu, Sep 13, 2012 at 10:25 AM, Dominic Mazzoni > >>> >> ><dmazzoni@google.com>wrote: > >>> >> > > >>> >> >> Thanks for proposing definitions. 
> >>> >> >> > >>> >> >> On Tue, Sep 11, 2012 at 3:02 AM, Glen Shires <gshires@google.com > > > >>> >> >wrote: > >>> >> >> > I propose the following definitions for the SpeechSynthesis > >>> >IDL: > >>> >> >> > > >>> >> >> > SpeechSynthesis Attributes > >>> >> >> > > >>> >> >> > pending attribute: > >>> >> >> > This attribute is true if the queue contains any utterances > >>> >which > >>> >> >have > >>> >> >> not > >>> >> >> > completed playback. > >>> >> >> > >>> >> >> I was imagining: This attribute is true if the queue contains any > >>> >> >> utterances which have not *started* speaking. > >>> >> >> > >>> >> >> > speaking attribute: > >>> >> >> > This attribute is true if playback is in progress. > >>> >> >> > >>> >> >> I don't like the word "playback", it doesn't fit when the speech > >>> >is > >>> >> >> generated dynamically. How about: This attribute is true if an > >>> >> >> utterance is being spoken. > >>> >> >> > >>> >> >> > paused attribute: > >>> >> >> > **** How is this different than (pending && !speaking) ? **** > >>> >> >> > >>> >> >> This is true if the speech synthesis system is in a paused state, > >>> >> >> independent of whether anything is speaking or queued. > >>> >> >> > >>> >> >> paused && speaking -> it was paused in the middle of an utterance > >>> >> >> paused && !speaking -> no utterance is speaking, but if you call > >>> >> >> speak(), nothing will happen because it's in a paused state. > >>> >> >> > >>> >> >> > > >>> >> >> > SpeechSynthesis Methods > >>> >> >> > > >>> >> >> > The speak method > >>> >> >> > This method appends the utterance to the end of a playback > >>> >queue. > >>> >> >If > >>> >> >> > playback is not in progress, it also begins playback of the > >>> >next > >>> >> >item in > >>> >> >> the > >>> >> >> > queue. > >>> >> >> > >>> >> >> What do you think about rewriting to not use "playback"? 
> >>> >> >> > >>> >> >> Also, my idea was that it would not begin playback if the system > >>> >is > >>> >> >in > >>> >> >> a paused state. > >>> >> >> > >>> >> >> > The cancel method > >>> >> >> > This method removes the first matching utterance (if any) from > >>> >the > >>> >> >> playback > >>> >> >> > queue. If playback is in progress and the utterance removed is > >>> >> >being > >>> >> >> played, > >>> >> >> > playback ceases for the utterance and the next utterance in the > >>> >> >queue (if > >>> >> >> > any) begins playing. > >>> >> >> > >>> >> >> Do we need to say "first matching"? Each utterance should be a > >>> >> >> specific object, it should be either in the queue or not. > >>> >> >> > >>> >> >> > The pause method > >>> >> >> > This method pauses the playback mid-utterance. If playback is > >>> >not > >>> >> >in > >>> >> >> > progress, it does nothing. > >>> >> >> > >>> >> >> I was assuming that calling it would set the system into a paused > >>> >> >> state, so that even a subsequent call to speak() would not do > >>> >> >anything > >>> >> >> other than enqueue. > >>> >> >> > >>> >> >> > The continue method > >>> >> >> > This method continues the playback at the point in the > >>> >utterance > >>> >> >and > >>> >> >> queue > >>> >> >> > in which it was paused. If playback is in progress, it does > >>> >> >nothing. > >>> >> >> > > >>> >> >> > The stop method. > >>> >> >> > This method stops playback mid-utterance and flushes the queue. > >>> >> >> > > >>> >> >> > > >>> >> >> > SpeechSynthesisUtterance attributes > >>> >> >> > > >>> >> >> > text attribute: > >>> >> >> > The text to be synthesized for this utterance. This attribute > >>> >must > >>> >> >not be > >>> >> >> > changed after onstart fires. > >>> >> >> > >>> >> >> I'd say: changes to this attribute after the utterance has been > >>> >> >added > >>> >> >> to the queue (by calling "speak") will be ignored. 
> >>> >> >> OR, we should make
> >>> >> >> it a DOM exception to modify it when it's in the speech queue.
> >>> >> >>
> >>> >> >> > paused attribute:
> >>> >> >> > This attribute is true if this specific utterance is in the queue and has
> >>> >> >> > not completed playback.
> >>> >> >>
> >>> >> >> I think this should only be true if it has begun speaking but not
> >>> >> >> completed.
> >>> >> >>
> >>> >> >> - Dominic
> >>> >> >>
> >>> >>
> >>> >> --
> >>> >> NOTICE TO RECIPIENT:
> >>> >> THIS E-MAIL IS MEANT FOR ONLY THE INTENDED RECIPIENT OF THE TRANSMISSION,
> >>> >> AND MAY BE A COMMUNICATION PRIVILEGED BY LAW. IF YOU RECEIVED THIS E-MAIL
> >>> >> IN ERROR, ANY REVIEW, USE, DISSEMINATION, DISTRIBUTION, OR COPYING OF THIS
> >>> >> E-MAIL IS STRICTLY PROHIBITED. PLEASE NOTIFY US IMMEDIATELY OF THE ERROR
> >>> >> BY RETURN E-MAIL AND PLEASE DELETE THIS MESSAGE FROM YOUR SYSTEM. THANK YOU
> >>> >> IN ADVANCE FOR YOUR COOPERATION.
> >>> >> Reply to : legal@openstream.com
> >>> >>
> >>>
> >>
> >
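
For readers following the thread, the queue behaviour it converges on (speak() appends a copy and never changes the paused state; cancel() empties the queue and stops the current utterance without touching the paused state; pause() and continue() toggle the paused state) can be sketched roughly as below. This is only an illustrative model under stated assumptions, not the spec's IDL and not a browser API: the names SynthQueueModel, Utterance, finishCurrent and currentText are hypothetical, and resume() is used as a stand-in for the thread's continue().

```typescript
// Hypothetical model of the queue semantics discussed in this thread.
// It performs no speech synthesis; it only tracks state.

interface Utterance {
  text: string;
  lang?: string;
}

class SynthQueueModel {
  private queue: Utterance[] = [];          // utterances that have not started speaking
  private current: Utterance | null = null; // utterance being spoken, if any
  private pausedState = false;              // default state is non-paused

  get pending(): boolean { return this.queue.length > 0; }
  get speaking(): boolean { return this.current !== null; }
  get paused(): boolean { return this.pausedState; }

  // Appends a copy, so later changes to the caller's object are not seen.
  // Does not change the paused state.
  speak(utterance: Utterance): void {
    this.queue.push({ ...utterance });
    this.startNextIfIdle();
  }

  // Removes all utterances and stops the current one; paused state is untouched.
  cancel(): void {
    this.queue = [];
    this.current = null;
  }

  pause(): void {
    this.pausedState = true;
  }

  // Stand-in for the thread's continue(): leaves the paused state and
  // starts the next utterance if nothing is currently being spoken.
  resume(): void {
    this.pausedState = false;
    this.startNextIfIdle();
  }

  // Hypothetical hook standing in for the synthesizer finishing an utterance.
  finishCurrent(): void {
    if (this.pausedState) return; // a paused utterance cannot finish
    this.current = null;
    this.startNextIfIdle();
  }

  currentText(): string | null {
    return this.current ? this.current.text : null;
  }

  private startNextIfIdle(): void {
    if (this.pausedState || this.current !== null) return;
    this.current = this.queue.shift() ?? null;
  }
}

// Usage: "speak this immediately" becomes cancel() followed by speak().
const synth = new SynthQueueModel();
synth.speak({ text: "queued news item" });
synth.cancel();
synth.speak({ text: "breaking news" });
console.log(synth.currentText());
```

In this model the stop() behaviour from the earlier drafts is simply cancel() followed by pause(), which matches the reasoning given in the thread for dropping stop().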
Received on Tuesday, 18 September 2012 19:48:13 UTC