- From: Dominic Mazzoni <dmazzoni@google.com>
- Date: Tue, 18 Sep 2012 00:05:51 -0700
- To: Glen Shires <gshires@google.com>
- Cc: Nagesh Kharidi <nagesh@openstream.com>, Jim Barnett <Jim.Barnett@genesyslab.com>, Hans Wennborg <hwennborg@google.com>, olli@pettay.fi, public-speech-api@w3.org
Looking good. Just one suggestion: how about replacing "this
SpeechSynthesis object" with "the global SpeechSynthesis instance" or
something that indicates there's just a single global SpeechSynthesis.

- Dominic

On Mon, Sep 17, 2012 at 10:25 PM, Glen Shires <gshires@google.com> wrote:
> I've updated the spec with the above SpeechSynthesis and
> SpeechSynthesisUtterance IDL and definitions:
> https://dvcs.w3.org/hg/speech-api/rev/b036c78e9445
>
> As always, the current draft spec is at:
> http://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html
>
> On Sat, Sep 15, 2012 at 1:31 PM, Glen Shires <gshires@google.com> wrote:
>>
>> Nagesh,
>> I agree that cancelAll() is useful and can make code simpler because it
>> doesn't affect the paused state. In fact, I propose that we add cancelAll()
>> and remove stop() -- because the stop function is probably less common and
>> can easily be accomplished with two calls: cancelAll() and pause().
>>
>> Also, since canceling a specific utterance is not very useful, and
>> questionable as Jerry states, I propose eliminating cancel(utterance). If we
>> do that, then we could rename cancelAll() more simply as cancel().
>>
>> Thus, I propose this IDL:
>>
>> interface SpeechSynthesis {
>>   static readonly attribute boolean pending;
>>   static readonly attribute boolean speaking;
>>   static readonly attribute boolean paused;
>>
>>   static void speak(SpeechSynthesisUtterance utterance);
>>   static void cancel();
>>   static void pause();
>>   static void continue();
>> }
>>
>> and I propose this new definition of cancel:
>>
>> The cancel method
>> This method removes all utterances from the queue. If an utterance is
>> being spoken, speaking ceases immediately. This method does not change the
>> paused state of the SpeechSynthesis object.
>>
>> /Glen Shires
>>
>>
>> On Sat, Sep 15, 2012 at 3:25 AM, Nagesh Kharidi <nagesh@openstream.com>
>> wrote:
>>>
>>> Please see inline.
>>>
>>> Regards,
>>> Nagesh
>>>
>>> On Fri, 14 Sep 2012 12:59:41 -0700
>>> Glen Shires <gshires@google.com> wrote:
>>> >> Provide the ability to cancel all currently queued utterances.
>>> >
>>> >The stop() method cancels all queued utterances. (Dominic proposed that
>>> >this method be named stopAndFlushQueue(), would that name be more
>>> >clear?)
>>>
>>> In addition to canceling all queued utterances, the stop() method also
>>> pauses the SpeechSynthesis object. A separate cancelAll() method would
>>> be useful, without which, if a new utterance is to be spoken
>>> immediately, we would have to do:
>>>   speechSynthesis.stop();
>>>   speechSynthesis.continue();
>>>   speechSynthesis.speak(utterance);
>>>
>>> With a cancelAll() method, this would be:
>>>   speechSynthesis.cancelAll();
>>>   speechSynthesis.speak(utterance);
>>>
>>> Since this would be such a common usage, we could make it even easier
>>> for developers by either:
>>> - providing a speakImmediate(utterance) method that cancels all queued
>>>   utterances and then starts speaking the new utterance
>>> or
>>> - adding a second parameter as follows to the speak() method:
>>>   speechSynthesis.speak(utterance, speakImmediately);
>>>   If speakImmediately is true, all currently queued utterances will be
>>>   canceled and the new utterance will be spoken.
>>>
>>> >
>>> >Also, what is the use case for the current cancel(utterance) method? In
>>> >all the use cases I envision, you'd want to cancel all queued
>>> >utterances. Can we eliminate cancel()?
>>>
>>> I also agree that canceling a specific utterance is not very useful.
>>> Canceling all queued utterances would be more common than canceling a
>>> specific utterance.
>>>
>>> >
>>> >
>>> >> New speakNext SpeechSynthesis method - append the utterance to the
>>> >> beginning of the queue
>>> >
>>> >I'd like more discussion on this. What are the use cases? What are the
>>> >edge cases (e.g. if there's a race-condition, the current utterance may
>>> >finish and the second in the queue may begin speaking before this new
>>> >utterance is inserted).
>>>
>>> Use case for speakNext() method: Consider a news application that plays
>>> the latest news items. It queues all news items to be played. Now if
>>> there is a new "breaking news" item that comes in, the speakNext()
>>> method can be used to play it as soon as possible without canceling the
>>> already queued items.
>>>
>>>
>>> >
>>> >
>>> >> Question: Can a cancelled utterance be re-queued?
>>> >
>>> >Good question, and also, what is the lifetime of a SpeechSynthesisUtterance
>>> >object and who owns it. There are at least 3 possibilities:
>>> >
>>> >1. The speak() method takes ownership when it adds it to the queue, then it
>>> >would presumably be destroyed upon cancel or onend.
>>> > (This raises the questions: what usefulness is the
>>> >SpeechSynthesisUtterance object attribute "ended", since the object
>>> >will be destroyed when it turns true. It also makes it messy to use the
>>> >other readonly attributes because the object may be deleted suddenly.
>>> >Also, what if the author deletes the SpeechSynthesisUtterance object prior
>>> >to it being spoken. One easy way to accidentally create this bug is to
>>> >define the SpeechSynthesisUtterance object in a method that goes out of
>>> >scope.)
>>> >
>>> >2. The speak() method does not take ownership when it adds it directly to
>>> >queue.
>>> > (This raises the question: what if the author deletes the
>>> >SpeechSynthesisUtterance object prior to it being spoken. One easy way to
>>> >accidentally create this bug is to define the SpeechSynthesisUtterance
>>> >object in a method that goes out of scope.)
>>> >
>>> >3. The speak() method does not take ownership, it makes a copy of it when
>>> >it adds it to queue.
>>> > (This raises the question: how can the author's original
>>> >SpeechSynthesisUtterance object readonly attributes (speaking,
>>> >paused, ended) reflect the state of the copy on the queue.)
>>> >
>>> >
>>> >To resolve these issues, I propose the following, because I think it's the
>>> >cleanest solution and easiest for authors, since they can create and
>>> >destroy objects, and go out of scope, without worrying about the speaking
>>> >queue timing:
>>> >
>>> >The speak() method does not take ownership of the SpeechSynthesisUtterance
>>> >object, it makes a copy of it when it adds it to queue. We eliminate
>>> >the SpeechSynthesisUtterance readonly attributes, relying instead on events
>>> >that indicate change in state, including new events for: onpause,
>>> >onresume.
>>> >
>>> >Because it's a copy of the object, this clarifies that:
>>> >- changes to the original SpeechSynthesisUtterance object after calling
>>> >speak() do not affect the copy on the queue.
>>> >- the same SpeechSynthesisUtterance object can be used to call speak()
>>> >multiple times, (even after a copy of which was spoken or cancelled).
>>> >
>>> >The new IDL would be:
>>> >
>>> > interface SpeechSynthesisUtterance {
>>> >   attribute DOMString text;
>>> >   attribute DOMString lang;
>>> >   attribute DOMString serviceURI;
>>> >
>>> >   attribute Function onstart;
>>> >   attribute Function onend;
>>> >*  attribute Function onpause;*
>>> >*  attribute Function onresume;*
>>> > }
>>> >
>>> >
>>> >And the new definition:
>>> >
>>> >The speak method
>>> >This method appends *a copy of* the utterance to the end of the queue for
>>> >this SpeechSynthesis object. It does not change the paused state of the
>>> >SpeechSynthesis object. If the SpeechSynthesis object is paused, it
>>> >remains paused. If it is not paused, then this utterance is spoken if no
>>> >other utterances are in the queue, else this utterance is queued to begin
>>> >speaking after the other utterances in the queue have been spoken.
>>> >
>>> >
>>> >/Glen Shires
>>> >
>>> >
>>> >On Fri, Sep 14, 2012 at 6:05 AM, Jim Barnett
>>> ><Jim.Barnett@genesyslab.com> wrote:
>>> >
>>> >> I would think that cancelling all utterances would be the more common use
>>> >> case (so we ought to make it easy). Question: Can a cancelled utterance
>>> >> be re-queued?
>>> >>
>>> >> - Jim
>>> >>
>>> >> -----Original Message-----
>>> >> From: Nagesh Kharidi [mailto:nagesh@openstream.com]
>>> >> Sent: Friday, September 14, 2012 8:58 AM
>>> >> To: Glen Shires; Dominic Mazzoni
>>> >> Cc: Hans Wennborg; olli@pettay.fi; public-speech-api@w3.org
>>> >> Subject: Re: TTS proposal to split Utterance into its own interface
>>> >>
>>> >> I would like to propose the following:
>>> >> 1. Provide the ability to cancel all currently queued utterances. A new
>>> >> cancelAll method could be added. Alternately, invoking the cancel method
>>> >> without the utterance parameter could imply cancel all utterances.
>>> >>
>>> >> 2. New speakNext SpeechSynthesis method
>>> >> This method will append the utterance to the beginning of the queue.
>>> >>
>>> >> 3. New oncancel SpeechSynthesisUtterance event Fired when the utterance is
>>> >> canceled.
>>> >>
>>> >> 4. New canceled SpeechSynthesisUtterance attribute true if the utterance
>>> >> is canceled.
>>> >>
>>> >>
>>> >> I also had a question regarding the stop method: Is "flushes the queue"
>>> >> equivalent to calling cancel on all utterances in the queue? If so, I
>>> >> would like to suggest changing "flushes the queue" to "cancels all
>>> >> utterances in the queue".
>>> >>
>>> >> Regards,
>>> >> Nagesh
>>> >>
>>> >> On Thu, 13 Sep 2012 14:13:56 -0700
>>> >> Glen Shires <gshires@google.com> wrote:
>>> >> >Yes, I like the way you've defined the "speak" method to not change the
>>> >> >play/pause state. Also, I didn't particularly like the word "playback",
>>> >> >so thanks for the alternative "spoken". Here's updated definitions
>>> >> >with your suggestions incorporated. If there's no disagreement, I'll
>>> >> >add them to the spec on Monday.
>>> >> >
>>> >> >
>>> >> >SpeechSynthesis Attributes
>>> >> >
>>> >> >pending attribute:
>>> >> >This attribute is true if the queue for this SpeechSynthesis object
>>> >> >contains any utterances which have not started speaking.
>>> >> >
>>> >> >speaking attribute:
>>> >> >This attribute is true if an utterance is being spoken. Specifically if
>>> >> >an utterance has begun being spoken and has not completed being spoken,
>>> >> >and is independent of whether this SpeechSynthesis object is in the
>>> >> >paused state.
>>> >> >
>>> >> >paused attribute:
>>> >> >The attribute is true when this SpeechSynthesis object is in the paused
>>> >> >state. This state is independent of whether anything is in the queue.
>>> >> >The default state of a new SpeechSynthesis object is the non-paused
>>> >> >state.
>>> >> >
>>> >> >
>>> >> >SpeechSynthesis Methods
>>> >> >
>>> >> >The speak method
>>> >> >This method appends the utterance to the end of the queue for this
>>> >> >SpeechSynthesis object. It does not change the paused state of the
>>> >> >SpeechSynthesis object. If the SpeechSynthesis object is paused, it
>>> >> >remains paused. If it is not paused, then this utterance is spoken if
>>> >> >no other utterances are in the queue, else this utterance is queued to
>>> >> >begin speaking after the other utterances in the queue have been
>>> >> >spoken.
>>> >> >
>>> >> >The cancel method
>>> >> >This method removes the specified utterance from the queue. If it is
>>> >> >not in the queue, no changes are made. If the utterance removed is
>>> >> >being spoken, speaking ceases for that utterance and the next utterance
>>> >> >in the queue (if any) begins to be spoken. This method does not change
>>> >> >the paused state of the SpeechSynthesis object.
>>> >> >
>>> >> >The pause method
>>> >> >This method puts the SpeechSynthesis object into the paused state. If
>>> >> >an utterance was being spoken, it pauses mid-utterance. (If called when
>>> >> >the SpeechSynthesis object was already in the paused state, it does
>>> >> >nothing.)
>>> >> >
>>> >> >The continue method
>>> >> >This method puts the SpeechSynthesis object into the non-paused state.
>>> >> >If an utterance was speaking (that is, its speaking attribute is true),
>>> >> >it continues speaking the utterance at the point at which it was paused,
>>> >> >else it begins speaking the next utterance in the queue (if any). (If
>>> >> >called when the SpeechSynthesis object was already in the non-paused
>>> >> >state, it does nothing.)
>>> >> >
>>> >> >The stop method.
>>> >> >This method puts the SpeechSynthesis object into the paused state and
>>> >> >flushes the queue. It sets the speaking attribute to false and the
>>> >> >paused attribute to true.
>>> >> >
>>> >> >
>>> >> >SpeechSynthesisUtterance attributes
>>> >> >
>>> >> >
>>> >> >[[Note, I used SHOULD here because there may be some race-condition
>>> >> >edge-cases where it might not be ignored.]]
>>> >> >
>>> >> >text attribute:
>>> >> >The text to be synthesized for this utterance. Changes to this
>>> >> >attribute after the utterance has been added to the queue (by calling
>>> >> >the speak method) SHOULD be ignored.
>>> >> >
>>> >> >lang attribute:
>>> >> >[no change except to append the following] Changes to this attribute
>>> >> >after the utterance has been added to the queue (by calling the speak
>>> >> >method) SHOULD be ignored.
>>> >> >
>>> >> >serviceURI attribute:
>>> >> >[no change except to append the following] Changes to this attribute
>>> >> >after the utterance has been added to the queue (by calling the speak
>>> >> >method) SHOULD be ignored.
>>> >> >
>>> >> >speaking attribute:
>>> >> >This attribute is true if this specific utterance is currently being
>>> >> >spoken. Specifically if this utterance has begun being spoken and has
>>> >> >not completed being spoken. This is independent of whether the
>>> >> >SpeechSynthesis object is in a paused state.
>>> >> >
>>> >> >paused attribute:
>>> >> >This attribute is true if this specific utterance has begun to be
>>> >> >spoken, but has not completed and the SpeechSynthesis object is in the
>>> >> >paused state.
>>> >> >
>>> >> >ended attribute:
>>> >> >This attribute is true if this specific utterance has completed being
>>> >> >spoken.
>>> >> >
>>> >> >SpeechSynthesisUtterance events
>>> >> >
>>> >> >onstart event:
>>> >> >Fired when this utterance has begun to be spoken.
>>> >> >
>>> >> >onend event:
>>> >> >Fired when this utterance has completed being spoken.
>>> >> >
>>> >> >
>>> >> >
>>> >> >On Thu, Sep 13, 2012 at 10:25 AM, Dominic Mazzoni
>>> >> ><dmazzoni@google.com> wrote:
>>> >> >
>>> >> >> Thanks for proposing definitions.
>>> >> >>
>>> >> >> On Tue, Sep 11, 2012 at 3:02 AM, Glen Shires <gshires@google.com>
>>> >> >> wrote:
>>> >> >> > I propose the following definitions for the SpeechSynthesis IDL:
>>> >> >> >
>>> >> >> > SpeechSynthesis Attributes
>>> >> >> >
>>> >> >> > pending attribute:
>>> >> >> > This attribute is true if the queue contains any utterances which
>>> >> >> > have not completed playback.
>>> >> >>
>>> >> >> I was imagining: This attribute is true if the queue contains any
>>> >> >> utterances which have not *started* speaking.
>>> >> >>
>>> >> >> > speaking attribute:
>>> >> >> > This attribute is true if playback is in progress.
>>> >> >>
>>> >> >> I don't like the word "playback", it doesn't fit when the speech is
>>> >> >> generated dynamically. How about: This attribute is true if an
>>> >> >> utterance is being spoken.
>>> >> >>
>>> >> >> > paused attribute:
>>> >> >> > **** How is this different than (pending && !speaking) ? ****
>>> >> >>
>>> >> >> This is true if the speech synthesis system is in a paused state,
>>> >> >> independent of whether anything is speaking or queued.
>>> >> >>
>>> >> >> paused && speaking -> it was paused in the middle of an utterance
>>> >> >> paused && !speaking -> no utterance is speaking, but if you call
>>> >> >> speak(), nothing will happen because it's in a paused state.
>>> >> >>
>>> >> >> >
>>> >> >> > SpeechSynthesis Methods
>>> >> >> >
>>> >> >> > The speak method
>>> >> >> > This method appends the utterance to the end of a playback queue. If
>>> >> >> > playback is not in progress, it also begins playback of the next
>>> >> >> > item in the queue.
>>> >> >>
>>> >> >> What do you think about rewriting to not use "playback"?
>>> >> >>
>>> >> >> Also, my idea was that it would not begin playback if the system is in
>>> >> >> a paused state.
>>> >> >>
>>> >> >> > The cancel method
>>> >> >> > This method removes the first matching utterance (if any) from the
>>> >> >> > playback queue. If playback is in progress and the utterance removed
>>> >> >> > is being played, playback ceases for the utterance and the next
>>> >> >> > utterance in the queue (if any) begins playing.
>>> >> >>
>>> >> >> Do we need to say "first matching"? Each utterance should be a
>>> >> >> specific object, it should be either in the queue or not.
>>> >> >>
>>> >> >> > The pause method
>>> >> >> > This method pauses the playback mid-utterance. If playback is not in
>>> >> >> > progress, it does nothing.
>>> >> >>
>>> >> >> I was assuming that calling it would set the system into a paused
>>> >> >> state, so that even a subsequent call to speak() would not do anything
>>> >> >> other than enqueue.
>>> >> >>
>>> >> >> > The continue method
>>> >> >> > This method continues the playback at the point in the utterance and
>>> >> >> > queue in which it was paused. If playback is in progress, it does
>>> >> >> > nothing.
>>> >> >> >
>>> >> >> > The stop method.
>>> >> >> > This method stops playback mid-utterance and flushes the queue.
>>> >> >> >
>>> >> >> >
>>> >> >> > SpeechSynthesisUtterance attributes
>>> >> >> >
>>> >> >> > text attribute:
>>> >> >> > The text to be synthesized for this utterance. This attribute must
>>> >> >> > not be changed after onstart fires.
>>> >> >>
>>> >> >> I'd say: changes to this attribute after the utterance has been added
>>> >> >> to the queue (by calling "speak") will be ignored. OR, we should make
>>> >> >> it a DOM exception to modify it when it's in the speech queue.
>>> >> >>
>>> >> >> > paused attribute:
>>> >> >> > This attribute is true if this specific utterance is in the queue
>>> >> >> > and has not completed playback.
>>> >> >>
>>> >> >> I think this should only be true if it has begun speaking but not
>>> >> >> completed.
>>> >> >>
>>> >> >> - Dominic
>>> >> >>
>>> >>
>>> >> --
>>> >> NOTICE TO RECIPIENT:
>>> >> THIS E-MAIL IS MEANT FOR ONLY THE INTENDED RECIPIENT OF THE TRANSMISSION,
>>> >> AND MAY BE A COMMUNICATION PRIVILEGED BY LAW. IF YOU RECEIVED THIS E-MAIL
>>> >> IN ERROR, ANY REVIEW, USE, DISSEMINATION, DISTRIBUTION, OR COPYING OF THIS
>>> >> E-MAIL IS STRICTLY PROHIBITED. PLEASE NOTIFY US IMMEDIATELY OF THE ERROR
>>> >> BY RETURN E-MAIL AND PLEASE DELETE THIS MESSAGE FROM YOUR SYSTEM. THANK YOU
>>> >> IN ADVANCE FOR YOUR COOPERATION.
>>> >> Reply to : legal@openstream.com
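
The queue semantics proposed in this thread (speak() appends a copy and never changes the paused state; cancel() empties the queue and stops the current utterance without touching the paused state; pause() and continue() only toggle the paused state) can be sketched as a small state model. This is only an illustration under stated assumptions, not the spec's or any browser's implementation: the class name SpeechQueueModel, the Utterance shape, the finishCurrent() hook and the currentText getter are hypothetical, and continue() is spelled resume() here simply to avoid the reserved word. No speech is produced; the model tracks state only.

```typescript
// Hypothetical model of the queue semantics discussed in this thread.
// It is NOT the browser API: it tracks state only and produces no audio.
interface Utterance {
  text: string;
  lang?: string;
}

class SpeechQueueModel {
  private queue: Utterance[] = [];          // utterances that have not started
  private current: Utterance | null = null; // utterance currently being spoken
  private isPaused = false;                 // default is the non-paused state

  // True if any queued utterance has not started speaking.
  get pending(): boolean { return this.queue.length > 0; }
  // True if an utterance has begun and not completed, even while paused.
  get speaking(): boolean { return this.current !== null; }
  // True when in the paused state, independent of the queue contents.
  get paused(): boolean { return this.isPaused; }
  // Illustration-only accessor for the text being spoken.
  get currentText(): string | null {
    return this.current !== null ? this.current.text : null;
  }

  // Appends a copy of the utterance; does not change the paused state.
  speak(utterance: Utterance): void {
    this.queue.push({ text: utterance.text, lang: utterance.lang });
    this.advance();
  }

  // Removes all utterances and stops the current one; paused is unchanged.
  cancel(): void {
    this.queue = [];
    this.current = null;
  }

  pause(): void { this.isPaused = true; }

  // Stands in for the proposed continue() method.
  resume(): void {
    this.isPaused = false;
    this.advance();
  }

  // Stand-in for the synthesizer reaching the end of the current utterance.
  // A paused utterance cannot finish, so this is a no-op while paused.
  finishCurrent(): void {
    if (this.isPaused) return;
    this.current = null;
    this.advance();
  }

  // Starts the next queued utterance if nothing is speaking and not paused.
  private advance(): void {
    if (this.isPaused || this.current !== null) return;
    const next = this.queue.shift();
    if (next !== undefined) this.current = next;
  }
}
```

Under this model, the "speak immediately" pattern from the thread is cancel() followed by speak(utterance), and the removed stop() is cancel() followed by pause().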
Received on Tuesday, 18 September 2012 07:06:20 UTC