- From: T. V. Raman <tvraman@us.ibm.com>
- Date: Wed, 29 Jan 2003 16:31:31 -0800
- To: "Shires, Glen" <glen.shires@intel.com>
- Cc: David Poehlman <poehlman1@comcast.net>, Richard Schwerdtfeger <schwer@us.ibm.com>, www-voice@w3.org, w3c-wai-pf@w3.org
The mark tag is designed to return the index mark as things get spoken.
As others have pointed out, stop itself does not belong in SSML; rather,
an app that is generating a stream of SSML should insert mark commands
at appropriate points so that when it issues a stop command to the
engine to which it has previously sent an SSML stream, it can then know
how far speech has progressed based on its tracking of the returned
index marks.

>>>>> "Shires," == Shires, Glen <glen.shires@intel.com> writes:

Shires,> David, If I understand your view on this, the "voice
Shires,> browser in this instance" would use something like DOM to
Shires,> manipulate the SSML. If so, I would think it would be
Shires,> difficult to know precisely where in the SSML document to
Shires,> insert the <STOP> tag because one would need to know
Shires,> exactly which point in the SSML document the renderer is
Shires,> currently processing. While <MARK> can coarsely help with
Shires,> this, I envision numerous complexities in terms of
Shires,> pipeline-buffers, latency and race conditions. I would
Shires,> think implementation would be vastly easier and more
Shires,> robust if a "stop" command (e.g. from a scripted object)
Shires,> was simply sent to the TTS-engine/renderer (as opposed to
Shires,> attempting to dynamically insert a markup tag at the
Shires,> proper position in the markup).

Shires,> Thanks, Glen Shires Intel Corporation

Shires,> -----Original Message-----
Shires,> From: David Poehlman [mailto:poehlman1@comcast.net]
Shires,> Sent: Wednesday, January 29, 2003 10:47 AM
Shires,> To: Shires, Glen; www-voice@w3.org
Shires,> Cc: w3c-wai-pf@w3.org
Shires,> Subject: Re: Critical missing feature in SSML specification

Shires,> I view SSML markup in the same way that I view HTML or
Shires,> XML markup. The user agent retrieves it and from there
Shires,> it is under user agent control. The voice browser in
Shires,> this instance would have to have the capability of
Shires,> manipulating the markup in the same way as other agents
Shires,> manipulate HTML or XML. While I understand a requirement
Shires,> for a full stop, it must be in post get since it could
Shires,> most likely be of no benefit in pre-get or in the data
Shires,> set. In the case of streaming, it is still a function of
Shires,> another layer which exercises control. I would
Shires,> encourage that this idea be kept but enforced in a
Shires,> context where it can have effect.

Shires,> ----- Original Message -----
Shires,> From: "Shires, Glen" <glen.shires@intel.com>
Shires,> To: <www-voice@w3.org>
Shires,> Cc: <w3c-wai-pf@w3.org>
Shires,> Sent: Wednesday, January 29, 2003 1:25 PM
Shires,> Subject: RE: Critical missing feature in SSML specification

Shires,> Richard, I understand why the scenario you describe
Shires,> requires a "stop" command. I do not understand how a
Shires,> <STOP> markup tag would fulfill these requirements. It
Shires,> seems to me that the SSML markup would be already
Shires,> generated and in process of being spoken by the TTS
Shires,> engine when an event that initiates the "stop" command
Shires,> occurs. I can envision how a scripted object might
Shires,> accomplish this, but not how a <STOP> markup tag would do
Shires,> so.

Shires,> Perhaps you could explain.

Shires,> Thanks, Glen Shires Intel Corporation

Shires,> -----Original Message-----
Shires,> From: Richard Schwerdtfeger [mailto:schwer@us.ibm.com]
Shires,> Sent: Wednesday, January 29, 2003 9:37 AM
Shires,> To: www-voice@w3.org
Shires,> Cc: w3c-wai-pf@w3.org
Shires,> Subject: Critical missing feature in SSML specification
Shires,> Importance: High

Shires,> In reviewing the SSML specification we (PF Group)
Shires,> overlooked an extremely critical missing feature in the
Shires,> last call draft.

Shires,> It is absolutely essential that SSML support a <STOP>
Shires,> command.

Shires,> Scenario:

Shires,> Screen reader users will often hit the stop command to
Shires,> tell the speech synthesizer to stop speaking. Screen
Shires,> readers would use the <MARK> annotation as a way to have
Shires,> the speech engine tell the screen reader when speech has
Shires,> been processed (marker processed). In the event that the
Shires,> user tells the screen reader to stop speaking, the screen
Shires,> reader should be able to send a stop command to the
Shires,> speech engine which would ultimately flush the speech
Shires,> buffers. Markers not returned would help the screen
Shires,> reader know where the user left off in the user interface
Shires,> (maintain point of regard relative to what has been
Shires,> spoken).

Shires,> I apologize for not submitting this in our last call
Shires,> review but this is a hard requirement. Otherwise, SSML
Shires,> cannot support screen readers.

Shires,> Rich

Shires,> Rich Schwerdtfeger STSM, Software Group Accessibility
Shires,> Strategist Emerging Internet Technologies Chair, IBM
Shires,> Accessibility Architecture Review Board
Shires,> schwer@us.ibm.com, Phone: 512-838-4593, T/L: 678-4593

Shires,> "Two roads diverged in a wood, and I - I took the one
Shires,> less traveled by, and that has made all the difference.",
Shires,> Frost

--
Best Regards,
--raman
------------------------------------------------------------
T. V. Raman: PhD (Cornell University)
IBM Research: Human Language Technologies
Architect: Conversational And Multimodal WWW Standards
Phone: 1 (408) 927 2608 T-Line 457-2608
Fax: 1 (408) 927 3012 Cell: 1 650 799 5724
Email: tvraman@us.ibm.com
WWW: http://www.cs.cornell.edu/home/raman
AIM: TVRaman
PGP: http://emacspeak.sf.net/raman.asc
Snail: IBM Almaden Research Center, 650 Harry Road San Jose 95120
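
The mark-tracking approach described at the top of this message can be sketched roughly as follows. This is only an illustration in Python, not any engine's real API: the `MarkTracker` class, its method names, and the `m0`, `m1`, ... mark naming are all hypothetical, and the engine is assumed to call `on_mark` once for each mark, in order, as speech reaches it. Only the `<speak>` and `<mark name="..."/>` elements come from SSML itself.

```python
from dataclasses import dataclass, field
from xml.sax.saxutils import escape


@dataclass
class MarkTracker:
    """Hypothetical helper: inserts SSML marks and tracks the ones returned."""
    sent: list = field(default_factory=list)     # mark names placed in the stream
    reached: list = field(default_factory=list)  # mark names the engine reported

    def build_ssml(self, chunks):
        """Wrap text chunks in <speak>, placing a <mark/> after each chunk."""
        parts = ["<speak>"]
        for i, text in enumerate(chunks):
            name = f"m{i}"  # assumed naming scheme, not mandated by SSML
            self.sent.append(name)
            parts.append(f'{escape(text)}<mark name="{name}"/>')
        parts.append("</speak>")
        return "".join(parts)

    def on_mark(self, name):
        """Callback an engine would invoke as each mark is reached (assumed)."""
        self.reached.append(name)

    def resume_index(self):
        """After a stop, index of the first chunk not confirmed as spoken."""
        return len(self.reached)


# Usage: the app sends the SSML, the engine reports m0, then the user hits stop.
tracker = MarkTracker()
ssml = tracker.build_ssml(["Hello", "A & B", "Bye"])
tracker.on_mark("m0")
print(ssml)
print(tracker.resume_index())
```

The stop command itself stays outside the markup, as argued above: the app issues it to the engine through whatever control channel the engine offers, then consults the tracker to restore the point of regard.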
Received on Wednesday, 29 January 2003 19:32:05 UTC