RE: Critical missing feature in SSML specification from Shires, Glen on 2003-01-29 (www-voice@w3.org from January to March 2003)

From: Shires, Glen <glen.shires@intel.com>
Date: Wed, 29 Jan 2003 15:30:26 -0700
To: Janina Sajka <janina@afb.net>, www-voice@w3.org
Message-ID: <3D90571297ABD511957400508BF29C4302FDF9D6@pysmsx101.py.intel.com>

Janina,
Excellent points!

- pause/resume
- jump backward
- jump forward
- stop, then start another item (another SSML document)

These are all excellent use cases. <MARK> is very useful for providing coarse
milestones, but I cannot envision an effective way of doing the above
actions with SSML markup tags. I can, however, envision how a scripted
object could have methods for each of the above.
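
As a rough illustration of what I mean by a scripted object, here is a
minimal Python sketch. Everything in it is hypothetical: the class name
SpeechController, its method names, and the idea of modeling each SSML
document as a list of text chunks separated by marks are my own
assumptions for illustration, not part of SSML or of any TTS engine API.
A real engine would also have to deal with buffering and latency, which
this sketch ignores.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SpeechController:
    """Hypothetical scripted object; not part of SSML or any real TTS API."""

    # Each item stands in for one SSML document, split into chunks at marks.
    items: List[List[str]]
    item_index: int = 0
    chunk_index: int = 0
    paused: bool = False
    spoken: List[str] = field(default_factory=list)

    def speak_next(self) -> Optional[str]:
        """Render one chunk; returns None when paused or at the end of the item."""
        if self.paused:
            return None
        chunks = self.items[self.item_index]
        if self.chunk_index >= len(chunks):
            return None
        text = chunks[self.chunk_index]
        self.chunk_index += 1
        self.spoken.append(text)
        return text

    def pause(self) -> None:
        self.paused = True

    def resume(self) -> None:
        self.paused = False

    def jump_backward(self, count: int = 1) -> None:
        """Back up by `count` chunks, clamped to the start of the item."""
        self.chunk_index = max(0, self.chunk_index - count)

    def jump_forward(self, count: int = 1) -> None:
        """Skip ahead by `count` chunks, clamped to the end of the item."""
        limit = len(self.items[self.item_index])
        self.chunk_index = min(limit, self.chunk_index + count)

    def stop_and_start(self, item_index: int) -> None:
        """Stop the current item and start another one from its beginning."""
        if not 0 <= item_index < len(self.items):
            raise IndexError("no such item")
        self.item_index = item_index
        self.chunk_index = 0
        self.paused = False


controller = SpeechController([["Hello", "World"], ["Next", "item"]])
controller.speak_next()        # renders "Hello"
controller.jump_backward()     # back up one chunk
controller.speak_next()        # renders "Hello" again
controller.stop_and_start(1)   # abandon item 0, begin item 1
controller.speak_next()        # renders "Next"
```

The point of the sketch is only that each of the four actions maps to a
method call made at any time by the application, rather than to a tag
that must be placed at the right position in markup that is already
being rendered.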

Glen Shires
Intel Corporation

-----Original Message-----
From: Janina Sajka [mailto:janina@afb.net]
Sent: Wednesday, January 29, 2003 1:54 PM
To: www-voice@w3.org
Subject: Re: Critical missing feature in SSML specification



Perhaps we should expand the use case somewhat?

One reason to know where one stopped is to resume reading from that
point. Another is to back up a bit and resume. These are common
practices.

Another, as with an iterated list, is to stop reading the current item
and move immediately, without pause, to the next (or previous)
item--also a common use.



Richard Schwerdtfeger writes:
> 
> Glen,
> 
> I think I agree with you.
> 
> I don't think you want to include STOP in annotated markup.
> 
> For example, you would not want to have to deal with strategically
> inserting the Stop in the following sequence:
> 
> <speak><mark>Hello</mark><mark>World</mark></speak>
> 
> The user issues a stop asynchronously to normal speaking. The buffering,
> latency, etc. could be vastly different based on the speech engine. You do
> however want to be able to send the following:
> 
> <STOP/>
> 
> At any time.
> 
> But at the same time know what markers were processed by the speech
> engine.
> 
> Rich
> 
> 
> Rich Schwerdtfeger
> STSM, Software Group Accessibility Strategist
> Emerging Internet Technologies
> Chair, IBM Accessibility Architecture Review  Board
> schwer@us.ibm.com, Phone: 512-838-4593,T/L: 678-4593
> 
> "Two roads diverged in a wood, and I -
> I took the one less traveled by, and that has made all the difference.",
> Frost
> 
> From: "Shires, Glen" <glen.shires@intel.com>
> Sent: 01/29/2003 01:30 PM
> To: David Poehlman <poehlman1@comcast.net>,
>     Richard Schwerdtfeger/Austin/IBM@IBMUS, www-voice@w3.org
> Cc: w3c-wai-pf@w3.org
> Subject: RE: Critical missing feature in SSML specification
> 
> David,
> If I understand your view on this, the "voice browser in this instance"
> would use something like DOM to manipulate the SSML. If so, I would think
> it would be difficult to know precisely where in the SSML document to
> insert the <STOP> tag, because one would need to know exactly which point
> in the SSML document the renderer is currently processing. While <MARK>
> can coarsely help with this, I envision numerous complexities in terms of
> pipeline buffers, latency and race conditions. I would think
> implementation would be vastly easier and more robust if a "stop" command
> (e.g. from a scripted object) were simply sent to the TTS engine/renderer
> (as opposed to attempting to dynamically insert a markup tag at the proper
> position in the markup).
> 
> Thanks,
> Glen Shires
> Intel Corporation
> 
> 
> -----Original Message-----
> From: David Poehlman [mailto:poehlman1@comcast.net]
> Sent: Wednesday, January 29, 2003 10:47 AM
> To: Shires, Glen; www-voice@w3.org
> Cc: w3c-wai-pf@w3.org
> Subject: Re: Critical missing feature in SSML specification
> 
> 
> I view SSML markup in the same way that I view HTML or XML markup.  The
> user agent retrieves it, and from there it is under user agent control.
> The voice browser in this instance would have to have the capability of
> manipulating the markup in the same way as other agents manipulate HTML or
> XML.  While I understand a requirement for a full stop, it must be in
> post-get, since it would most likely be of no benefit in pre-get or in the
> data set.  In the case of streaming, it is still a function of another
> layer which exercises control.  I would encourage that this idea be kept
> but enforced in a context where it can have effect.
> 
> ----- Original Message -----
> From: "Shires, Glen" <glen.shires@intel.com>
> To: <www-voice@w3.org>
> Cc: <w3c-wai-pf@w3.org>
> Sent: Wednesday, January 29, 2003 1:25 PM
> Subject: RE: Critical missing feature in SSML specification
> 
> 
> 
> Richard,
> I understand why the scenario you describe requires a "stop" command. I do
> not understand how a <STOP> markup tag would fulfill these requirements.
> It seems to me that the SSML markup would already be generated and in the
> process of being spoken by the TTS engine when an event that initiates the
> "stop" command occurs. I can envision how a scripted object might
> accomplish this, but not how a <STOP> markup tag would do so.
> 
> Perhaps you could explain.
> 
> Thanks,
> Glen Shires
> Intel Corporation
> 
> 
> -----Original Message-----
> From: Richard Schwerdtfeger [mailto:schwer@us.ibm.com]
> Sent: Wednesday, January 29, 2003 9:37 AM
> To: www-voice@w3.org
> Cc: w3c-wai-pf@w3.org
> Subject: Critical missing feature in SSML specification
> Importance: High
> 
> In reviewing the SSML specification we (PF Group) overlooked an extremely
> critical missing feature in the last call draft.
> 
> It is absolutely essential that SSML support a <STOP> command.
> 
> Scenario:
> 
> Screen reader users will often hit the stop command to tell the speech
> synthesizer to stop speaking. Screen readers would use the <MARK>
> annotation as a way to have the speech engine tell the screen reader when
> speech has been processed (marker processed). In the event that the user
> tells the screen reader to stop speaking, the screen reader should be able
> to send a stop command to the speech engine, which would ultimately flush
> the speech buffers. Markers not returned would help the screen reader know
> where the user left off in the user interface (maintain point of regard
> relative to what has been spoken).
> 
> I apologize for not submitting this in our last call review, but this is a
> hard requirement. Otherwise, SSML cannot support screen readers.
> 
> Rich
> 
> Rich Schwerdtfeger
> STSM, Software Group Accessibility Strategist
> Emerging Internet Technologies
> Chair, IBM Accessibility Architecture Review  Board
> schwer@us.ibm.com, Phone: 512-838-4593,T/L: 678-4593
> 
> "Two roads diverged in a wood, and I -
> I took the one less traveled by, and that has made all the difference.",
> Frost
> 

-- 
	
				Janina Sajka, Director
				Technology Research and Development
				Governmental Relations Group
				American Foundation for the Blind (AFB)

Email: janina@afb.net		Phone: (202) 408-8175
Received on Wednesday, 29 January 2003 17:31:02 UTC