RE: VoiceXML 2.0: Official Response #1 to Candidate Recommendation Issues from Guillaume Berche on 2003-12-16 (www-voice@w3.org from October to December 2003)

From: Guillaume Berche <guillaume.berche@eloquant.com>
Date: Tue, 16 Dec 2003 09:44:31 +0100
To: "McGlashan, Scott" <scott.mcglashan@hp.com>
Cc: <www-voice@w3.org>
Message-ID: <ELEGLIHGLLIBFPCIGAKGCEGDDMAA.guillaume.berche@eloquant.com>

Scott,

Thanks for your detailed response concerning the point on recording. This
clarification certainly helps in understanding the specs on this point.

> Guillaume, please let us know whether you accept this disposition. If
> you do not explicitly require the clarification concerning the throwing of
> <noinput> and <nomatch> events by recognition during recording, the
> group will use its discretion in whether the clarification needs to be
> applied.

I am indeed satisfied with this disposition, and I do not explicitly require
the clarification concerning the throwing of <noinput> and <nomatch> events
by recognition during recording, given that this feature is optional and
seldom supported. Moreover, your explanation is quite clear to me and
publicly referenceable.

Thanks again and best regards,

Guillaume.

> -----Original Message-----
> From: www-voice-request@w3.org [mailto:www-voice-request@w3.org] On
> Behalf Of McGlashan, Scott
> Sent: Sunday, 14 December 2003 17:43
> To: Guillaume Berche
> Cc: www-voice@w3.org
> Subject: RE: VoiceXML 2.0: Official Response #1 to Candidate
> Recommendation Issues
>
>
>
> Guillaume,
>
> Thank you again for your timely response and your acceptance of our
> disposition on these issues.
>
> On your one remaining issue, CR5-13, we propose the following revised
> resolution.
>
> CR5-13 accepted with modifications
>
> We believe that when recording begins is clearly defined: in Section
> 2.3.6, it states:
>
> "A recording begins at the earliest after the playback of any prompts
> (including the 'beep' tone if defined). As an optimization, a platform
> may begin recording when the user starts speaking."
>
> i.e., the recording may include initial silence, etc., if the platform does
> not use the optimization (e.g. voice activity detection). With the
> optimization, the recording can begin with the user's speech. Whether
> music or other audio triggers voice activity detection is
> platform-specific. Note that this behavior applies independently of
> whether speech recognition is supported (while the recording and
> recognition processes use the same audio data stream, these processes
> are independent and therefore their voice activity detection mechanisms
> may differ).
>
> The timeout interval is clearly defined: "A timeout interval is defined
> to begin immediately after prompt playback (including the 'beep' tone if
> defined) and its duration is determined by the 'timeout' property."
>
> The timeout interval has an effect on both recording and recognition
> (which are logically independent).
>
> For recording, the impact is specified in "If the timeout interval is
> exceeded before recording begins, then a <noinput> event is thrown." In
> the case of non-optimized recording, recording always begins after
> prompt playback, so <noinput> would never be thrown. With optimized
> recording, however, <noinput> may be thrown if no voice activity is
> detected before the timeout interval elapses.
>
> For recognition, the situation is more complex. We are modifying the
> specification (due to implementation report feedback) so that if
> recognition is supported during recording (this is an optional feature),
> then only non-local speech grammars are active. If a non-local speech
> grammar is matched by audio input, then execution is immediately
> transferred to its enclosing element. This raises the issue of whether a
> <noinput> or <nomatch> could be thrown by the recognition process. A
> <noinput> could be generated if the timeout interval has elapsed. A
> <nomatch> could be generated if the audio triggers recognition but does
> not match the active grammar. Our belief is that throwing these events
> by the recognition process during recording is undesirable and not what
> VoiceXML authors expect. Consequently, we are considering clarifying the
> specification to make it clear that <noinput> and <nomatch> events are
> never thrown from the recognition process during recording.
>
>
> Guillaume, please let us know whether you accept this disposition. If
> you do not explicitly require the clarification concerning the throwing of
> <noinput> and <nomatch> events by recognition during recording, the
> group will use its discretion in whether the clarification needs to be
> applied.
>
> Thanks
>
> Scott
>
>
>
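
The recording and timeout rules quoted above can be summarized in a small
model. What follows is only a sketch in Python, not text from the VoiceXML
specification. The names RecordAttempt, recording_outcome and
recognition_outcome are hypothetical. Time is measured in seconds from the
end of prompt playback (including the beep), non-optimized recording is
assumed to start at time zero, and a voice onset exactly at the timeout is
assumed not to exceed it. The recognition part reflects the clarification
the group says it is considering, not adopted wording.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class RecordAttempt:
    timeout_s: float                # value of the 'timeout' property, in seconds
    optimized: bool                 # True if recording starts on voice activity
    voice_onset_s: Optional[float]  # when voice activity is detected; None if never


def recording_outcome(attempt: RecordAttempt) -> Tuple[Optional[str], Optional[float]]:
    """Return (event, recording_start_s) for the recording process."""
    if not attempt.optimized:
        # Non-optimized: recording begins right after prompt playback,
        # so the timeout can never elapse before recording begins.
        return (None, 0.0)
    onset = attempt.voice_onset_s
    if onset is None or onset > attempt.timeout_s:
        # Optimized: no voice activity before the timeout interval elapsed.
        return ("noinput", None)
    # Optimized: recording begins with the user's speech.
    return (None, onset)


def recognition_outcome(matched_nonlocal_grammar: bool) -> Optional[str]:
    """Recognition during recording (optional feature), under the
    clarification being considered: a non-local grammar match transfers
    execution; otherwise recognition throws neither noinput nor nomatch."""
    return "transfer" if matched_nonlocal_grammar else None
```

For example, recording_outcome(RecordAttempt(5.0, True, None)) yields a
noinput event, whereas the same call with optimized=False yields no event
and a recording that starts at time zero and may contain initial silence.
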
Received on Tuesday, 16 December 2003 03:49:25 UTC