Request for clarification on the behaviour of <record> (fwd) from Dave Raggett on 2003-04-24 (www-voice@w3.org from April to June 2003)

From: Dave Raggett <dsr@w3.org>
Date: Thu, 24 Apr 2003 23:41:57 +0100 (BST)
To: www-voice@w3.org
cc: Ufuk Kayserilioglu <kayseri@phonoclick.com>
Message-ID: <Pine.LNX.4.53.0304242341190.1201@localhost.localdomain>

---------- Forwarded message ----------
Date: Thu, 24 Apr 2003 12:10:20 -0400 (EDT)
From: Ufuk Kayserilioglu <kayseri@phonoclick.com>
To: www-voice@w3.org
Subject: [Moderator Action] Request for clarification on the behaviour of
    <record>

Hi,

First of all, I would like to thank everyone involved in putting
together a multitude of successful specs such as VoiceXML, SRGS, et al.
However, from time to time there are areas where these specs seem
unclear from an implementor's point of view.

We are trying to implement the <record> tag in our Voice Browser in a
conformant way; however, we cannot work out exactly what the spec
requires of a browser for this tag. My points can be summed up as
follows:

I) The main confusion arises from the behaviour of bargein="true"
prompts in <record>. According to Fig 7 in section 2.3.6 (lower left
corner), bargein controls apply to audio queued within <record>. On the
other hand, a few lines below, it is stated:

"A /recording begins/ at the earliest after the playback of any prompts
(including the 'beep' tone if defined). As an optimization, a platform
may begin recording when the user starts speaking."

Now, if recording does not begin DURING the prompt playback, then how
can those prompts be barged in on? Or should we understand that if the
user barges in with voice during prompt playback, THEN recording should
be started? A clarification of how <record> and barge-in on audio
queued within <record> interact is, in our opinion, badly needed.
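
To make the question concrete, here is a minimal sketch of the kind of
document we have in mind. The form id, variable name, prompt text, and
attribute values are illustrative assumptions on our part, not taken
from the spec:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="leave_message">
    <record name="msg" beep="true" maxtime="30s" finalsilence="3s">
      <!-- Per Fig 7, bargein controls should apply to this prompt.
           Open question: if the caller speaks while it is playing,
           is the prompt cut off, and does recording start then? -->
      <prompt bargein="true">
        Please record your message after the beep.
      </prompt>
      <filled>
        <!-- Reached only if some audio was actually collected. -->
        <prompt>Your recording was <audio expr="msg"/></prompt>
      </filled>
    </record>
  </form>
</vxml>
```

With this document, it is unclear to us whether speech during the
prompt should stop the prompt, start the recording, both, or neither.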

II) The second comment that baffles us in the spec is:

"If no audio is collected during execution of <record>, then the record
variable remains unfilled (note
<http://www.w3.org/TR/voicexml20/#unfilled_record>). This can occur, for
example, when DTMF or speech input is received during prompt playback or
the timeout interval (if the developer wants input during prompt
playback to initiate recording, then prompts should be placed in an
immediately preceding <field> with a zero timeout)." (Section 2.3.6)

This comment is weird in two ways:

  1) How can the record variable remain unfilled "when DTMF or speech
input is received during ... the timeout interval"? This seems to be
the primary method of filling a record variable.

  2) We cannot grasp, in any way, how it would be possible to achieve
what the spec author has stated within the parentheses. If there is a
preceding <field> with zero timeout then:
    i) if the user starts speaking while the prompts in the <field> are
playing then the input goes to the processing of the field and will be
matched to whatever grammar is specified for it, or will throw a "nomatch",
    ii) else if the user waits for the prompts to finish, then a
"noinput" event will be thrown.
  In neither case will the input go to the <record> tag that follows
the <field> tag. If the spec is trying to say something else, then it
should be clearly explained.
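
As we read it, the parenthetical suggestion corresponds to something
like the sketch below. This is only our interpretation; the form id,
field name, grammar URI (lead_in.grxml), and prompt text are
hypothetical placeholders:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="prompt_then_record">
    <field name="lead_in">
      <!-- Zero timeout, as the spec's parenthetical suggests. -->
      <property name="timeout" value="0s"/>
      <!-- Hypothetical grammar; a field needs one to accept speech. -->
      <grammar src="lead_in.grxml" type="application/srgs+xml"/>
      <prompt bargein="true">
        Please record your message after the beep.
      </prompt>
      <!-- Case i: caller speaks during the prompt. The input is
           matched against the field grammar, or nomatch is thrown. -->
      <nomatch>
        <assign name="lead_in" expr="true"/>
      </nomatch>
      <!-- Case ii: caller waits for the prompt to finish. With a
           zero timeout, noinput is thrown. -->
      <noinput>
        <assign name="lead_in" expr="true"/>
      </noinput>
    </field>
    <record name="msg" beep="true" maxtime="30s"/>
  </form>
</vxml>
```

In both cases the caller's speech is consumed by the <field>, so we do
not see how it could end up in the msg recording.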

We would like the VBWG to clarify these issues before we go on with our
implementation.

Thank you,

Ufuk Kayserilioglu
PhonoClick
www.phonoclick.com

Received on Thursday, 24 April 2003 18:38:53 UTC