RE: comments and questions by a user of VXML

From: McGlashan, Scott <scott.mcglashan@hp.com>
Date: Mon, 30 Aug 2004 20:34:00 +0200
Message-ID: <990C7D7B4096DC49B7BDB5D4F56E5B9F04D008@sooexc02.emea.cpqcorp.net>
To: "Dennis de Champeaux" <ddc@cutter.rexx.com>, <www-voice@w3.org>

Hi Dennis,

Thanks for your comments on using VoiceXML 2.0. It is always good to
hear feedback from developers, positive or negative. 

On your general comments concerning VoiceXML as a programming language:
VoiceXML follows the general W3C pattern of defining an XML markup
language to generate user interface behavior, on the assumption that a
server-side application development environment (Java, .NET, C++,
etc.) generates this markup. Previous attempts to standardize speech
application development using APIs didn't get wide cross-industry
support, hence VoiceXML as an XML-based approach, which has achieved
general acceptance across the industry. However, I do understand that
this model will not satisfy everyone, especially those coming from the
perspective of API-based programming models. This is an issue that we
are interested in for the next major version of VoiceXML, which is
intended to have a layered model with primitive objects (e.g. for
speech recognition) which are then exposed in markup on an XML element
layer (where you could use ECMAScript to access and manipulate these
primitives). Given the W3C focus on markup, it is not our current
intention to standardize bindings of these objects to programming
languages; that could be done by other standardization activities
outside W3C (e.g. Lisp bindings by a group focusing on Lisp). However,
W3C's decisions are made by its members, and members can lobby for
adding bindings to programming languages such as Lisp.

On some of your other general points:
- VoiceXML's support for grammars is based on SRGS (see
http://www.w3.org/TR/speech-grammar for documentation). Modelled on
JSAPI, this provides BNF and XML constructs for speech and DTMF grammars
based on a context-free grammar processing model, which is sufficient
for the majority of speech applications. If you have a specific use case
which it doesn't cover, please let us know.
- dialog pragmatics. Can you explain what you are looking for here? 

Specific VoiceXML points:
- Since the <form> (and its associated interpretation algorithm) is the
heart of VoiceXML, constructs like <block> are nested inside it. The
<block> element is explained in 2.3.2 in the VoiceXML specification.
- For speech output, VoiceXML uses SSML (Speech Synthesis Markup
Language - http://www.w3.org/TR/speech-synthesis). VoiceXML maps its
<prompt> element into SSML's <speak> element, which is why <say-as>
must be included inside a <prompt>. However, VoiceXML could have been
designed to allow other SSML constructs, not just bare text, outside a
<prompt> element.
- sentence and paragraph constructs are part of SSML and explained in
that specification (hyperlinked within the VoiceXML specification). 
- in your example of two forms in a single document, the second form is
not executed because, under the form interpretation algorithm, the
first form completes and contains no instruction to go to another form,
so execution terminates.
- try-catch not following the scope rules of java/javascript/c++. While
the general model is similar, the 'as-if-by-copy' semantics are
different and were designed to fit with the variable scoping model used
in the form interpretation algorithm.
- "&amp;&amp;" (escaped '&&') for "and"; "||" for "or"; boolean
expressions in the format "inLove == true" - these are ECMAScript
expressions, hence follow its rules. It is possible to write the
equivalent of !a or (b and !c), namely "!a || (b &amp;&amp; !c)". I
agree that XML and ECMAScript are not the most compact formats.
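
To make the last point concrete, here is a minimal ECMAScript sketch.
The variable names a, b and c come from your question. The helper names
notAOrBAndNotC and escapeForAttribute are hypothetical, chosen only for
illustration, and are not part of VoiceXML. The sketch writes the
expression in plain ECMAScript and then escapes it as it would need to
appear inside an XML attribute such as cond.

```javascript
// Hypothetical helper: the expression "!a or (b and !c)" in ECMAScript.
function notAOrBAndNotC(a, b, c) {
  return !a || (b && !c);
}

// Hypothetical helper: escape an ECMAScript expression so that it can be
// placed inside a double-quoted XML attribute value.
function escapeForAttribute(expr) {
  return expr
    .replace(/&/g, "&amp;")   // must run first, before other entities are added
    .replace(/</g, "&lt;")
    .replace(/"/g, "&quot;");
}

const expr = "!a || (b && !c)";
const condAttribute = 'cond="' + escapeForAttribute(expr) + '"';

console.log(condAttribute); // prints: cond="!a || (b &amp;&amp; !c)"
```

Only "&&" changes when the expression moves into markup, because "&" is
reserved in XML; "||" and "!" pass through untouched.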
Please let me know if I haven't answered your questions or addressed
your comments.



Co-chair, Voice Browser Working Group 

-----Original Message-----
From: www-voice-request@w3.org [mailto:www-voice-request@w3.org] On
Behalf Of Dennis de Champeaux
Sent: Monday, August 16, 2004 17:03
To: www-voice@w3.org
Cc: Dennis de Champeaux
Subject: comments and questions by a user of VXML

I used VXML to provide access by phone to www.HealthCheck4Me.info. The
functionality is now available and even a Dutch version may work.
VXML was however very painful.
Along the way I recorded gripes, questions and comments.  They range
from the microscopic to the philosophical, see below.

Dennis de Champeaux   OntoOO Inc  email: ddc@ontooo.com & ddc@acm.org
Page: 408 581 2185    Mesg: 408 559 7264
Address: 14519 Bercaw Ln, San Jose, CA 95124, USA

I am OK. You are OK?  If in doubt: www.HealthCheck4Me.info

File: c:/ddc/Memo/VXMLcomments.txt

- Why is it necessary to put a block inside a form?
  This looks bizarre:
         <form> <block> I only want to tell you I love you. </block> </form>
- Why is it not possible to use say-as inside a block?
  I.e. why does one have to wrap a prompt around a sentence containing
  a say-as as in:
         <prompt> I owe you <say-as type=number> 100 </say-as> dollar. </prompt>
- Given the need to use a form to do simple things (see above) why is
only one form executed in:
         <form> <block> I only want to tell you I love you. </block> </form>
         <form> <block> I only want to tell you I still love you. </block> </form>

- What is the purpose of block? paragraph? sentence?

- Why is try-catch not following the scope rules of java/

- Why do we have to write "&amp;&amp;" if we mean "and"?

- Why do we have to write "||" if we mean "or"?

- Why do we have to write the boolean expression:
  <if cond="inLove == true">
  if we simply mean: <if inLove>

- Is it possible to write the equivalence of say:
  !a or (b and !c) ??

[ The description of VXML by Hocek & Cuddihy is so bad that they must be
banned to the information author's gulag where forever they must repair
the holes in punch cards. ]

- TCL was the most ugly language of the 90-ies.  VXML has now taken over.
The language appears to have no iteration (while, for) and no
recursion.  But it DOES have the goto primitive, which was banned by
Dijkstra 30 years ago.  There is no function abstraction, nor are there
object-oriented constructs.

- VXML is an interpreted language using Javascript.  Why not use only
Javascript with a bundle of speech-specific predefined functions?
Hacking java-servlet code already entails generating HTML and
I don't see why we have to follow the same painful route with VXML.
We DO need a Javascript ONLY version of VXML!

- Woods created in the early 70-ies ATNs for NL processing.   He
separated the lexicon (in which the entries have syntactic and semantic
mark ups) from the parser that embodies acceptable syntax constructs and
which can create a "deep" semantic output.  VXML's grammar construct
appears to scratch only the surface of his ATNs (and is ill documented -
if documented at all).

- I don't see the beginnings of capturing dialog pragmatics.

- The VXML syntax borrows from XML, which borrows from HTML, which
borrows from ...  This syntax is OK as a mark up notation for text.
But for a programming language .... nah.  Just consider the availability
of a perfect alternative, which has been around since the late fifties:
LISP.  Instead of:
<vxml> ... </vxml>
one simply writes:
(vxml ... )
Moreover LISP is an interpreted language with fantastic semantics.
For example it has functions with arbitrary number of arguments.  This
allows for example to write:
(prompt I owe you (say-as (type number) 100) dollar.)
[instead of:
         <prompt> I owe you <say-as type=number> 100 </say-as> dollar. </prompt> ]
It would even allow immediate use of Woods's ATNs ...
Hence we actually need a LISP ONLY version of VXML!
Received on Monday, 30 August 2004 18:34:44 UTC
