Re: VoiceXML vs SALT

From: William S. Meisel <wmeisel@tmaa.com>
Date: Sun, 26 Oct 2003 14:19:41 -0800
Message-ID: <008101c39c0f$41267df0$6602a8c0@BILLDELL8100>
To: "'Ildar Gabdulline'" <ildar@realeastnetworks.com>, "Al Gilman" <asgilman@iamdigex.net>
Cc: <www-voice@w3.org>
I agree on the core differences between VoiceXML and SALT:

- Objective: SALT will encompass multimodal as well as pure telephony applications.
- Technical: SALT is an incremental approach that "tags" on to existing Web protocols (pun intended), while VoiceXML is being used largely as a standalone speech and touch-tone application language (by the way, many fielded VoiceXML applications are multimodal, if you count the keypad). 

However, a market reality is that most of the first SALT deployments will be voice-only, addressing standard telephones (where the market is today), and the Microsoft Speech Server (built on SALT and now in beta, to be released in the first half of next year) is targeted at exactly the same applications as VoiceXML. In practice, SALT and VoiceXML will be direct competitors in the marketplace. And, to confuse the issue further, VoiceXML can be used today in certain types of multimodal applications, such as sequential multimodality, where a request is made on a voice call and the requested data is delivered in a separate data call as text. 

- Bill Meisel
President, TMA Associates
Editor, Speech Recognition Update

  ----- Original Message ----- 
  From: Al Gilman 
  To: 'Ildar Gabdulline' 
  Cc: www-voice@w3.org 
  Sent: Sunday, October 26, 2003 9:22 AM
  Subject: RE: VoiceXML vs SALT

  >From: Ildar Gabdulline [mailto:ildar@realeastnetworks.com]

  >Could you please describe for me what the differences are between VoiceXML
  >and SALT?
  >As I understand it at the moment, both of them are used for the same purpose:
  >programming dialogs.
  >Please clarify the situation, if it is possible.

  As Chris Royles almost said, it's simple.  VoiceXML is about designing
  dialogs, and SALT is not.

  SALT is about providing multi-modal richness to media events (display or
  input transactions) within Web dialogs.  It doesn't address the flow of a
  dialog, just the modal binding of one transaction at a time.  The dialog, to
  the extent that it is captured in any Web consensus format, is in SMIL or
  XForms or what have you.

  SALT and VoiceXML are two design points in the broad spectrum of multimodal
  interaction environments, or as the Device Independence community would term
  them, 'delivery contexts.'

  For the work that is going forward to create a continuum of capability
  subsuming the currently existing islands of capability, see the website
  of the W3C Multimodal Interaction activity


  ... and in particular the presentations in Session 2 of the last W3C
  Technical Plenary.

  [Find "Anywhere" in:]

  For past discussion on this question:



  At 11:02 AM 2003-10-26, Royles, Chris wrote:
  >Wow - that was a brave statement to make on the w3c forum.
  >Other people, especially those more involved may have a different slant on 
  >this than I do. Also I am not clear about what 'information' you are after 
  >or why you wish to make this comparison. But from standing on the 
  >sidelines, I see them as follows...
  >First, SALT and VxML cannot really be compared with each other on the same 
  >terms, they have different goals, although there are times when the 
  >technologies they use overlap.
  >  - VxML is the culmination of work by many different groups (now the Voice
  > Browser working group) bringing together different ideas to solve a problem;
  > the VxML standard is currently being developed by the W3C. The problem they
  > are trying to solve is to simplify the differing proprietary
  > technologies used for building IVR services and callflows. This was a big
  > problem in the industry about 5 years ago: it was difficult to port one
  > voice IVR solution to a different platform. The user typically interacts
  > with VxML over a telephone; it runs on the server and is used by
  > the engine as a script for how to handle the call.
  >I use VxML daily to define services that execute within an IVR 
  >environment. The VxML browser built into the telephony application 
  >interprets the VxML and generates a resulting navigation through a 
  >service. The VxML is a small part of a much larger infrastructure 
  >supporting ASR, TTS and Call control facilities. The statement "Its major 
  >goal is to bring the advantages of web-based development and content 
  >delivery to interactive voice response applications" probably sums up the 
  >  - SALT is a standard introduced by another group
  > (http://www.saltforum.org/default.asp#FounderComments)
  > of research and industry experts. It provides 'integration' with
  > existing markup languages such as HTML to provide new 'voice enabled',
  > multimodal interfaces. I consider SALT as trying to 'introduce a new
  > technology', in the form of multimodal interaction. The SALT language is
  > typically interpreted on the client. An example would be downloading HTML
  > content to view on your hand-held device: the browser rendering the HTML
  > can also interpret SALT tags, providing a voice interface to the dialogs
  > and forms within the HTML content. This is a multimodal interface. I
  > always consider SALT closer to SMIL than to VxML. I think the key
  > statement is "The Speech Application Language Tags extend existing
  > mark-up languages such as HTML, XHTML, and XML."
  > http://www.saltforum.org/default.asp#About%20SALT
  >There is certainly scope for having both standards, as they approach 
  >different problems from different directions. In some cases the two
  >technologies can be complementary: a SALT-enabled multimodal dialog
  >with rich visual content is pushed to enhanced clients, while standard
  >voice-only IVR interfaces are still provided using VxML.
  >Does anybody have any papers or research that has studied this question?
  >I always see scope for looking at the overlap between these two languages. 
  >It would be good to develop the definition of a dialog once, yet still be 
  >able to present that dialog as both VxML or SALT (or SMIL) based on the 
  >regards, Chris Royles
  >Vicorp                  Dr Christopher Royles
  >Wexham Springs          Senior Software Engineer
  >Framewood Road          +44 (0)1753 660 583
  >Wexham                  +44 (0)1753 660 501
  >SL3 6PJ                 chris.royles@vicorp.com
  >Great Britain           http://www.vicorp.com
  >-----Original Message-----
  >From: Ildar Gabdulline [mailto:ildar@realeastnetworks.com]
  >Sent: 26 October 2003 15:21
  >To: www-voice@w3.org
  >Subject: VoiceXML vs SALT
  >I am relatively new to voice dialogs.
  >Could you please describe for me what the differences are between VoiceXML
  >and SALT?
  >As I understand it at the moment, both of them are used for the same purpose:
  >programming dialogs.
  >If this is correct then it seems that having two standard families is 
  >Please clarify the situation, if it is possible.
Received on Sunday, 26 October 2003 17:19:48 UTC
