News Release: World Wide Web Consortium Issues Speech Synthesis Markup Language (SSML) 1.0 as a W3C Recommendation from Susan Lesch on 2004-09-08 (w3c-news@w3.org from July to September 2004)

From: Susan Lesch <lesch@w3.org>
Date: Wed, 8 Sep 2004 09:04:49 -0500
To: w3c-news@w3.org
Message-Id: <0C215565-01A0-11D9-A783-000A95D04A34@w3.org>

Today, the World Wide Web Consortium (W3C) published the Speech 
Synthesis Markup Language (SSML) 1.0 as a W3C Recommendation. SSML 1.0, 
a fundamental specification in the W3C Speech Interface Framework, 
elevates the role of high-quality synthesized speech in Web 
interactions. Application designers for mobile phones, personal digital 
assistants (PDAs), and a host of emerging technologies use SSML 1.0 to 
achieve both coarse- and fine-grain control of important aspects of 
speech synthesis, including pronunciation, volume, and pitch. Like its 
companion W3C Recommendations VoiceXML 2.0 and Speech Recognition 
Grammar Specification (SRGS) published by the W3C Voice Browser Working 
Group, SSML 1.0 is built for integration with other Web technologies 
and to promote interoperability across different synthesis-capable 
platforms.

For more information, please contact Karen Myers, W3C Media Relations 
Manager, at +1.617.253.5884 or +1.978.502.6218 (karen@w3.org) or 
contact the W3C Communications representative in your region, listed at 
the bottom of this email.

===============================================================

World Wide Web Consortium Issues SSML 1.0 as a W3C Recommendation

High-Quality Synthesized Speech Bolsters Speech Interface Framework

Web Resources:

This press release:
     In English: http://www.w3.org/2004/09/ssml-pressrelease.html.en
     In French: http://www.w3.org/2004/09/ssml-pressrelease.html.fr

Testimonials:
     http://www.w3.org/2004/09/ssml-testimonial.html

Voice Browser Activity:
     http://www.w3.org/Voice/

http://www.w3.org/ -- 8 September 2004 -- Strengthening the voice of 
the Web, the World Wide Web Consortium (W3C) has published the Speech 
Synthesis Markup Language (SSML) 1.0 as a W3C Recommendation. SSML 1.0, 
a fundamental specification in the W3C Speech Interface Framework, 
elevates the role of high-quality synthesized speech in Web 
interactions. Application designers for mobile phones, personal digital 
assistants (PDAs), and a host of emerging technologies use SSML 1.0 to 
achieve both coarse- and fine-grain control of important aspects of 
speech synthesis, including pronunciation, volume, and pitch. Like its 
companion W3C Recommendations VoiceXML 2.0 and Speech Recognition 
Grammar Specification (SRGS) published by the W3C Voice Browser Working 
Group, SSML 1.0 is built for integration with other Web technologies 
and to promote interoperability across different synthesis-capable 
platforms.

"I am excited about the progress the Voice Browser Working Group has 
made in providing improved access to services over the telephone 
through the use of Web technologies," said W3C Director Tim 
Berners-Lee, who will be delivering a keynote address at the SpeechTEK 
Conference next week. He added, "Companies can now offer Web access to 
their customers via the telephone as well as from a personal computer."

Aimed at the world's estimated two billion fixed line and mobile
phones, W3C's Speech Interface Framework -- a collection of
specifications for building voice applications for the Web -- will
allow an unprecedented number of people to use any telephone to
interact with appropriately designed Web-based services via key pads
and spoken commands, and by listening to pre-recorded speech,
synthetic speech, and music.

A World Wide Web Consortium (W3C) Recommendation is understood by 
industry and the Web community at large as a Web standard. Each 
Recommendation is a stable specification developed by a W3C Working 
Group and reviewed by the W3C Membership. Recommendations promote 
interoperability of Web technologies by explicitly conveying the 
industry consensus formed by the Working Group.

A Rich Vocabulary for High-Quality Speech

One of the primary challenges to strengthening the voice of the Web 
that SSML addresses is pronunciation. For example, how do you pronounce 
"1/2"? The SSML 1.0 specification uses this simple example to 
illustrate some of the challenges of turning general purpose text into 
meaningful synthesized speech. Without additional context, one would 
not know whether to say "one half" or "January second" or "February 
first" or "one divided by two". SSML 1.0 constructs help eliminate this 
sort of ambiguity. The SSML vocabulary allows word-level, 
phoneme-level, and even waveform-level control of the output to satisfy 
a wide spectrum of application scenarios and authoring requirements.
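
As a rough illustration of word-level control, the sketch below wraps
the ambiguous token "1/2" in the SSML 1.0 <sub> element, whose alias
attribute carries the text to be spoken. The helper name
speak_with_alias and the sample sentences are hypothetical and not
taken from the specification; Python is used only to assemble the
markup and check that it is well-formed.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# Namespace of SSML 1.0 documents.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def speak_with_alias(before, token, alias, after, lang="en-US"):
    """Return an SSML document in which `token` is spoken as `alias`.

    Hypothetical helper for illustration: the written form stays in the
    element content and the spoken form goes in the alias attribute.
    """
    sub = f"<sub alias={quoteattr(alias)}>{escape(token)}</sub>"
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang={quoteattr(lang)}>'
        f"{escape(before)}{sub}{escape(after)}"
        "</speak>"
    )


# The same written token, resolved two different ways by the author.
recipe = speak_with_alias("Add ", "1/2", "one half", " cup of sugar.")
meeting = speak_with_alias("We meet on ", "1/2", "January second", ".")

# Parsing confirms the markup is well-formed XML.
for document in (recipe, meeting):
    ET.fromstring(document)
    print(document)
```

A synthesis processor that receives the first document is told to say
"one half"; the second resolves the same token as a date. Phoneme-level
and waveform-level control use other SSML elements (such as <phoneme>
and <audio>) that this sketch does not cover.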

"SSML builds on the work of the pioneers in speech synthesis to provide 
application developers with a powerful and flexible means to deliver a 
high quality mix of synthetic and pre-recorded speech as part of 
interactive voice response services," said Dave Raggett, Activity Lead 
for W3C's work on voice browsers, and a W3C Fellow from Canon. He 
added, "SSML allows VoiceXML-based services to be accessed via 
textphones for people with speaking or hearing impairments. In 
addition, SSML has great promise beyond its use with VoiceXML, as we 
look forward to emerging standards for multimodal interaction."

Like XHTML, SSML is a markup language based on the widely deployed XML 
standard. SSML content can stand alone or be included in other XML 
content in order to improve rendering as synthesized speech. Naturally, 
SSML is particularly well-suited for use with a VoiceXML wrapper when 
building an interactive voice response application.
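
To make the "VoiceXML wrapper" idea concrete, here is a minimal sketch
that places speech markup inside a VoiceXML 2.0 prompt. It assumes,
following the pattern of the examples in the VoiceXML 2.0
specification, that elements such as <prosody> appear directly inside
<prompt> under the VoiceXML namespace. The helper name wrap_in_vxml
and the prompt text are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of VoiceXML 2.0 documents.
VXML_NS = "http://www.w3.org/2001/vxml"


def wrap_in_vxml(prompt_markup):
    """Embed speech markup in a one-prompt VoiceXML 2.0 document.

    Hypothetical helper: `prompt_markup` is inserted as-is, so the
    caller is responsible for passing well-formed markup.
    """
    return (
        f'<vxml version="2.0" xmlns="{VXML_NS}">\n'
        "  <form>\n"
        "    <block>\n"
        f"      <prompt>{prompt_markup}</prompt>\n"
        "    </block>\n"
        "  </form>\n"
        "</vxml>"
    )


# Volume and pitch are adjusted for one phrase of the prompt.
document = wrap_in_vxml(
    'Your balance is <prosody volume="loud" pitch="high">overdue</prosody>.'
)

ET.fromstring(document)  # raises ParseError if the result is not well-formed
print(document)
```

The same <prosody> markup could also sit inside a standalone SSML
<speak> document; only the wrapper around it changes.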

SSML 1.0 is built for Web integration in other ways as well. The Voice 
Browser Working Group worked closely with other W3C groups to ensure 
that the design of SSML 1.0 is consistent with principles of 
accessibility, internationalization, and general Web architecture. 
Indeed, one important application of SSML involves "text phones" that 
may be used by people with some hearing disabilities. The same content 
can also be output as speech through a common telephone. SSML 1.0 is 
also consistent with previous work at W3C on describing pronunciation 
with Cascading Style Sheets (CSS). W3C's CSS Working Group is 
developing a speech module in CSS3 for rendering XML documents with 
SSML-based speech engines.

Early Industry Adoption

W3C's Voice Browser Working Group has been particularly successful at 
ensuring adoption of its specifications before they reach 
Recommendation status. A test suite (discussed in the July 2004 SSML 
implementation report) has helped ensure consistent behavior and 
quality among the already numerous implementations of SSML 1.0. Vendors
that have already implemented SSML 1.0 and that are participating in
the Working Group include: Aspect Communications, France Telecom,
Hewlett-Packard, IBM, Loquendo, Microsoft, MITRE, Nuance 
Communications, SAP, ScanSoft, Sun Microsystems, VoiceGenie 
Technologies, Voxeo, and Voxpilot.

The Working Group will now focus its energies on the remainder of the 
Speech Framework. "After VoiceXML 2.0 and Speech Recognition Grammar 
Specification (SRGS), SSML is the third language of the W3C Speech 
Interface Framework to become a full W3C Recommendation," said Jim 
Larson, manager, advanced human input/output, for Intel and also 
co-chair of W3C's Voice Browser Working Group. "We are working to 
complete work on other languages of the W3C Speech Interface Framework, 
including VoiceXML 2.1, Semantic Interpretation, and the Call Control 
eXtensible Markup Language (CCXML)."

The Working Group is among the largest and most active in W3C. Its 
participants include: Aspect Communications, BeVocal, Brooktrout 
Technology, Canon, Comverse Technology, Convedia, Electronic Data 
Systems, France Telecom, Genesys Telecommunications Laboratories, 
HeyAnita, Hitachi, Hewlett-Packard, IBM, Intel, IWA-HWG, Korea 
Association of Information and Telecommunication, Loquendo, Microsoft, 
MITRE, Mitsubishi Electric, Motorola, Nokia, Nuance Communications, 
Openstream, SAP, ScanSoft, Siemens, Sun Microsystems, Syntellect, 
Tellme Networks, Verascape, Vocalocity, VoiceGenie Technologies, Voxeo, 
and Voxpilot.

About the World Wide Web Consortium (W3C)

The W3C was created to lead the Web to its full potential by developing 
common protocols that promote its evolution and ensure its 
interoperability. It is an international industry consortium jointly 
run by the MIT Computer Science and Artificial Intelligence Laboratory 
(MIT CSAIL) in the USA, the European Research Consortium for
Informatics and Mathematics (ERCIM) headquartered in France, and Keio
University in Japan. Services provided by the Consortium include a
repository of information about the World Wide Web for developers and
users, as well as prototype and sample applications that demonstrate
the use of new technology. To date, nearly 400 organizations are Members of the
Consortium. For more information see http://www.w3.org/

Contact America --
     Karen Myers, <karen@w3.org>, +1.617.253.5884 or +1.978.502.6218
Contact Europe --
     Marie-Claire Forgue, <mcf@w3.org>, +33.492.38.75.94
Contact Asia --
     Yasuyuki Hirakawa <chibao@w3.org>, +81.466.49.1170

###
Received on Wednesday, 8 September 2004 14:04:52 UTC