Web Speech Working Group

W3C

The mission of the Web Speech Working Group, part of the Ubiquitous Web Applications Activity, is to build client-side speech APIs suitable for web environments like HTML5. The APIs will be designed to service the needs of both professionals building high-quality multi-modal user experiences and casual developers simply looking to add speech to their forms.

Join the Web Speech Working Group.

End date: 31 October 2014
Confidentiality: Proceedings are public
Initial Chairs: CHAIR INFO
Initial Team Contacts (FTE %: 20): TEAM CONTACT INFO
Usual Meeting Schedule:
  Teleconferences: Weekly
  Face-to-face: 3-4 per year

Scope

For years, speech recognition (ASR) and synthesis (TTS) activities at the W3C have taken place in the Voice Browser Working Group, which developed the VoiceXML suite of Recommendations. Thus far the W3C has not developed a way to integrate speech into the visual Web. In the last few years, the proportion of Web-enabled devices with microphones and speakers has increased dramatically due to the explosion of Web-enabled mobile devices. It is time for the W3C to bring speech input and output to the visual Web.

This Working Group is formed following the work of two earlier W3C groups:

  1. The HTML Speech Incubator Group, which gathered use cases and requirements and undertook preliminary work on the HTML Speech API. For more information, see the HTML Speech Incubator Group Final Report.
  2. The Speech API Community Group, which continued discussions after the HTML Speech Incubator Group concluded. For more information, see the Speech JavaScript API draft.

Success Criteria

Deliverables

Recommendation Track Deliverables

The Working Group will deliver the following document: the JavaScript Speech API specification.

Other Deliverables

The Working Group will publish JavaScript Speech Use Cases and Requirements as a Working Group Note. The Working Group will also develop a test suite for the JavaScript Speech API.

The Working Group may develop a primer, tutorial or other educational materials relating to speech and the Web.

The Working Group may submit change requests to the HTML Working Group for the purpose of enabling scenarios that take advantage of speech recognition and synthesis in HTML5. This might include, for example, access to speech recognition and synthesis capabilities through markup.

The Working Group may submit change requests to the Voice Browser Working Group for SRGS, SSML, SISR, VoiceXML 3, or other languages and specifications where appropriate for the purposes of consistency.

The Working Group may submit change requests to the Multimodal Interaction Working Group relating to EMMA.

Milestones

Note: The group will document significant changes from this initial schedule on the group home page.

Specification          FPWD     LC       CR       PR       Rec
JavaScript Speech API  Q1 2013  Q3 2013  Q4 2013  Q1 2014  Q2 2014

Timeline View Summary

The Working Group may be involved in Workshops, the details of which will be listed on the Working Group's home page as information becomes available.

Dependencies and Liaisons

Liaisons with W3C Groups

The Web Speech Working Group will request document reviews from the following groups:

HTML WG
The Working Group may submit change requests to the HTML Working Group for the purpose of enabling scenarios that take advantage of speech recognition and synthesis in HTML5. This might include, for example, access to speech recognition and synthesis capabilities through markup.
Device APIs WG
The DAP Working Group's work on Media Capture APIs will be a critical component on which speech recognition will be built.
Voice Browser WG
The Voice Browser WG produces specifications such as Speech Recognition Grammar Specification, Speech Synthesis Markup Language, Pronunciation Lexicon, and Semantic Interpretation for Speech Recognition, all of which will be relevant to the work of the group.
WebRTC
The WebRTC group may produce work that will influence how audio capture happens and is used by the recognition API.
Audio WG
The work on speech synthesis is likely to overlap with and complement the work being produced in the Audio Working Group.
Multimodal Interaction WG
The Multimodal Interaction Working Group produces Extensible MultiModal Annotation (EMMA), which will be an important part of the recognition API of this group.
HCG
The Hypertext Coordination Group will likely be interested in the work of the group.
Accessibility
The Web Content Accessibility Guidelines Working Group is likely to be interested in the work of this group.

Liaisons with External Groups

IETF
The Incubator Report included a preliminary protocol based on Web Sockets; work on this protocol is expected to be done at the IETF.

Participation

To be successful, the Web Speech Working Group is expected to have 10 or more active participants for its duration. Effective participation in the Web Speech Working Group is expected to consume one work day per week for each participant, and two days per week for editors. The Web Speech Working Group will also allocate the necessary resources for building Test Suites for each specification.

Participants are reminded of the Good Standing requirements of the W3C Process.

Communication

This group primarily conducts its work during weekly teleconferences and on the public mailing list @@listname@@. There will also be a member-only mailing list, which is expected to be used only for confidential logistical information.

Information about the group (deliverables, participants, face-to-face meetings, teleconferences, etc.) is available from the Web Speech Working Group home page.

Decision Policy

As explained in the Process Document (section 3.3), this group will seek to make decisions when there is consensus. When the Chair puts a question and observes dissent, after due consideration of different opinions, the Chair should record a decision (possibly after a formal vote) and any objections, and move on.