Intent to incubate Speech API from Marcos Caceres on 2019-10-03 (public-speech-api@w3.org from October 2019)

From: Marcos Caceres <marcos@marcosc.com>
Date: Thu, 3 Oct 2019 20:40:07 +1000
To: public-speech-api@w3.org
Cc: Andre Natal <anatal@mozilla.com>
Message-Id: <C9AB9FE7-9A1D-4CFD-81BF-7DDF1EE67885@marcosc.com>

Dear Speech CG members,

When we first embarked on the standardization of the Speech API nearly a decade ago, this group foresaw the rise of speech recognition and synthesis and the relevance they would have in people's lives: Siri, Alexa, Cortana, and Google's voice assistant have become an indispensable part of the computing experience. However, due to a lack of interoperability and implementer interest, we are still yearning for the Web to take full advantage of speech technology.

In creating the Speech API, we made some assumptions about how the API would be used that did not fully come to fruition. As such, we've seen limited implementation of the API, which in turn has led to limited usage on the Web. An internal survey of archive.org found that only around 4000 sites were using the Speech API in 2019. Although some of those sites are prominent (e.g., the Google homepage!), uptake overall remains negligible.

As speech recognition and synthesis technology increasingly becomes central to all computing, Mozilla believes now is the time to rekindle the effort to bring the Speech API to browsers. Much has changed for the web platform since this effort began: we now have an extensive range of new architectural primitives in the web platform and much-improved speech recognition technology (including free and open speech recognition samples and models). More importantly, we also have a deeper understanding of the privacy and security implications, accessibility challenges, and internationalization concerns of what we are trying to standardize.

With the benefit of hindsight, Mozilla would like the opportunity to restart this effort as an incubation under the W3C's Web Incubator Community Group (WICG), with the ultimate aim of W3C standardization. WICG is an active venue for developers and implementers with an extensive track record of successful incubations, allowing us to involve experts from a range of communities.

To move the specification forward, we'd like to work together with this group and the wider web standards community to revise the existing specification. We'd like to review what worked (i.e., what got implemented), what didn't, and how we can improve the API to best serve users and the developer community.

Concrete steps:

• Move the spec to the WICG and (re)invite implementers and the community to participate, something we are already doing in the GitHub repository.
• Update/modify/remove parts of the spec that were not implemented or cannot be implemented in an interoperable manner (active work has been happening on this recently).
• Address long-standing privacy and security issues.
• Evaluate where the API can be improved.
• Write out the algorithms that would afford us interoperability: right now, the spec lacks any algorithms, making it difficult to evaluate interoperable behavior.
• Create an extensive test suite, which would assure both the quality of the specification and the interoperability of implementations.

Hope you will come join us on GitHub to make the Speech API a success!
https://github.com/w3c/speech-api

Kind regards,
Marcos Caceres and Andre Natal, Mozilla

Received on Thursday, 3 October 2019 10:40:41 UTC