[voiceinteraction] fyi Specification for Spoken Presentation in HTML

Some of you may be interested in this new publication on pronunciation in
HTML, which was just published by the Accessibility Working Group at the W3C:
https://www.w3.org/TR/spoken-html/

Here's the abstract.

Abstract
Accurate pronunciation by text-to-speech (TTS) synthesis is very important
in many contexts, and critical in education, publishing, communication, and
entertainment, among other domains. TTS has become an important technology
for providing access to digital content on the web. Yet there is no way
to mark up content today that will correctly present TTS-generated output
across commonly used TTS engines and operating environments.

We identify two markup approaches in this publication to give content
authors reliable pronunciation of HTML content regardless of the operating
environment (or assistive technology) users might choose to use. Each
approach has been demonstrated to yield consistent results. We seek feedback
from authors and implementors to help determine which approach should be
advanced to normative recommendation status by W3C.

We base each candidate approach on a subset of Speech Synthesis Markup
Language (SSML). Our selected subset is carefully chosen to bring
consistency and predictability to spoken presentation across a full range of
assistive technologies and operating environments. Both technical approaches
described in this publication carefully avoid the impasse that has prevented
SSML from becoming a native HTML technology and should, therefore, be
generally applicable. Either approach described here satisfies our
requirements for assistive technologies and will be useful to voice
assistants that consume and present HTML content in spoken form. We seek
feedback on which approach would prove most implementable across all
applications of spoken presentation of web content.

Received on Wednesday, 19 May 2021 18:39:27 UTC