Re: [EXTERNAL] DRAFT: Letter to AT Vendors from Paul Grenier on 2023-01-30 (public-pronunciation@w3.org from January 2023)

From: Paul Grenier <pgrenier@gmail.com>
Date: Mon, 30 Jan 2023 10:00:14 -0500
Cc: Pronunciation Task Force <public-pronunciation@w3.org>
Message-ID: <CAMq9vGb4jRXsuZ6i2XiHZYLEtqweMiv_J4=4+GTuh=b5t4_egg@mail.gmail.com>
DRAFT 2:

The Pronunciation Task Force
<https://www.w3.org/WAI/APA/task-forces/pronunciation/> identified multiple
possible solutions for improving pronunciation on the web. We would like
your opinions about two strategies on our way to choosing our preferred
solution.

We aim to give authors control over pronunciation in HTML. Many
technologies will benefit from this innovation, including smart speakers,
read aloud tools, and assistive technology (AT). Smart speakers can harvest
text and markup from the web and provide support for app developers to use
SSML <https://www.w3.org/TR/speech-synthesis11/>. Read aloud tools may use
the Web Speech API <https://wicg.github.io/speech-api/#tts-section> for
speech synthesis, and can utilize SSML in supported contexts.

When it comes to AT, we're presented with two possible strategies:

   1. AT will process pronunciation information from the accessibility tree
   (AxTree) provided by the browser. This should allow AT to implement
   enhanced pronunciation without significant changes to architecture. This
   approach requires work to map SSML-in-HTML to the AxTree and accessibility
   APIs.
   2. AT will parse the SSML-based pronunciation information directly from
   the DOM. This approach is possible today without additional work by
   others, but it may require significant changes in AT architecture.
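
For illustration only, strategy 2 might look roughly like the sketch below.
It assumes an attribute-based encoding in which a hypothetical `data-ssml`
attribute carries SSML properties as JSON. The attribute name, the JSON
shape, and the `extract_hints` helper are assumptions made for this example,
not a settled design, and the sketch expects well-formed markup.

```python
import json
from html.parser import HTMLParser

# Elements that never have an end tag, so they must not be pushed on the stack.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img",
                 "input", "link", "meta", "source", "track", "wbr"}


class PronunciationHints(HTMLParser):
    """Collect (text, hint) pairs for elements with a data-ssml attribute."""

    def __init__(self):
        super().__init__()
        self.hints = []    # finished (text, parsed hint) pairs
        self._stack = []   # one entry per open element: None or (parts, hint)

    def handle_starttag(self, tag, attrs):
        if tag in VOID_ELEMENTS:
            return
        raw = dict(attrs).get("data-ssml")  # assumed attribute name
        entry = None
        if raw:
            try:
                entry = ([], json.loads(raw))
            except json.JSONDecodeError:
                entry = None  # malformed hint: fall back to default speech
        self._stack.append(entry)

    def handle_endtag(self, tag):
        if tag in VOID_ELEMENTS or not self._stack:
            return
        entry = self._stack.pop()
        if entry is not None:
            parts, hint = entry
            self.hints.append(("".join(parts), hint))

    def handle_data(self, data):
        # Text belongs to every annotated element that is currently open.
        for entry in self._stack:
            if entry is not None:
                entry[0].append(data)


def extract_hints(html):
    """Return a list of (text, hint) pairs found in an HTML fragment."""
    parser = PronunciationHints()
    parser.feed(html)
    parser.close()
    return parser.hints
```

Under strategy 1, the same information would instead reach AT through the
accessibility APIs, and no DOM parsing of this kind would be needed in AT.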

Please let us know which approach you prefer for your products and the
users you serve. If you have any questions for our group, use our public
email: public-pronunciation@w3.org.


On Mon, Jan 9, 2023 at 10:48 AM Hakkinen, Mark T <mhakkinen@ets.org> wrote:

> Some comments:
>
>
>
> Second paragraph, third sentence:
>
>
>
> > Smart speakers harvest text and markup from the web but also have an
> SSML <https://www.w3.org/TR/speech-synthesis11/> interface.
>
>
>
> I would suggest changing this to:
>
>
>
> “Smart speakers can harvest text and markup from the web and provide
> support for app developers to use SSML.”
>
>
>
> AFAIK, the smart speaker APIs have two modes: passing plain text or
> passing fully formed SSML.
>
>
>
> Second paragraph, fourth sentence:
>
>
>
> > Read aloud tools may use the Web Speech API
> <https://wicg.github.io/speech-api/#tts-section>,
> also based on SSML.
>
>
>
> I would suggest the following change:
>
>
>
> “Read aloud tools may use the Web Speech API for speech synthesis, and can
> utilize SSML if the requested synthesizer supports it.”
>
>
>
> Based on experience, the completeness of the SSML accepted varies by
> synthesizer.
>
>
>
> As for the two options:
>
>
>
> I am not clear on these.
>
>    1. Encourage AT to parse the SSML-based pronunciation information in
>    the same manner as other technologies.
>    2. Work with standards groups and browser vendors to add pronunciation
>    information to the accessibility tree (AxTree).
>
> Perhaps preface it with:
>
>
>
> Begin suggested text:
>
>
>
> There are two ways to include SSML-based pronunciation into HTML:
>
>
>
>    1. Inline SSML markup in HTML
>    2. Encode SSML-based properties as an attribute applied to text
>    container elements.
>
>
>
> Both approaches have advantages and disadvantages, but whichever approach
> is adopted, the question posed to AT developers (both screen reader and
> read aloud) is specifically how you will utilize the pronunciation
> information contained in HTML. Which of the two options do you prefer:
>
>
>
>    1. We will parse the SSML-based pronunciation information directly
>    from the content (e.g., inline SSML or attribute-based SSML)
>    2. We would expect the browser vendors to add pronunciation
>    information directly to the accessibility tree (AxTree).
>
> End suggested text.
>
> Based on the responses, we still don’t have a clear picture of which
> method, inline or attribute, is easier for those vendors (AT or browser).
>
>
>
> Mark
>
>
>
>
>
> *From: *Paul Grenier <pgrenier@gmail.com>
> *Date: *Sunday, January 8, 2023 at 2:16 PM
> *To: *Pronunciation Task Force <public-pronunciation@w3.org>
> *Subject: *[EXTERNAL] DRAFT: Letter to AT Vendors
>
> Audience: AT vendors and developers
>
> Subject: Pronunciation in HTML
>
> The Pronunciation Task Force
> <https://www.w3.org/WAI/APA/task-forces/pronunciation/> identified
> multiple possible solutions for improving pronunciation on the web. We
> would like your opinions about two strategies on our way to choosing our
> preferred solution.
>
> We aim to give authors control over pronunciation in HTML. Many
> technologies will benefit from this innovation including smart speakers,
> read aloud tools, and assistive technology (AT). Smart speakers harvest
> text and markup from the web but also have an SSML
> <https://www.w3.org/TR/speech-synthesis11/> interface.
> Read aloud tools may use the Web Speech API
> <https://wicg.github.io/speech-api/#tts-section>,
> also based on SSML. When it comes to AT, we're presented with two possible
> strategies:
>
>    1. Encourage AT to parse the SSML-based pronunciation information in
>    the same manner as other technologies.
>    2. Work with standards groups and browser vendors to add pronunciation
>    information to the accessibility tree (AxTree).
>
> Please let us know which approach you prefer for your products and the
> users you serve. If you have any questions for our group, use our public
> email: public-pronunciation@w3.org.
>
Received on Monday, 30 January 2023 15:00:40 UTC