RE: [EXTERNAL] DRAFT: Letter to AT Vendors from Dyer, Dee on 2023-01-30 (public-pronunciation@w3.org from January 2023)

From: Dyer, Dee <ddyer@ets.org>
Date: Mon, 30 Jan 2023 16:09:58 +0000
To: Paul Grenier <pgrenier@gmail.com>, "public-pronunciation@w3.org" <public-pronunciation@w3.org>
CC: Pronunciation Task Force <public-pronunciation@w3.org>
Message-ID: <MN2PR07MB6799BA1CAC0D48CDD14DE93DA2D39@MN2PR07MB6799.namprd07.prod.outlook.com>

Hi All,

I reread the letter and have made some suggestions as follows.


The Pronunciation Task Force<https://www.w3.org/WAI/APA/task-forces/pronunciation/> identified multiple possible solutions for improving pronunciation on the web. We want your opinions about two strategies for choosing a preferred solution.

We aim to give authors control over pronunciation in HTML. This innovation will benefit many technologies, including smart speakers, read-aloud tools, and assistive technology (AT). Smart speakers can harvest text and markup from the web and provide support for app developers to use SSML<https://www.w3.org/TR/speech-synthesis11/>. Read-aloud tools may use the Web Speech API<https://wicg.github.io/speech-api/#tts-section> for speech synthesis and can utilize SSML in supported contexts.

When it comes to AT, we have two possible strategies:

  1.  AT will process pronunciation information from the accessibility tree (AxTree) provided by the browser. This should allow AT to implement enhanced pronunciation without significant changes to architecture. This approach requires work to map SSML-in-HTML to the AxTree and accessibility APIs.
  2.  AT will parse the SSML-based pronunciation information from the DOM, directly. This approach is currently possible without additional work by others to support it. This may require significant changes in AT architecture.

Please let us know which approach you prefer for your products and the users you serve. If you have any questions for our group, use our public email: public-pronunciation@w3.org.

Thank you,


Dee Dyer (she, her, hers)
Assessment Process Specialist
Accessible Content & Inclusive Solutions (ACIS)
Technology, Accessibility, and Innovation
ETS | Assessment and Learning Technology Research & Development
660 Rosedale Road, Princeton, NJ 08541
Tel: 609-683-2127
Email: ddyer@ets.org

From: Paul Grenier <pgrenier@gmail.com>
Sent: Monday, January 30, 2023 10:00 AM
To: public-pronunciation@w3.org
Cc: Pronunciation Task Force <public-pronunciation@w3.org>
Subject: Re: [EXTERNAL] DRAFT: Letter to AT Vendors

DRAFT 2:

The Pronunciation Task Force<https://www.w3.org/WAI/APA/task-forces/pronunciation/> identified multiple possible solutions for improving pronunciation on the web. We would like your opinions about two strategies on our way to choosing our preferred solution.

We aim to give authors control over pronunciation in HTML. Many technologies will benefit from this innovation including smart speakers, read aloud tools, and assistive technology (AT). Smart speakers can harvest text and markup from the web and provide support for app developers to use SSML<https://www.w3.org/TR/speech-synthesis11/>. Read aloud tools may use the Web Speech API<https://wicg.github.io/speech-api/#tts-section> for speech synthesis, and can utilize SSML in supported contexts.

When it comes to AT, we're presented with two possible strategies:

  1.  AT will process pronunciation information from the accessibility tree (AxTree) provided by the browser. This should allow AT to implement enhanced pronunciation without significant changes to architecture. This approach requires work to map SSML-in-HTML to the AxTree and accessibility APIs.
  2.  AT will parse the SSML-based pronunciation information from the DOM, directly. This approach is currently possible without additional work by others to support it. This may require significant changes in AT architecture.

Please let us know which approach you prefer for your products and the users you serve. If you have any questions for our group, use our public email: public-pronunciation@w3.org.


On Mon, Jan 9, 2023 at 10:48 AM Hakkinen, Mark T <mhakkinen@ets.org> wrote:
Some comments:

Second paragraph, third sentence:

> Smart speakers harvest text and markup from the web but also have an SSML<https://www.w3.org/TR/speech-synthesis11/> interface.

I would suggest changing this to:

"Smart speakers can harvest text and markup from the web and provide support for app developers to use SSML."

AFAIK, smart speaker APIs have two modes: passing plain text or passing fully formed SSML.
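
To make the two modes concrete, here is a minimal Python sketch. The payload shape (a `type` field set to `PlainText` or `SSML`) is only loosely modeled on typical smart speaker skill responses and is an assumption for illustration, not a reference to any specific vendor API. `build_output_speech` is a hypothetical helper name.

```python
import json


def build_output_speech(text, ssml_body=None):
    """Return a speech payload in one of two modes: plain text or full SSML.

    The field names are illustrative assumptions, not a vendor contract.
    """
    if ssml_body is None:
        # Mode 1: hand the synthesizer plain text only.
        return {"type": "PlainText", "text": text}
    # Mode 2: hand over a fully formed SSML document with a <speak> root.
    return {"type": "SSML", "ssml": f"<speak>{ssml_body}</speak>"}


plain = build_output_speech("The W3C met today.")
marked_up = build_output_speech(
    "The W3C met today.",
    'The <say-as interpret-as="characters">W3C</say-as> met today.',
)
print(json.dumps(plain))
print(json.dumps(marked_up))
```

In this sketch nothing sits between the two modes: pronunciation hints either travel as complete SSML or are absent from the payload.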

Second paragraph, fourth sentence:

> Read aloud tools may use the Web Speech API<https://wicg.github.io/speech-api/#tts-section>, also based on SSML.

I would suggest the following change:

"Read Aloud tools may use the Web Speech API for speech synthesis, and can utilize SSML if the requested synthesizer supports it."

Based on experience, the completeness of the SSML accepted varies by synthesizer.
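
One way a read aloud tool might cope with that variation is to fall back to plain text when the selected synthesizer does not accept SSML. The Python sketch below assumes the tool already knows whether SSML is supported (the `synthesizer_supports_ssml` flag is hypothetical; how that is detected is outside this sketch) and assumes un-namespaced, well-formed SSML.

```python
import xml.etree.ElementTree as ET


def ssml_to_plain_text(ssml):
    """Drop all SSML markup and keep only the text content."""
    root = ET.fromstring(ssml)
    # itertext() yields element text and tails in document order.
    text = "".join(root.itertext())
    # Collapse runs of whitespace left behind by removed markup.
    return " ".join(text.split())


def prepare_utterance(ssml, synthesizer_supports_ssml):
    """Pass SSML through when supported, otherwise degrade to plain text."""
    return ssml if synthesizer_supports_ssml else ssml_to_plain_text(ssml)


ssml = '<speak>I like <phoneme alphabet="ipa" ph="pɪˈkɑːn">pecan</phoneme> pie.</speak>'
print(prepare_utterance(ssml, synthesizer_supports_ssml=False))  # I like pecan pie.
```

The fallback loses the pronunciation hint entirely, which is the cost of targeting a synthesizer with incomplete SSML support.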

As for the two options:

I am not clear on these.

  1.  Encourage AT to parse the SSML-based pronunciation information in the same manner as other technologies.
  2.  Work with standards groups and browser vendors to add pronunciation information to the accessibility tree (AxTree).

Perhaps preface it with:

Begin suggested text:

There are two ways to include SSML-based pronunciation in HTML:


  1.  Inline SSML markup in HTML
  2.  Encode SSML-based properties as an attribute applied to text container elements.

Both approaches have advantages and disadvantages, but whichever approach is adopted, the question posed to AT developers (both screen reader and read aloud) specifically is how you will utilize the pronunciation information contained in HTML. Which of the two options do you prefer:


  1.  We will parse the SSML-based pronunciation information directly from the content (e.g., inline SSML or attribute-based SSML).
  2.  We would expect the browser vendors to add pronunciation information directly to the accessibility tree (AxTree).

End suggested text.
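
As a rough illustration of what "parse directly from the content" could involve (it is not part of the suggested letter text), the Python sketch below walks an HTML fragment and produces an SSML string from both authoring forms. The attribute name `data-ssml`, its JSON value shape, the set of inline tags, and the `PronunciationExtractor` class are all assumptions made for this example; the thread itself does not fix any of them.

```python
import json
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

# SSML element names this sketch treats as inline SSML (an assumption).
INLINE_SSML_TAGS = {"say-as", "phoneme", "sub"}


class PronunciationExtractor(HTMLParser):
    """Build an SSML string from HTML that carries pronunciation hints."""

    def __init__(self):
        super().__init__()
        self.parts = []  # SSML fragments in document order
        self.stack = []  # (html_tag, closing SSML tags) per open element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        opened = []
        if tag in INLINE_SSML_TAGS:
            # Form 1: inline SSML element, copied through with its attributes.
            opened.append((tag, attrs))
        elif "data-ssml" in attrs:
            # Form 2: SSML properties encoded as JSON in an attribute.
            opened.extend(json.loads(attrs["data-ssml"]).items())
        for name, props in opened:
            attr_text = "".join(f" {k}={quoteattr(str(v))}" for k, v in props.items())
            self.parts.append(f"<{name}{attr_text}>")
        self.stack.append((tag, [f"</{name}>" for name, _ in reversed(opened)]))

    def handle_endtag(self, tag):
        # Ignore stray end tags; otherwise unwind to the matching start tag.
        if not any(open_tag == tag for open_tag, _ in self.stack):
            return
        while self.stack:
            open_tag, closers = self.stack.pop()
            self.parts.extend(closers)
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.parts.append(escape(data))

    def to_ssml(self):
        return "<speak>" + "".join(self.parts) + "</speak>"


html = (
    """<p>The <span data-ssml='{"say-as": {"interpret-as": "characters"}}'>W3C</span> """
    """likes <phoneme alphabet="ipa" ph="pɪˈkɑːn">pecan</phoneme> pie.</p>"""
)
extractor = PronunciationExtractor()
extractor.feed(html)
extractor.close()
result = extractor.to_ssml()
print(result)
```

A real AT would work from the live DOM rather than a string of markup, so this only shows the parsing burden that the first option places on the AT side; under the second option, the equivalent mapping work would sit with the browser.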

Based on the responses, we still don't have a clear picture of which method, inline or attribute, is easier for the vendors (AT or browser).



Mark


From: Paul Grenier <pgrenier@gmail.com>
Date: Sunday, January 8, 2023 at 2:16 PM
To: Pronunciation Task Force <public-pronunciation@w3.org>
Subject: [EXTERNAL] DRAFT: Letter to AT Vendors

Audience: AT vendors and developers

Subject: Pronunciation in HTML

The Pronunciation Task Force<https://www.w3.org/WAI/APA/task-forces/pronunciation/> identified multiple possible solutions for improving pronunciation on the web. We would like your opinions about two strategies on our way to choosing our preferred solution.

We aim to give authors control over pronunciation in HTML. Many technologies will benefit from this innovation including smart speakers, read aloud tools, and assistive technology (AT). Smart speakers harvest text and markup from the web but also have an SSML<https://www.w3.org/TR/speech-synthesis11/> interface. Read aloud tools may use the Web Speech API<https://wicg.github.io/speech-api/#tts-section>, also based on SSML. When it comes to AT, we're presented with two possible strategies:

  1.  Encourage AT to parse the SSML-based pronunciation information in the same manner as other technologies.
  2.  Work with standards groups and browser vendors to add pronunciation information to the accessibility tree (AxTree).

Please let us know which approach you prefer for your products and the users you serve. If you have any questions for our group, use our public email: public-pronunciation@w3.org.

________________________________

This e-mail and any files transmitted with it may contain privileged or confidential information. It is solely for use by the individual for whom it is intended, even if addressed incorrectly. If you received this e-mail in error, please notify the sender; do not disclose, copy, distribute, or take any action in reliance on the contents of this information; and delete it from your system. Any other use of this e-mail is prohibited.


Thank you for your compliance.

________________________________

Attachments

image/png attachment: image001.png
image/png attachment: image002.png
Received on Monday, 30 January 2023 16:10:19 UTC