Minutes for TAG & APA meeting from Matthew Atkinson on 2020-10-19 (public-apa-admin@w3.org from October 2020)

From: Matthew Atkinson <matkinson@paciellogroup.com>
Date: Mon, 19 Oct 2020 09:34:16 +0000
To: Accessible Platform Architectures Administration <public-apa-admin@w3.org>
Message-ID: <DM6PR20MB3377DAFA11C99E3D83A48503BD1E0@DM6PR20MB3377.namprd20.prod.outlook.com>

Hello APA,

Apologies for the delay; draft minutes for our joint meeting with the TAG from Friday (the 16th) can be found online at <https://www.w3.org/2020/10/16-apa-minutes.html> and are repeated below for convenience.

Best regards,

Matthew

- DRAFT -

Accessible Platform Architectures Working Group Teleconference

16 Oct 2020

Attendees

Present
Appelquist, Dan, Irfan, Joshue108_, Matthew_Atkinson, MichaelC,
NeilS, Rossen_, SteveNoble, becky, hober, janina, jasonjgw,
mhakkinen, paul_grenier
Regrets
Chair
Janina
Scribe
Matthew_Atkinson

Contents

* Topics <#agenda>
* Summary of Action Items <#ActionSummary>
* Summary of Resolutions <#ResolutionSummary>

------------------------------------------------------------------------

<janina> trackbot, start meeting

<Irfan> Scribe: Matthew_Atkinson

Janina: We are at an impasse whereby we know how to solve the
pronunciation problem for accessibility, but there is potential to solve
the problem in the mainstream sense, for e.g. personal digital
assistants, which would require some contribution from WHATWG. The
purpose of the meeting is to present both paths and gauge interest from
the wider community and decide if the accessibility-focused or more
widely-applicable approach s[CUT]
... We could move forward on publishing the accessibility path soon; the
mainstream path would take longer, but seems like a better outcome
overall, as accessibility would be part of the mainstream solution.

<Irfan> https://w3c.github.io/pronunciation/explainer

Janina: There is also a video that explains the work done on pronunciation.

<Irfan> https://www.w3.org/2020/10/TPAC/apa-pronunciation.html

<becky> Pronunciation video:
https://www.w3.org/2020/10/TPAC/apa-pronunciation.html

Janina: tl;dr: we would be asking WHATWG to allow a portion of SSML into
HTML so that UAs that consume it would be able to access it and expose
it to ATs/TTS.

Paul: Two approaches. One is possible to implement today: using data-*
attributes, we can pack a JSON representation of SSML into HTML, which
can then be unpacked by UAs. There's a PoC for Macs (which don't support
SSML) to demo the functionality there.
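
A minimal sketch of the attribute-based approach Paul describes, for illustration only. The attribute name `data-ssml`, the JSON shape (SSML element name mapped to its attributes) and the names `DataSsmlCollector` and `to_ssml` are assumptions modelled on the explainer linked above, not a confirmed design. It only handles elements whose content is plain text.

```python
import json
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr


class DataSsmlCollector(HTMLParser):
    """Collects (ssml_dict, text) pairs for elements carrying data-ssml."""

    def __init__(self):
        super().__init__()
        self.pending = None  # decoded JSON for the element currently open
        self.found = []      # list of (ssml_dict, text) tuples

    def handle_starttag(self, tag, attrs):
        raw = dict(attrs).get("data-ssml")  # assumed attribute name
        if raw is not None:
            self.pending = json.loads(raw)

    def handle_data(self, data):
        # Pair the first text run after the start tag with its SSML data.
        if self.pending is not None:
            self.found.append((self.pending, data))
            self.pending = None


def to_ssml(ssml_dict, text):
    """Wrap text in the SSML elements named by the JSON keys."""
    out = escape(text)
    for element, attributes in ssml_dict.items():
        attr_text = "".join(
            f" {name}={quoteattr(value)}" for name, value in attributes.items()
        )
        out = f"<{element}{attr_text}>{out}</{element}>"
    return out


html = (
    "<p>Your code is "
    "<span data-ssml='{\"say-as\": {\"interpret-as\": \"characters\"}}'>W3C</span>."
    "</p>"
)
collector = DataSsmlCollector()
collector.feed(html)
collector.close()
ssml_fragments = [to_ssml(d, t) for d, t in collector.found]
print(ssml_fragments[0])
```

Because the SSML travels in an attribute, an HTML processor that does not know about it simply renders the span's text unchanged.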

<paul_grenier>
https://www.w3.org/TR/pronunciation-gap-analysis-and-use-cases/#gap-analysis

Paul: The other option is to promote SSML as a first-class citizen in HTML.
... SSML differs from other options as detailed by the doc linked above.
... Once decided on implementation path, need to liaise with browser and
AT vendors to establish how the info will be exposed in the AX tree.

Janina: AT vendors have been clear about not wanting to have to parse
the HTML themselves.

Rossen: is this related to the TAG design review issue (476)?

Janina: there's a specific request from Personalization to reserve a
prefix. That's a different spec and use-case, regarding supporting users
with cognitive disabilities. It is around providing support for
presenting content using different symbol sets (there's a
Personalization video too). Thus we may end up with one prefix for
Personalization and one for Pronunciation.

<paul_grenier> (another "polyfill" example using custom elements:
https://ssml-components.glitch.me/)

Rossen: is there a TAG issue we're currently discussing?

Becky: Issue 46 under meetings.

<becky> https://github.com/w3ctag/meetings/issues/46

Janina: We're trying to determine if it's reasonable to request specific
markup in order to support this use-case, or if this is premature.

Dan: SSML v1.1 is the latest (2010; XML-based). What would need to be
done in order to integrate it into HTML?

Janina: This has already been done for SVG and MathML and this would be
similar. Parsers that can consume them do so; others skip them. The
markup would apply to e.g. a span in the content. The video
Pronunciation produced features some code samples.

<dka_> Ref mathml, there has been some work on modernizing mathml
(mathml core) which you may want to reference.
https://mathml-refresh.github.io/mathml-core/

Tess: Would the proposal require HTML processors to be aware of new
elements, or just new attributes?

Paul: There are some overlaps with SSML tags, but for the most part,
since none of the SSML information is visual, we don't expect problems
integrating with current HTML processing—should be easier than existing
SVG/MathML integration.

Tess: Parser changes for both SVG and MathML are substantial—elements
require parser changes; attrs don't.
... parser changes are generally avoided (consider <template>—the main
change in the past decade). The reticence stems from potential security
issues.

Becky: Given that speech is becoming a more significant means of
interaction, should we take the broader approach now, not for
accessibility alone, but more widely applicable?

Janina: Acknowledged the need to avoid adding elements to the parser for
anything trivial; we feel this modality is non-trivial, gaining in
relevance.
... *gives some examples around classic issues* "cd" could be "change
directory", "compact disc", "candelas", "certificate of deposit"
[scribe: may have that last one wrong]

Dan: Still some significant issues that need to be untangled. Usage
patterns are not yet very clear.
... e.g. multimodal input

Janina: So far we've looked just at output.

Mark: Educational tech sector. Long-term issue of students needing
content to be read correctly to them. This is acutely important in the
current climate. Need to make sure that the students' devices pronounce
things as the teacher would, and also convey prosody correctly.
There are a lot of hacks that have been developed in the industry. We
would like to create a non-hacky way to do this. Concern that as time
goes on, appro[CUT]
... It's important to accelerate some sort of solution before it's too
late for vendors.

Tess: How do we define "too late for vendors"—is there a particular
timeframe?

Mark: it seems that different vendors will create and pursue their own
solutions, which will cause fragmentation.
... IMS Global Learning Consortium develops standards (e.g. QTI, used
for testing) that allow people to embed SSML. What happens when these
get transformed into HTML for delivery?

Tess: Seems like major goal is to help personal assistants improve their
pronunciation of web content. In addition to engaging browser vendors
wrt implementability and fitness, personal assistant vendors should be
consulted too, to encourage successful results.
... Amazon, Google and Apple are W3C members; are they involved? Would
this TPAC be an opportunity to seek input?

Rossen: Some contacts in CSS.

Mark: we have some contacts also [scribe: didn't catch group name]

Tess: Need to be able to justify parser changes—if digital assistant
teams are on-board with this, this makes the argument more compelling.

Paul: If we were to put SSML into an HTML doc currently, it wouldn't
validate, but the browser would ignore the unknown tags. This could be
done in stages...
... content that's already ready to have customisations could be parsed
specifically by, e.g. a smart speaker.
... for screen readers, maybe specific tools (e.g. ChromeVox) could do
their own parsing.
... ultimately we need to have the speech semantics exposed in the AX
tree but could develop a separate one in the meantime?
... A lot of emphasis on this technology is coming from authors' desire
to have content announced correctly, but if we could encourage vendors
to be involved this would help.

Tess: There are some significant implementation issues, e.g. SSML has a
 element. If this was copied into an HTML doc, and the HTML parser
found the element it would think it was an HTML element.
(*Considerations around namespacing and the structure, e.g. with
<table>s were also described*)
... MathML was mostly able to avoid this due to being designed to be
embedded in HTML; all of its elements start with 'm' (e.g.)
... however MathML did require a number of parser changes.
... As Paul suggested, a polyfill could work in some cases, but it may
not be possible to guarantee it will behave as a future native
implementation would; a subset profile of SSML would have to be used
(probably not including elements e.g.)
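
To illustrate the kind of name clash Tess describes, the sketch below intersects the SSML 1.1 element names with a small sample of HTML element names. The SSML list is our reading of the SSML 1.1 Recommendation; the HTML list is deliberately partial and chosen only for the example, so the result is indicative rather than exhaustive.

```python
# Element names defined by SSML 1.1 (our reading of the Recommendation).
SSML_ELEMENTS = {
    "speak", "lexicon", "lookup", "meta", "metadata", "p", "s", "token",
    "w", "say-as", "phoneme", "sub", "lang", "voice", "emphasis", "break",
    "prosody", "audio", "mark", "desc",
}

# A partial, illustrative sample of HTML element names (not the full set).
HTML_ELEMENTS_SAMPLE = {
    "a", "audio", "b", "br", "div", "em", "i", "mark", "meta", "p", "s",
    "span", "strong", "sub", "sup", "table", "template", "video",
}

# Names an HTML parser would already interpret with HTML semantics.
clashes = sorted(SSML_ELEMENTS & HTML_ELEMENTS_SAMPLE)
print(clashes)
```

A subset profile of the kind discussed here would need to avoid, rename or otherwise account for names in this intersection.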

Paul: We have control over how elements are presented visually, but not
aurally. E.g. can override styles for . Not expecting the parsing
differences to affect the aural nature of the SSML. Expect that devs,
knowing that it's not (yet) native, could use styling and extra
accessibility attrs to provide more info for the AX tree.

Janina: Do we need _all_ of SSML 1.1 or is there a subset/profile that
meets our needs, as we develop the wider case?

Tess: The profile wouldn't necessarily have to be a big change; just
minimize parser clashes. Would need to check, but it could be _just_ the
 element for example.

Mark: in the educational context we often use subsets—certainly
achievable to try the same approach here.

Paul: elements include , <s>, (different meaning in SSML/HTML).
Some elements mean similar things across both languages.
... would have to study which elements are already close semantically.

Mark: substitution (i.e. <sub>) is a heavily-used feature in educational
settings.
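
For reference, substitution in SSML is expressed with the `sub` element and its `alias` attribute. The sketch below builds such fragments for the "cd" example raised earlier in the meeting; the helper name `substitute` and the context labels are hypothetical.

```python
import xml.etree.ElementTree as ET


def substitute(written, spoken):
    """Return an SSML <sub> fragment that speaks `spoken` in place of `written`."""
    sub = ET.Element("sub", {"alias": spoken})
    sub.text = written
    return ET.tostring(sub, encoding="unicode")


# The same written token needs a different spoken form depending on context.
readings = {
    context: substitute("cd", spoken)
    for context, spoken in [
        ("shell", "change directory"),
        ("music", "compact disc"),
        ("finance", "certificate of deposit"),
    ]
}
print(readings["music"])
```

Only the author knows which reading is intended, which is why the information has to be carried in the markup rather than guessed by the speech engine.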

Paul: Some AT have tried to establish whether / make a
difference, for example. This would separate the visual/aural and allow
more control for authors, rather than general rules that have been
developed over time to try to provide some aural customisation for
content authors.

Rossen: are any other working groups involved with the Pronunciation TF?

Paul: Léonie suggested early on that this shouldn't be part of ARIA as
it has a wider remit than AT.

Mark: *+1*

Rossen: Some tools such as reading aloud can get the info they need from
the AX tree.

Tess: as well as getting personal assistants involved, people who work
on read aloud on platforms like e-readers could be interested too.
... Publishing has contacts.

Janina: reaching a conclusion today isn't practicable, but we have a lot
of suggestions, contacts and approaches. We should pursue those and then
re-assess later.

Mark: been discussing with EPUB (who've implemented a form of SSML,
which is used in Japan when generating audio files from EPUB).

Tess: Sounds like useful prior art.

Mark: this was a namespaced attribute model.

Tess: That's a lot easier to implement than elements.
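
A sketch of what a namespaced attribute model can look like to a consumer, assuming the EPUB 3 attribute names `ssml:ph` and `ssml:alphabet` in the SSML namespace. The sample document and the IPA string are made up for the example.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"

# Made-up XHTML content document using namespaced SSML attributes.
doc = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:ssml="http://www.w3.org/2001/10/synthesis"
      ssml:alphabet="ipa">
  <body>
    <p>She played the <span ssml:ph="beɪs">bass</span>.</p>
  </body>
</html>"""

root = ET.fromstring(doc)

# The phonetic alphabet is declared once on the root element.
alphabet = root.get(f"{{{SSML_NS}}}alphabet")

# Collect the pronunciation attached to each annotated element.
pronunciations = {}
for element in root.iter():
    ph = element.get(f"{{{SSML_NS}}}ph")
    if ph is not None:
        pronunciations[element.text] = ph

print(alphabet, pronunciations)
```

No new elements are introduced, so the element tree is the same with or without the annotations; a consumer that does not understand the attributes can ignore them.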

Neil: Noticed that Amazon supports SSML for Alexa skills—and they have a
lot of documentation on this.

Mark: *is developing a skill that demos some issues; this is featured in
the Pronunciation video*
... a lot of skills pick up content from Wikipedia. Would be great to be
able to pass through the pronunciation info from that content.

Janina: what might our next steps be?

Rossen: One clear action is to engage with the community that works on
digital assistants and similar devices and gauge their interest in the
space. This should provide a lot of additional prior art and/or
opinions, and could generate a lot of interest.

Janina: Sounds good
... summary: there are challenges but also opportunities; important to
bring the wider community along.

Rossen: next steps for TAG?

Janina: (separate issue) Personalization TF has a similar request (for
different reasons) around a prefix.
... *APA to respond to TAG concerns and seek further discussion*
... Thanks TAG

Summary of Action Items

Summary of Resolutions

--
Matthew Tylee Atkinson
--
Senior Accessibility Engineer
The Paciello Group
https://www.paciellogroup.com
A Vispero Company
https://vispero.com

Received on Monday, 19 October 2020 09:34:34 UTC