[voiceinteraction] minutes group meeting October 27 (TPAC 2021) from Deborah Dahl on 2021-10-27 (public-voiceinteraction@w3.org from October 2021)

From: Deborah Dahl <Dahl@conversational-Technologies.com>
Date: Wed, 27 Oct 2021 14:39:25 -0400
To: <public-voiceinteraction@w3.org>
Message-ID: <084f01d7cb61$f7e364a0$e7aa2de0$@conversational-Technologies.com>

https://www.w3.org/2021/10/27-voiceinteraction-minutes.html
and below as text.

[1]W3C

[1] https://www.w3.org/

- DRAFT -
voice interaction

27 October 2021

[2]IRC log.

[2] https://www.w3.org/2021/10/27-voiceinteraction-irc

Attendees

Present
bev, debbie, dirk, kazuyuki, mustaq ahmed, paul grenier

Regrets
-

Chair
Debbie

Scribe
ddahl

Contents

1. [3]Breakout feedback and expected workshop
2. [4]Architecture document

Meeting minutes

Breakout feedback and expected workshop

<PaulG_> [5]https://www.w3.org/TR/spoken-html/

[5] https://www.w3.org/TR/spoken-html/

[6]https://lists.w3.org/Archives/Public/public-voiceinteraction/2021Oct/0012.html

[6] https://lists.w3.org/Archives/Public/public-voiceinteraction/2021Oct/0012.html

debbie: review discussion from last week's breakout groups

[7]https://web-eur.cvent.com/event/2b77fe3d-2536-467d-b71b-969b2e6419b5/websitePage:efc4b117-4ea4-4be5-97b4-c521ce3a06db

[7] https://web-eur.cvent.com/event/2b77fe3d-2536-467d-b71b-969b2e6419b5/websitePage:efc4b117-4ea4-4be5-97b4-c521ce3a06db

<kaz> [8]https://www.w3.org/2021/10/20-voice-minutes.html

[8] https://www.w3.org/2021/10/20-voice-minutes.html

<kaz> [9]https://www.w3.org/2021/10/19-voice-minutes.html

[9] https://www.w3.org/2021/10/19-voice-minutes.html

debbie: possibility of a voice workshop

kaz: how to integrate speech API and SSML in a workshop
. organized session with voice interoperability session

kaz: decided to have a workshop, not a voice workshop but a
smart agent workshop
. interoperability, voice interface, accessibility
. some overlap with semantic web? is that too broad?
. when we talk about smart agents
. one or two days, online

kaz: online workshop is much easier

<Bev> Perhaps hybrid online and in person?

kaz: usually takes six months or so, around May

<Bev> Include the Cognitive Inclusion COGA group

bev: could also do a hybrid event
. cognitive inclusion group has some overlap

<Bev> Information Architecture Community Group is also
supportive and can participate

kaz: should have a dedicated session on accessibility

debbie: to attend, participants need to prepare a position
paper, which the program committee will review

<Bev> anyone interested can prepare submission position
proposal to program committee

<kaz> [10]e.g., Smart Cities Workshop CfP

[10] https://www.w3.org/2021/06/smartcities-workshop/index.html

debbie: prerecorded videos with captions
. need to be provided

debbie: other topics like Open Voice Network
. could be included

paul: disambiguation in Spoken HTML spec, machine learning has
its own heuristics, but in the meantime author-controlled
pronunciation would be useful

paul: trying to get feedback from implementers, can't just
bring SSML into HTML
. will have some representation of SSML into HTML, especially
pronunciation
. could use this in machine learning

paul: word clusters could be modified by IPA (here, the
International Phonetic Alphabet)
. a layer could map pronunciation to IPA
. and match to user's intent
. language, cultural information is missing
. on input, a speech difficulty, for example, acts like a
transform over the standard language
. we can transform from the word or from the sound
. the user could have had a stroke or something else that
altered their speech

bev: iPads for elderly after dental surgery
. speech was different
. could we use this to transform speech

paul: for Spoken HTML this is the first step
. if the system doesn't find a match it could look for
transforms
. could be useful in a kiosk situation where user can't add
their preferences
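
The fallback Paul describes (try a direct match first, then look for transforms) can be sketched roughly as follows. This is a minimal illustration, not anything from the Spoken HTML draft: the command table, the ASCII phoneme symbols (stand-ins for International Phonetic Alphabet symbols), and both transform functions are hypothetical.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical command vocabulary: space-separated phonemes -> command name.
# ASCII symbols are used here as stand-ins for real phonetic symbols.
COMMANDS: Dict[str, str] = {
    "s t aa p": "stop",
    "p l ey": "play",
}


def devoice_d(phonemes: str) -> str:
    """Hypothetical transform: treat every 'd' as an intended 't'."""
    return " ".join("t" if p == "d" else p for p in phonemes.split())


def restore_initial_s(phonemes: str) -> str:
    """Hypothetical transform: put back a dropped leading 's'."""
    return "s " + phonemes


TRANSFORMS: List[Callable[[str], str]] = [devoice_d, restore_initial_s]


def match_command(
    heard: str, transforms: List[Callable[[str], str]]
) -> Optional[str]:
    """Return the matched command name, or None if nothing matches."""
    # Step 1: try the phoneme sequence exactly as it was heard.
    if heard in COMMANDS:
        return COMMANDS[heard]
    # Step 2: no direct match, so try each transform in turn.
    for transform in transforms:
        candidate = transform(heard)
        if candidate in COMMANDS:
            return COMMANDS[candidate]
    return None
```

In a kiosk setting, where the user cannot store preferences, the transform list would have to be generic rather than per-user; this sketch simply tries each transform once and does not chain them.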

kaz: two points, one for speech synthesis and one for speech
recognition
. for speech output it would be nice to have another layer to
get correct pronunciation

<Bev> Kaz: acoustic model

kaz: for speech input, we might want to include another
mechanism

<Bev> Kaz: command input expected actions, speech and gesture

kaz: such as hardware switch, gesture

debbie: also Natural Language Interfaces spec

<kaz> kaz: btw, it would be really nice if you all by chance
could join the Program Committee for the expected workshop :)

debbie: can join the program committee

paul: maybe could join

bev: could join program committee
. depends on timing

Architecture document

architecture document [11]https://w3c.github.io/voiceinteraction/voice%20interaction%20drafts/paArchitecture-1-2.htm

[11] https://w3c.github.io/voiceinteraction/voice%20interaction%20drafts/paArchitecture-1-2.htm

In the architecture document, IPA means "intelligent personal
assistant"

dirk: (reviews input architecture)
. provider selection strategies can be used to select providers

dirk: (goes through output path)

bev: question about intent sets
. could you talk about that a little more?

dirk: information that could be used to fill in slots

bev: is that a standard?

dirk: for now this is pretty abstract

bev: would that include security information?

dirk: thinking in terms of SISR, more like that
. have to distinguish between local intent sets and provider
intent sets
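
Since Dirk notes that intent sets are still abstract, the following is only one possible reading of the exchange above: an intent set as a named collection of intents, each listing the slots it can fill, with local and provider sets kept distinct. All class, function, and field names are hypothetical, and checking the local set before the provider set is an assumption of this sketch, not something stated in the discussion.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Intent:
    name: str
    slots: List[str]  # names of the slots this intent can fill


@dataclass
class IntentSet:
    source: str  # "local" or a provider name (hypothetical labels)
    intents: Dict[str, Intent] = field(default_factory=dict)

    def add(self, intent: Intent) -> None:
        self.intents[intent.name] = intent


def find_intent(
    name: str, local: IntentSet, provider: IntentSet
) -> Optional[Tuple[str, Intent]]:
    """Look up an intent by name; returns (source, intent) or None.

    Assumption: the local intent set takes priority over the provider's.
    """
    for intent_set in (local, provider):
        if name in intent_set.intents:
            return intent_set.source, intent_set.intents[name]
    return None


def fill_slots(
    intent: Intent, values: Dict[str, str]
) -> Dict[str, Optional[str]]:
    """Fill the intent's slots from the given values; unfilled slots are None."""
    return {slot: values.get(slot) for slot in intent.slots}
```

Keeping the source label on each set makes the local versus provider distinction explicit when the same intent name appears in both.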

debbie: Emotion ML

debbie: could be used in input and output

kaz: don't have any specific comments, should discuss with
browser and speech vendors
. should present at workshop
. EMMA would be a good format for all this data

kaz: would like to integrate MMI architecture and SCXML for
interaction management with WoT standards for device management
. the DID (Decentralized Identifiers) standard has many
implementers, is based on blockchain, and should become a
Recommendation soon
. that can be used to identify users and devices, also
discovery can be handled this way

debbie: next call will be November 10

Received on Wednesday, 27 October 2021 18:39:40 UTC