[voiceinteraction] minutes group meeting October 27 (TPAC 2021)

https://www.w3.org/2021/10/27-voiceinteraction-minutes.html
and below as text.

   [1]W3C

      [1] https://www.w3.org/

                             - DRAFT -
                           voice interaction

27 October 2021

   [2]IRC log.

      [2] https://www.w3.org/2021/10/27-voiceinteraction-irc

Attendees

   Present
          bev, debbie, dirk, kazuyuki, mustaq ahmed, paul grenier

   Regrets
          -

   Chair
          Debbie

   Scribe
          ddahl

Contents

    1. [3]Breakout feedback and expected workshop
    2. [4]Architecture document

Meeting minutes

  Breakout feedback and expected workshop

   <PaulG_> [5]https://www.w3.org/TR/spoken-html/

      [5] https://www.w3.org/TR/spoken-html/

   [6]https://lists.w3.org/Archives/Public/
   public-voiceinteraction/2021Oct/0012.html

      [6] https://lists.w3.org/Archives/Public/public-voiceinteraction/2021Oct/0012.html

   debbie: review discussion from last week's breakout groups

   [7]https://web-eur.cvent.com/event/
   2b77fe3d-2536-467d-b71b-969b2e6419b5/
   websitePage:efc4b117-4ea4-4be5-97b4-c521ce3a06db

      [7] https://web-eur.cvent.com/event/2b77fe3d-2536-467d-b71b-969b2e6419b5/websitePage:efc4b117-4ea4-4be5-97b4-c521ce3a06db

   <kaz> [8]https://www.w3.org/2021/10/20-voice-minutes.html

      [8] https://www.w3.org/2021/10/20-voice-minutes.html

   <kaz> [9]https://www.w3.org/2021/10/19-voice-minutes.html

      [9] https://www.w3.org/2021/10/19-voice-minutes.html

   debbie: possibility of a voice workshop

   kaz: how to integrate the speech API and SSML in a workshop
   . organized a session together with the voice interoperability
   session

   kaz: decided to have a workshop, not a voice workshop but a smart
   agent workshop
   . topics: interoperability, voice interfaces, accessibility
   . some overlap with the semantic web? is that too broad when we
   talk about smart agents?
   . one or two days, online

   kaz: online workshop is much easier

   <Bev> Perhaps hybrid online and in person?

   kaz: organizing usually takes six months or so, so around May

   <Bev> Include the Cognitive Inclusion COGA group

   bev: could also do a hybrid event
   . cognitive inclusion group has some overlap

   <Bev> Information Architecture Community Group is also
   supportive and can participate

   kaz: should have a dedicated session on accessibility

   debbie: to attend, participants need to prepare a position paper,
   which the program committee will review

   <Bev> anyone interested can prepare a position proposal for
   submission to the program committee

   <kaz> [10]e.g., Smart Cities Workshop CfP

     [10] https://www.w3.org/2021/06/smartcities-workshop/index.html

   debbie: prerecorded videos with captions
   . need to be provided

   debbie: other topics like Open Voice Network
   . could be included

   paul: disambiguation in the Spoken HTML spec; machine learning has
   its own heuristics, but in the meantime author-controlled
   pronunciation would be useful

   paul: trying to get feedback from implementers; can't just bring
   SSML into HTML
   . will have some representation of SSML in HTML, especially
   pronunciation
   . could use this in machine learning
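
   A minimal sketch of the author-controlled pronunciation idea
   above, assuming an illustrative attribute-based approach (a
   data-ssml authoring attribute) and the Web Speech API; SSML
   handling varies by synthesis engine, so the hint is only surfaced
   here rather than passed through:

     // Speak an element's text, surfacing any author-supplied
     // pronunciation hint (illustrative data-ssml attribute).
     function speakElement(el: HTMLElement): void {
       const hint = el.getAttribute("data-ssml"); // illustrative authoring attribute
       const text = el.textContent ?? "";
       const utterance = new SpeechSynthesisUtterance(text);
       if (hint) {
         // An engine-specific layer could translate the hint (e.g. an
         // IPA phoneme string) into whatever pronunciation control the
         // synthesizer actually supports.
         console.log("pronunciation hint:", hint);
       }
       speechSynthesis.speak(utterance);
     }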

   paul: word clusters could be modified by IPA (International
   Phonetic Alphabet)
   . a layer could map pronunciation to IPA
   . and match it to the user's intent
   . language and cultural information is missing
   . when input happens, e.g. a speech difficulty is like a transform
   over the standard language
   . we can transform from the word or from the sound
   . they could have had a stroke or something that altered their
   speech
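
   A minimal sketch of the matching layer Paul describes: a lexicon
   maps words to IPA strings, and a recognized (possibly altered)
   spoken form is matched to the closest entry before intent
   matching; lexicon contents and the distance measure are
   illustrative only:

     // Illustrative lexicon: orthographic word -> IPA transcription.
     const lexicon: Record<string, string> = {
       water: "ˈwɔːtər",
       weather: "ˈwɛðər",
     };

     // Plain Levenshtein distance over IPA characters.
     function editDistance(a: string, b: string): number {
       const d = Array.from({ length: a.length + 1 }, (_, i) =>
         Array.from({ length: b.length + 1 }, (_, j) =>
           i === 0 ? j : j === 0 ? i : 0));
       for (let i = 1; i <= a.length; i++) {
         for (let j = 1; j <= b.length; j++) {
           d[i][j] = Math.min(
             d[i - 1][j] + 1,
             d[i][j - 1] + 1,
             d[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1));
         }
       }
       return d[a.length][b.length];
     }

     // Map an observed pronunciation to the closest known word.
     function closestWord(spokenIpa: string): string {
       return Object.entries(lexicon)
         .map(([word, ipa]) => ({ word, dist: editDistance(spokenIpa, ipa) }))
         .sort((x, y) => x.dist - y.dist)[0].word;
     }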

   bev: iPads for elderly people after dental surgery
   . their speech was different
   . could we use this to transform the speech?

   paul: for Spoken HTML this is the first step
   . if the system doesn't find a match it could look for transforms
   . could be useful in a kiosk situation where the user can't add
   their preferences

   kaz: two points, one for speech synthesis and one for speech
   recognition
   . for speech output it would be nice to have another layer to
   get correct pronunciation

   <Bev> Kaz: acoustic model

   kaz: for speech input, we might want to include another
   mechanism

   <Bev> Kaz: command input expected actions, speech and gesture

   kaz: such as a hardware switch or a gesture

   debbie: also Natural Language Interfaces spec

   <kaz> kaz: btw, it would be really nice if you all by chance
   could join the Program Committee for the expected workshop :)

   debbie: can join the program committee

   paul: maybe could join

   bev: could join program committee
   . depends on timing

  Architecture document

   architecture document [11]https://w3c.github.io/
   voiceinteraction/voice%20interaction%20drafts/
   paArchitecture-1-2.htm

     [11] https://w3c.github.io/voiceinteraction/voice interaction drafts/paArchitecture-1-2.htm

   IPA means "intelligent personal assistant"

   dirk: (reviews the input architecture)
   . provider selection strategies can be used to select among IPA
   providers

   dirk: (goes through output path)

   bev: question about intent sets
   . could you talk about that a little more

   dirk: information that could be used to fill in slots

   bev: is that a standard?

   dirk: for now this is pretty abstract

   bev: would that include security information

   dirk: thinking in terms of SISR (Semantic Interpretation for
   Speech Recognition), more like that
   . have to distinguish between local intent sets and provider
   intent sets
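
   A minimal sketch of what one entry in an intent set with slots
   might look like; the architecture document keeps this abstract, so
   the names and shape below are purely illustrative:

     // Illustrative intent-with-slots shape.
     interface Intent {
       name: string;                          // e.g. "BookFlight"
       slots: Record<string, string | null>;  // slot name -> value, or null if unfilled
     }

     const example: Intent = {
       name: "BookFlight",
       slots: { origin: "Berlin", destination: null, travelDate: null },
     };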

   debbie: EmotionML (Emotion Markup Language)

   debbie: could be used in input and output

   kaz: don't have any specific comments; should discuss with browser
   and speech vendors
   . should present this at the workshop
   . EMMA (Extensible MultiModal Annotation) would be a good format
   for all this data
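
   A minimal sketch of an EMMA 1.0 document wrapping a single
   recognition result (values and the application payload are
   illustrative); EMMA annotates the interpretation with confidence,
   medium, and mode:

     const emmaExample = `
     <emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
       <emma:interpretation id="int1"
           emma:confidence="0.82" emma:medium="acoustic" emma:mode="voice">
         <intent name="BookFlight">
           <origin>Berlin</origin>
         </intent>
       </emma:interpretation>
     </emma:emma>`;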

   kaz: would like to integrate the MMI architecture and SCXML for
   interaction management with WoT standards for device management
   . the DID (decentralized identifier) standard has many
   implementers, is based on blockchain, and should become a
   Recommendation soon
   . DIDs can be used to identify users and devices; discovery can
   also be handled this way
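
   A minimal sketch of a DID document (illustrative values) showing
   how a user or device could be identified and how a service
   endpoint could support discovery, following the DID Core data
   model; the service type below is made up for illustration:

     // Illustrative DID document for a device or user agent.
     const didDocument = {
       "@context": "https://www.w3.org/ns/did/v1",
       id: "did:example:123456789abcdefghi",
       service: [
         {
           id: "did:example:123456789abcdefghi#ipa",
           type: "IntelligentPersonalAssistant",  // illustrative service type
           serviceEndpoint: "https://assistant.example.com/ipa",
         },
       ],
     };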

   debbie: next call will be November 10

Received on Wednesday, 27 October 2021 18:39:40 UTC