[voiceinteraction] minutes April 23, 2025

https://www.w3.org/2025/04/23-voiceinteraction-minutes.html
and below as text
   [1]W3C

      [1] https://www.w3.org/

                             - DRAFT -
                           Voice Interaction

23 April 2025

   [2]IRC log.

      [2] https://www.w3.org/2025/04/23-voiceinteraction-irc

Attendees

   Present
          debbie, dirk, gerard, kaz

   Regrets
          -

   Chair
          debbie

   Scribe
          dadahl

Contents

    1. [3]GitHub issue 65, compare semantic representations
    2. [4]workshop

Meeting minutes

  GitHub issue 65, compare semantic representations

   JROSI features

   n-best list with start and stop times, plus "medium" and
   "mode"; medium possibly not needed

   version

   multiple interpretations

   every interpretation has tokens, id and confidence
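
   A minimal sketch of how these features might fit together in a
   JROSI-style result; the exact field names and values here are
   assumptions based on the list above, not taken from a
   specification.

```python
import json

# Hypothetical JROSI-style recognition result, assembled from the
# features listed above: a version, "medium" and "mode", and an
# n-best list with start/stop times, where each entry carries
# multiple interpretations with tokens, an id, and a confidence.
result = {
    "version": "1.0",
    "medium": "acoustic",  # possibly not needed, per the discussion
    "mode": "voice",
    "nbest": [
        {
            "start": 0.0,
            "stop": 1.8,
            "interpretations": [
                {"id": "int1", "tokens": "flights to boston", "confidence": 0.92},
                {"id": "int2", "tokens": "flights to austin", "confidence": 0.31},
            ],
        }
    ],
}

print(json.dumps(result, indent=2))
```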

   debbie: do we need all this semantic structure now, given LLMs?

   dirk: is EMMA really suited to cover LLM systems?

   debbie: we don't need all this semantic structure today

   dirk: maybe only a subset would be needed

   debbie: maybe not necessary for interactive systems

   dirk: the full EMMA spec would be overkill

   debbie: a lot of EMMA is optional

   debbie: we're talking about requirements for voice interaction

   dirk: we could use some EMMA metadata and multimodal
   information

   debbie: EMMA 1.0 and JROSI don't cover streaming

   debbie: should we include streaming?

   dirk: we should support streaming to an endpoint

   debbie: we could look for streaming in old EMMA 2.0 document
   and see if we could use that

   dirk: maybe use a subset

   debbie: work with Open Voice on streaming

   dirk: let's look at Open Voice

   [5]https://github.com/open-voice-interoperability/docs/blob/
   main/specifications/DialogEvents/1.0.2/
   InteropDialogEventSpecs.md

      [5] https://github.com/open-voice-interoperability/docs/blob/main/specifications/DialogEvents/1.0.2/InteropDialogEventSpecs.md

   timestamp, id, speakerUri

   debbie: looking at section 1.4 of Dialog Events
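
   A minimal sketch of a dialog event envelope using the fields
   mentioned above (timestamp, id, speakerUri); the "features"
   payload shown is an illustrative assumption, not quoted from the
   Dialog Events document.

```python
import json
from datetime import datetime, timezone

# Hypothetical dialog event with the three fields discussed:
# an id, a speakerUri identifying the speaker, and an ISO 8601
# timestamp. The text payload below is assumed for illustration.
event = {
    "id": "event-0001",
    "speakerUri": "tag:example.com,2025:user-42",
    "timestamp": datetime(2025, 4, 23, 15, 0, tzinfo=timezone.utc).isoformat(),
    "features": {
        "text": {
            "mimeType": "text/plain",
            "tokens": [{"value": "what is the weather today"}],
        }
    },
}

print(json.dumps(event, indent=2))
```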

  workshop

   <kaz> [6]Draft CfP

      [6] https://github.com/w3c/smartagents-workshop/blob/main/README.md

   <kaz> * should rather concentrate on voice interaction

   <kaz> * how to deal with use cases like connected cars?

   <kaz> * should mention what would be the impact to the Web
   platform

   kaz: brought the CfP to the W3C Strategy meeting

   kaz: avoid dealing with AI-based agents; concentrate on voice

   dirk: voice and smart agents overlap

   kaz: impact on the Web platform, but standards are not just for
   web browsers, for example web data
   . how to identify a person and their credentials

   dirk: doesn't this overlap with security?

   kaz: security is very important to smart agents

   dirk: can we include smart agents?

   kaz: will update proposal
   . core target should be voice and multimodal interaction

   kaz: will update proposal and take it back to W3C Strategy
   . will check on when to form PC

   kaz: can start to work on converting MD to HTML

   dirk: should wait to create a table of possible topics until
   the proposal is approved

   kaz: next strategy meeting will be in two weeks

Received on Wednesday, 23 April 2025 15:37:39 UTC