[voiceinteraction] minutes April 24, 2024

https://www.w3.org/2024/04/24-voiceinteraction-minutes.html
and below as text

Note that next time we'll continue this discussion and talk about the provider selection strategy and how to chain everything
together.

   [1]W3C

      [1] https://www.w3.org/

                             - DRAFT -
                           Voice Interaction

24 April 2024

   [2]Agenda. [3]IRC log.

      [2] https://lists.w3.org/Archives/Public/public-voiceinteraction/2024Apr/0010.html
      [3] https://www.w3.org/2024/04/24-voiceinteraction-irc

Attendees

   Present
          debbie, dirk, gerard, hugues

   Regrets
          -

   Chair
          debbie

   Scribe
          ddahl

Contents

    1. [4]reference implementation

Meeting minutes

  reference implementation

   [5]https://github.com/w3c/voiceinteraction/tree/master/source/
   w3cipa

      [5] https://github.com/w3c/voiceinteraction/tree/master/source/w3cipa

   dirk: reviewing the reference implementation
   . it supports ChatGPT and Mistral
   . the framework is mainly headers
   . there is a component that accesses ChatGPT, plus a demo
   program

   dirk reviews SOURCE.md

   input listener for modality inputs
   . in this case it just selects the first one
   . input goes to the ModalityManager
   . modality components can be added as you like
   . startInput and handleOutput
   . this is part of the framework, so Royalty Free
   . the modality type is a free string so it stays extensible
   . only text is implemented in the reference implementation
   . some modality components could be both input and output
   . a single instance that knows all listeners and that all
   modality components know about
   . looking at one example of a modality component, textModality
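
   A minimal sketch of the flow described above, using the names
   from the minutes (ModalityManager, startInput, handleOutput,
   free-string modality types); the signatures here are assumptions
   for illustration, not the actual w3cipa headers:

      #include <memory>
      #include <string>
      #include <vector>

      // The modality type is a free string so new modalities can be
      // added without changing the framework; only "text" is
      // implemented in the reference implementation.
      using ModalityType = std::string;

      // A modality component may act as input, output, or both.
      class ModalityComponent {
      public:
          virtual ~ModalityComponent() = default;
          virtual ModalityType getModality() const = 0;
          virtual void startInput() {}                          // input side
          virtual void handleOutput(const std::string& text) {} // output side
      };

      // A single instance that knows all modality components and
      // broadcasts startInput and handleOutput to them.
      class ModalityManager {
      public:
          // modality components can be added as you like
          void addModalityComponent(std::shared_ptr<ModalityComponent> c) {
              components.push_back(std::move(c));
          }
          void startInput() {
              for (auto& c : components) c->startInput();
          }
          void handleOutput(const std::string& text) {
              for (auto& c : components) c->handleOutput(text);
          }
      private:
          std::vector<std::shared_ptr<ModalityComponent>> components;
      };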

   debbie: can there be more than one InputModalityComponent?

   dirk: in theory, yes
   . we might have scaling issues with multiple text inputs, for
   example

   debbie: take "first" out of name
   "TakeFirstInputModalityComponent" to make it more general

   dirk: moving on to the DialogLayer and the IPA Service
   . the IPA Service covers both local IPAs and anything else we
   have
   . the ReferenceIPAService consumes data from the Client
   . it could serve multiple clients, or local as well as other
   IPA services
   . no DialogManager is in place yet
   . if there were one, the IPA service would send the input to
   it and then forward the output back to the client
   . next, the ExternalIPA/Provider Selection Service
   . the Provider Selection Service for now only knows about
   ChatGPT
   . an IPA provider supports input from different modalities
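
   A rough sketch of the path just described, with hypothetical
   minimal interfaces (the real w3cipa classes are richer), showing
   where a DialogManager would sit once one exists:

      #include <string>

      // For now the Provider Selection Service only knows ChatGPT.
      struct ProviderSelectionService {
          std::string process(const std::string& input) {
              return "ChatGPT answer to: " + input; // placeholder
          }
      };

      struct Client {
          void handleOutput(const std::string& output) {
              // render the answer to the user
          }
      };

      // Consumes data from the client and forwards the answer back;
      // it could also serve multiple clients.
      class ReferenceIPAService {
      public:
          explicit ReferenceIPAService(ProviderSelectionService& p)
              : providers(p) {}

          void processInput(Client& client, const std::string& input) {
              // No DialogManager is in place yet. With one, the input
              // would first be sent there, and the IPA service would
              // then forward its output back to the client.
              std::string output = providers.process(input);
              client.handleOutput(output);
          }

      private:
          ProviderSelectionService& providers;
      };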

   debbie: should we standardize on defined modality types, e.g.
   "voice" vs. "speech"?

   dirk: would like to talk about ProviderSelectionStrategy and
   how components are glued together

   debbie: we can talk more in the next call
   . could we list the parts of the architecture that aren't
   implemented yet?

   dirk: that might make sense

   debbie: could there be a UML diagram?

   dirk: there could be more diagrams
   . could link from code to specification

   dirk: next time talk about the provider selection strategy and
   how to chain everything together

   debbie: will try running it

   dirk: shows the demo running with ChatGPT

   gerard: which version of Mixtral do you use?
   . open source version

   hugues: the next version will not be open source

   gerard: the approach is a Mixture of Experts

   dirk: what happens if we ask both at the same time?
   . we would receive both answers

   gerard: could use an LLM to summarize
   . that's what Mixtral is using with the Mixture of Experts
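
   A sketch of the parallel-query idea discussed here; both provider
   calls and the summarize step are placeholders, the latter standing
   in for a real summarization request to an LLM:

      #include <future>
      #include <string>
      #include <vector>

      // Placeholder for a summarization step, e.g. another LLM call
      // that condenses several answers into one reply.
      std::string summarize(const std::vector<std::string>& answers) {
          std::string combined;
          for (const auto& a : answers) combined += a + "\n";
          return combined;
      }

      // Ask ChatGPT and Mistral at the same time and receive both.
      std::string askAll(const std::string& input) {
          auto chatgpt = std::async(std::launch::async,
              [&] { return "ChatGPT: " + input; });  // placeholder call
          auto mistral = std::async(std::launch::async,
              [&] { return "Mistral: " + input; });  // placeholder call
          return summarize({chatgpt.get(), mistral.get()});
      }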


    Minutes manually created (not a transcript), formatted by
    [6]scribe.perl version 221 (Fri Jul 21 14:01:30 2023 UTC).

      [6] https://w3c.github.io/scribe2/scribedoc.html

Received on Wednesday, 24 April 2024 17:34:40 UTC