- From: Deborah Dahl <Dahl@conversational-Technologies.com>
- Date: Wed, 24 Apr 2024 13:33:52 -0400
- To: <public-voiceinteraction@w3.org>
https://www.w3.org/2024/04/24-voiceinteraction-minutes.html
and below as text
Note that next time we'll continue this discussion and talk about the provider selection strategy and how to chain everything
together.
[1]W3C
[1] https://www.w3.org/
- DRAFT -
Voice Interaction
24 April 2024
[2]Agenda. [3]IRC log.
[2] https://lists.w3.org/Archives/Public/public-voiceinteraction/2024Apr/0010.html
[3] https://www.w3.org/2024/04/24-voiceinteraction-irc
Attendees
Present
debbie, dirk, gerard
Regrets
-
Chair
debbie
Scribe
ddahl
Contents
1. [4]reference implementation
Meeting minutes
reference implementation
[5]https://github.com/w3c/voiceinteraction/tree/master/source/w3cipa
[5] https://github.com/w3c/voiceinteraction/tree/master/source/w3cipa
dirk: review reference implementation
. ChatGPT and Mistral
. framework is mainly headers
. component that accesses GPT, also demo program
dirk reviews SOURCE.md
input listener for modality inputs
. in this case just selects the first one
. goes to ModalityManager
. can add modality components as you like
. startInput and handleOutput
. this is part of the framework, so Royalty Free
. modality type is a free string so it can be extensible
. only text is implemented in the reference implementation
. some modality components could be both input and output
. a single instance that knows all listeners and that all
modality components know about
. looking at one example of a modality component, textModality
debbie: can there be more than one InputModalityComponent?
dirk: in theory, yes
. we might have scaling issues with multiple text inputs, for
example
debbie: take "First" out of the name
"TakeFirstInputModalityComponent" to make it more general
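The flow just walked through can be sketched roughly as follows. This is an illustrative Python sketch only, not the actual w3cipa API (which is C++ headers); all class and method names here (ModalityManager, TakeFirstInputListener, start_input, handle_output) are assumptions modeled on the discussion above.

```python
class TextModalityComponent:
    """Component for the 'text' modality; the modality type is a free
    string, so other modalities can be added later."""
    modality = "text"

    def start_input(self):
        # In the real framework this would begin listening asynchronously;
        # here we just return a canned utterance.
        return "what is the weather like today?"

    def handle_output(self, output):
        # Some modality components handle both input and output.
        print(f"[{self.modality}] {output}")

class TakeFirstInputListener:
    """Input listener that simply selects the first input produced
    by any modality component (the 'take first' strategy)."""
    def select(self, inputs):
        return inputs[0] if inputs else None

class ModalityManager:
    """Single instance that knows all modality components and fans
    input/output across them."""
    def __init__(self):
        self.components = []

    def add_component(self, component):
        # Modality components can be added as you like.
        self.components.append(component)

    def start_input(self, listener):
        inputs = [c.start_input() for c in self.components
                  if hasattr(c, "start_input")]
        return listener.select(inputs)

    def handle_output(self, output):
        for c in self.components:
            if hasattr(c, "handle_output"):
                c.handle_output(output)

manager = ModalityManager()
manager.add_component(TextModalityComponent())
print(manager.start_input(TakeFirstInputListener()))
```

In this sketch, swapping in a different listener class would change the selection policy without touching the manager, which mirrors why a more general name than "TakeFirst" was suggested.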
dirk: moving on to DialogLayer, IPA Service
. IPA for both local and anything else we have
. ReferenceIPAService consumes data from Client
. could serve multiple clients, or handle cases where we have
local and other IPA services
. no DialogManager in place
. if there was one, the IPA service would send the input to it
and then after that the IPA service would forward the output
back to the client
. the ExternalIPA/Provider Selection Service
. the Provider Selection Service for now only knows about
ChatGPT
. IPA provider supports input from different modalities
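A minimal sketch of the dialog-layer flow just described: the IPA service consumes client input and, since no DialogManager is in place yet, hands it straight to a provider selection service that for now knows only one provider. Again, these names (ReferenceIPAService, ProviderSelectionService, ChatGPTProvider) are illustrative assumptions, not the real w3cipa classes.

```python
class ChatGPTProvider:
    """Stand-in for the external IPA provider; a real implementation
    would call out to an LLM API."""
    name = "ChatGPT"

    def process(self, text):
        return f"{self.name} response to: {text}"

class ProviderSelectionService:
    """For now only knows about a single provider; a real selection
    strategy would choose among several providers here."""
    def __init__(self, providers):
        self.providers = providers

    def process(self, text):
        return self.providers[0].process(text)

class ReferenceIPAService:
    """Consumes data from the client and forwards the output back.
    If a DialogManager were in place, input would be sent there first."""
    def __init__(self, selection_service):
        self.selection_service = selection_service

    def handle_client_input(self, text):
        return self.selection_service.process(text)

service = ReferenceIPAService(ProviderSelectionService([ChatGPTProvider()]))
print(service.handle_client_input("hello"))
```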
debbie: should we standardize on defined modality types, e.g.
"voice" vs. "speech"?
dirk: would like to talk about ProviderSelectionStrategy and
how components are glued together
debbie: we can talk more in the next call
. could we list the parts of the architecture that aren't
implemented yet?
dirk: that might make sense
debbie: could there be a UML diagram?
dirk: there could be more diagrams
. could link from code to specification
dirk: next time talk about the provider selection strategy and
how to chain everything together
debbie: will try running
dirk: demo running with ChatGPT
gerard: which version of Mixtral do you use?
. open source version
hugues: the next version will not be open source
gerard: the approach is mixture of experts
dirk: what happens if we ask both at the same time?
. would receive them both
gerard: could use an LLM to summarize
. that's what Mixtral is using with the Mixture of Experts
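The idea floated here, asking several providers at the same time and receiving both answers, could look roughly like the following sketch (provider names and the ask_all helper are hypothetical; a summarizing LLM step could then combine the collected responses):

```python
from concurrent.futures import ThreadPoolExecutor

def make_provider(name):
    # Stand-in provider: a real one would call ChatGPT or Mistral.
    def process(text):
        return f"{name}: answer to '{text}'"
    return process

def ask_all(providers, text):
    # Fan the same input out to every provider in parallel and
    # gather all responses, preserving provider order.
    with ThreadPoolExecutor(max_workers=len(providers)) as pool:
        return list(pool.map(lambda p: p(text), providers))

responses = ask_all([make_provider("ChatGPT"), make_provider("Mistral")], "hi")
print(responses)
```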
Minutes manually created (not a transcript), formatted by
[6]scribe.perl version 221 (Fri Jul 21 14:01:30 2023 UTC).
[6] https://w3c.github.io/scribe2/scribedoc.html
Received on Wednesday, 24 April 2024 17:34:40 UTC