- From: Noreen Whysel <nwhysel@gmail.com>
- Date: Wed, 11 Jan 2023 14:08:37 -0500
- To: Deborah Dahl <Dahl@conversational-technologies.com>
- Cc: public-voiceinteraction@w3.org
Sorry to miss today. Just returned from vacation.

Noreen

> On Jan 11, 2023, at 1:11 PM, Deborah Dahl <Dahl@conversational-technologies.com> wrote:
>
> As updated during today's call.
> We will resume discussion during the next call, January 25, at section 3.2.4.5.
> During the next call, we will also discuss publishing the 1.0 version of the interfaces document
> (https://w3c.github.io/voiceinteraction/voice%20interaction%20drafts/paInterfaces/paInterfaces.htm)
>
>
> Implied Architecture Requirements
> January 11, 2022
>
> 1
> 1. Intelligent Personal Assistants (IPA's) MUST be able to provide general purpose information
> 2. Specialized virtual assistants MUST be able to provide enterprise-specific information
> 3. Specialized virtual assistants MAY be able to provide non-enterprise-specific information
> 4. IPA's SHOULD be able to perform transactions
> 5. Specialized assistants MUST be able to interoperate with general IPA's
> 6. IPA's SHOULD be able to execute operations in a user's environment
> 7. IPA's MUST be able to interact with users through voice or text (language?) or both.
>
> 2.1.1
> 1. IPA's MUST be able to transfer a partially completed task to another IPA
>
> 3.0 Architecture
>
> 1. The architecture SHOULD support question answering and information retrieval applications
> 2. The architecture SHOULD support executing local services to accomplish tasks
> 3. The architecture SHOULD support executing remote services to accomplish tasks
> 4. The architecture MUST support dynamically adding local and remote services or knowledge sources.
> 5. It MUST be possible to forward requests from one IPA to another with the same architecture
> 6. It MUST be possible to forward requests or partial requests from one IPA to another with the same architecture, omitting the client layer
> 7. IPA extensions MAY be selected from a standardized marketplace
> 8. IPA's MAY include a Client layer
> 9. IPA's MUST include a Dialog layer
> 10. IPA's MAY include an API/Data layer
> 11. Components MAY be shifted to other layers as needed (need to clarify with Dirk)?
>
> 3.1
> 1. The Client layer MAY include a microphone
> 2. The Client layer MAY include a means for text input
> 3. The Client layer MAY include a speaker
> 4. The Client layer MAY include a display
> 5. Additional (non-speech) output modalities MAY be employed to render output or to capture input
>
> 3.1.3
>
> 1. The IPA Client MUST allow activation and deactivation by means of a Client Activation Strategy.
> 2. As an extension IPA Clients MAY also capture input via text and output text.
> 3. As an extension IPA Clients MAY also capture input from various modality recognizers.
> 4. As an extension IPA Clients MAY also capture contextual information, e.g., location, time, environmental sounds or other inputs that it obtains from Local Data Providers.
> 5. As an extension an IPA Client MAY also receive commands to be executed locally in the Local Services.
> 6. As an extension an IPA Client MAY also receive multimodal output to be rendered by a respective modality synthesizer.
> 7. IPA Clients MAY reference a session identifier.
> 8. Accessibility to be discussed
>
> 3.2.2.1
> 1. The IPA Client MUST be activated with a Client Activation Strategy
> 2. The Client Activation Strategy MAY be push-to-talk
> 3. The Client Activation Strategy MAY be hotword
> 4. The Client Activation Strategy MAY be triggered by an interpreted text string (either from audio or text)
> 5. The Client Activation Strategy MAY be a change in environment
> 6. The Client Activation Strategy MAY be triggered by a script or environmental condition
> 7. The Client Activation Strategy MAY be a different strategy not enumerated here
>
> 3.2.2.2
> 1. The IPA Client MUST include a Local Service Registry
> 2. The Local Service Registry MUST maintain a list of Local Services
> 3. The Local Service Registry MUST maintain a list of Local Data Providers
>
> 3.2 Dialog Layer
>
> 3.2.1 IPA Service
> 1. The IPA Client SHOULD forward audio data and metadata (if any) to the IPA Service
> 2. The IPA Client MAY forward audio data and metadata (if any) to the Dialog Manager
> 3. The IPA Service MUST forward audio data and metadata (if any) to the Dialog Manager
> 4. The IPA Service MUST forward audio data and metadata (if any) to the Local IPA
> 5. The IPA Service MUST forward text data and metadata (if any) to the Dialog Manager
> 6. The IPA Service MUST forward text data and metadata (if any) to the Local IPA
> 7. The IPA Service MUST forward multimodal data and metadata (if any) to the Dialog Manager
> 8. The IPA Service MUST forward multimodal data and metadata (if any) to the Local IPA
>
> 9. The IPA Service MUST forward audio output from the TTS to the IPA Client
> 10. The IPA Service MUST forward multimodal output from the Dialog Manager to the modality renderers
> 11. The IPA Service MUST forward text output from the NLG to the IPA Client
>
> 3.2.2 ASR
> 1. The ASR MUST generate one or more recognition hypotheses from voice input that it receives from the IPA Service
> 2. The ASR MAY associate recognition hypotheses with confidence scores
> 3. The ASR MUST forward the recognition hypotheses to the NLU
> 4. The ASR MAY update the History with the recognition hypotheses
>
> 3.2.3 NLU
>
> 1. The NLU MUST extract textual interpretations from text strings (either from audio or text)
> 2. The NLU MAY extract multiple interpretations from input text strings (either from audio or text)
> 3. The NLU MUST be able to interpret input Core Intent Sets
> 4. The NLU MUST be able to interpret spoken activation strategies that require interpretation, if they exist
> 5. The NLU MAY make use of the Core Data Provider to access local or internal data or access external services. (revisit Core Data Provider, are we still using that?)
> 6. The NLU MAY make use of the Context to check for complementary information such as information in the history or knowledge
> 7. The NLU MUST forward the semantic interpretation of the input to the Dialog Manager
> 8. The NLU MAY associate statistical confidences with interpretations
> 9. The NLU MAY extract emotion, intention, or sentiment from text strings (either from audio or text)
>
>
> 3.2.4 Dialog Manager
> 1. The Dialog Manager MUST recognize when the user goals are changed
> 2. The Dialog Manager SHOULD confirm when the user goals are changed
> 3. The Dialog Manager MAY consider ongoing workflows that must not be interrupted when the user switches goals.
> 4. The Dialog Manager SHOULD update the History with dialog moves
> 5. The Dialog Manager MUST receive the next dialog move as output from the selected Dialog or the IPA Service
>
> 6. ??The Dialog Manager makes use of the NLG to generate audio data to be rendered on the IPA Client
>    a. This should be "generate text" I think
>
> 7. The Dialog Manager MAY provide commands to be executed by the IPA Client or the External Services
>
> 3.2.5 Context
> 1. The Context MAY make use of the Local Service Registry to include external knowledge from Local Data Providers
> 2. The Context MAY make use of the Provider Selection Service to include external knowledge from Data Providers
> 3. The Context MAY provide external knowledge temporarily to the Knowledge Graph to be considered in reasoning.
>
> 3.2.5.1 History
> 1. The Dialog History MAY store the past dialog events per user.
>
>
> 3.3 API's/Data Layer
>
> 2. The Provider Selection Service MAY receive input from the Dialog Manager to query data from Data Providers.
> 3. The Provider Selection Service MAY receive input from the Dialog Manager to execute External Services.
> 4. If the Provider Selection Service is called with a preselected identifier of an IPA provider, it MUST use the preselected provider
> 5. If the Provider Selection Service is not called with a preselected identifier of an IPA provider, the Provider Selection Service MUST follow a Provider Selection Strategy to determine those IPA Providers that are best suited to answer the request.
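
For illustration only, the two Provider Selection Service rules at the end of section 3.3 (use the preselected provider if one is given, otherwise follow a Provider Selection Strategy) might be sketched as below. This is a rough sketch, not something taken from the requirements or the interfaces document: the class names, the select() method, and the keyword-matching strategy are all hypothetical and only stand in for whatever a real strategy would be.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass(frozen=True)
class IPAProvider:
    # Hypothetical record for an IPA provider; the field names are assumptions.
    provider_id: str
    topics: frozenset


# A strategy maps a request plus the known providers to the suitable providers.
Strategy = Callable[[str, Sequence[IPAProvider]], List[IPAProvider]]


def keyword_strategy(request: str, providers: Sequence[IPAProvider]) -> List[IPAProvider]:
    # Toy strategy (an assumption): keep providers sharing a word with the request.
    words = set(request.lower().split())
    return [p for p in providers if words & p.topics]


class ProviderSelectionService:
    def __init__(self, providers: Sequence[IPAProvider],
                 strategy: Strategy = keyword_strategy) -> None:
        self._providers = {p.provider_id: p for p in providers}
        self._strategy = strategy

    def select(self, request: str,
               preselected_id: Optional[str] = None) -> List[IPAProvider]:
        if preselected_id is not None:
            # Rule 4: a preselected provider MUST be used as-is.
            # An unknown identifier raises KeyError in this sketch.
            return [self._providers[preselected_id]]
        # Rule 5: otherwise the Provider Selection Strategy decides.
        return self._strategy(request, list(self._providers.values()))


providers = [
    IPAProvider("weather", frozenset({"weather", "rain"})),
    IPAProvider("bank", frozenset({"balance", "transfer"})),
]
service = ProviderSelectionService(providers)
by_strategy = service.select("will it rain today")
by_preselection = service.select("will it rain today", preselected_id="bank")
```

Passing the strategy in as a plain function keeps the MUST in rule 5 separate from any particular ranking method, which the requirements leave open.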
Received on Wednesday, 11 January 2023 19:08:52 UTC