RE: [voiceinteraction] some requirements derived from the Architecture document

I have written up a few requirements implied by the architecture document
(https://w3c.github.io/voiceinteraction/voice%20interaction%20drafts/paArchitecture-1-2.htm) for discussion during Wednesday's call.
 
The MUST, SHOULD, and MAY keywords used in the requirements are defined in RFC 2119 (https://www.ietf.org/rfc/rfc2119.txt).

Implied Architecture Requirements

1
Intelligent Personal Assistants (IPAs) MUST be able to provide general purpose information
Specialized virtual assistants MUST be able to provide enterprise-specific information
IPAs SHOULD be able to perform transactions
Specialized assistants MUST be able to interoperate with general IPAs
IPAs SHOULD be able to execute operations in a user's environment
IPAs MUST be able to interact with users through voice or text (language?) or both.

2.1.1
IPAs MUST be able to transfer a partially completed task to another IPA

3.0 Architecture

The architecture SHOULD support question answering and information retrieval applications
The architecture SHOULD support executing local services to accomplish tasks
The architecture SHOULD support executing remote services to accomplish tasks
The architecture MUST support dynamically adding local and remote services or knowledge sources.
It MUST be possible to forward requests from one IPA to another with the same architecture
It MUST be possible to forward requests from one IPA to another with the same architecture, omitting the client layer
IPA extensions MAY be selected from a standardized marketplace
IPAs MUST include a Client layer
IPAs MUST include a Dialog layer
IPAs MAY include an API/Data layer
Components MAY be shifted to other layers as needed

3.1 Client Layer
The Client layer MAY include a microphone
The Client layer MAY include a speaker
Additional (non-speech) output modalities MAY be employed to render output

3.1.3

The IPA Client MUST be activated by means of a Client Activation Strategy.
As an extension, IPA Clients MAY also capture input via text and output text.
As an extension, IPA Clients MAY also capture input from a specific modality recognizer.
As an extension, IPA Clients MAY also capture contextual information, e.g. location, that they obtain from Local Data Providers.
As an extension, an IPA Client MAY also receive commands to be executed locally by the Local Services.
As an extension, an IPA Client MAY also receive multimodal output to be rendered by a respective modality synthesizer.
IPA Clients MAY reference a session identifier.

3.2.2.1
The IPA Client MUST be activated with a Client Activation Strategy
The Client Activation Strategy MAY be push-to-talk
The Client Activation Strategy MAY be hotword
The Client Activation Strategy MAY be a change in environment
The Client Activation Strategy MAY be a different strategy not enumerated here
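
To make the activation requirements easier to discuss, here is a minimal Python sketch of a pluggable Client Activation Strategy. The class names, the should_activate method, and the event dictionary keys are assumptions made for illustration; none of them come from the architecture document.

```python
from typing import Protocol


class ClientActivationStrategy(Protocol):
    """Hypothetical interface: decides whether the IPA Client should activate."""

    def should_activate(self, event: dict) -> bool: ...


class PushToTalk:
    def should_activate(self, event: dict) -> bool:
        # Activate while the (hypothetical) talk button is reported as pressed.
        return event.get("button_pressed", False)


class Hotword:
    def __init__(self, hotword: str) -> None:
        self.hotword = hotword.lower()

    def should_activate(self, event: dict) -> bool:
        # Activate when a locally spotted keyword matches the configured hotword.
        return event.get("keyword", "").lower() == self.hotword


class IPAClient:
    def __init__(self, strategy: ClientActivationStrategy) -> None:
        # Mirrors the MUST: a client is always constructed with a strategy.
        self.strategy = strategy
        self.active = False

    def on_event(self, event: dict) -> bool:
        self.active = self.strategy.should_activate(event)
        return self.active
```

Any strategy not enumerated here would fit the same way, as another class with a should_activate method.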

3.2.2.2
The IPA Client MUST include a Local Service Registry
The Local Service Registry MUST maintain a list of Local Services
The Local Service Registry MUST maintain a list of Local Data Providers
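
As a rough illustration of the two lists the registry has to maintain, a Python sketch could look like the following. The method names and the use of plain callables for services and data providers are assumptions for the example only.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class LocalServiceRegistry:
    """Hypothetical registry kept by the IPA Client."""

    # Name -> callable that executes a command locally.
    local_services: Dict[str, Callable[..., Any]] = field(default_factory=dict)
    # Name -> callable that returns contextual data (e.g. location).
    local_data_providers: Dict[str, Callable[[], Any]] = field(default_factory=dict)

    def register_service(self, name: str, handler: Callable[..., Any]) -> None:
        self.local_services[name] = handler

    def register_data_provider(self, name: str, provider: Callable[[], Any]) -> None:
        self.local_data_providers[name] = provider

    def execute(self, name: str, *args: Any) -> Any:
        # Run a registered Local Service with the given arguments.
        return self.local_services[name](*args)

    def query(self, name: str) -> Any:
        # Ask a registered Local Data Provider for its current value.
        return self.local_data_providers[name]()
```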

3.2 Dialog Layer

3.2.1 IPA Service
The IPA Client SHOULD forward audio data and metadata (if any) to the IPA Service
The IPA Client MAY forward audio data and metadata (if any) to the Dialog Manager
The IPA Service MUST forward audio data and metadata (if any) to the Dialog Manager
The IPA Service MUST forward audio data and metadata (if any) to the Local IPA
The IPA Service MUST forward text data and metadata (if any) to the Dialog Manager
The IPA Service MUST forward text data and metadata (if any) to the Local IPA
The IPA Service MUST forward multimodal data and metadata (if any) to the Dialog Manager
The IPA Service MUST forward multimodal data and metadata (if any) to the Local IPA

The IPA Service MUST forward audio output from the TTS to the IPA Client
The IPA Service MUST forward multimodal output from the Dialog Manager to the modality renderers
The IPA Service MUST forward text output from the NLG to the IPA Client

3.2.2 ASR
The ASR MUST generate one or more recognition hypotheses from voice input that it receives from the IPA Service
The ASR MAY associate recognition hypotheses with confidence scores
The ASR MUST forward the recognition hypotheses to the NLU
The ASR MAY update the History with the recognition hypotheses
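
For discussion, this is one way the "one or more hypotheses, optionally with confidence scores" requirement might be represented in Python. The RecognitionHypothesis type and the rank_hypotheses helper are hypothetical, and ordering by confidence is an assumption about how a consumer would use the scores, not something the architecture document prescribes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RecognitionHypothesis:
    text: str
    confidence: Optional[float] = None  # MAY be absent


def rank_hypotheses(hypotheses: List[RecognitionHypothesis]) -> List[RecognitionHypothesis]:
    """Order hypotheses best-first; unscored ones keep their relative order at the end."""
    if not hypotheses:
        # Mirrors the MUST: the ASR produces at least one hypothesis.
        raise ValueError("the ASR MUST generate one or more recognition hypotheses")
    return sorted(
        hypotheses,
        key=lambda h: (h.confidence is None, -(h.confidence or 0.0)),
    )
```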

3.2.3 NLU

The NLU MUST extract interpretations from text strings
The NLU MUST be able to interpret Core Intent Sets
The NLU MAY make use of the Core Data Provider to access local or internal data or access external services.
The NLU MAY make use of the Context to check for complementary information
The NLU MUST forward the semantic input to the Dialog Manager
The NLU MAY generate multiple interpretations from input text strings
The NLU MAY associate confidences with interpretations

3.2.4 Dialog Manager
The Dialog Manager MUST fill in all known slots before prompting the user for additional slots
The Dialog Manager MUST select the best-suited input from the available input alternatives for further processing
The Dialog Manager MUST expect that the user may switch the goals at any time
The Dialog Manager MUST consider ongoing workflows that must not be interrupted
The Dialog Manager MAY update the History with dialog moves
The Dialog Manager MUST determine the Dialog that is best suited to serve the current user input
The Dialog Manager MUST receive the next dialog move as output from the selected Dialog or the IPA Service
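
The first requirement in this list (fill in all known slots before prompting) can be sketched roughly as below. The function name, the slot names, and the idea of looking in the NLU interpretation first and the Context second are assumptions for illustration.

```python
from typing import Dict, List, Optional, Tuple


def next_prompt(
    required_slots: List[str],
    interpretation: Dict[str, str],
    context: Dict[str, str],
) -> Tuple[Dict[str, str], Optional[str]]:
    """Fill every slot that is already known, then report the first one still missing."""
    filled: Dict[str, str] = {}
    for slot in required_slots:
        if slot in interpretation:
            # Value came from the current user input.
            filled[slot] = interpretation[slot]
        elif slot in context:
            # Value is complementary information from the Context.
            filled[slot] = context[slot]
    missing = [slot for slot in required_slots if slot not in filled]
    # Only prompt the user once nothing more can be filled automatically.
    return filled, (missing[0] if missing else None)
```

With required slots city, date, and time, an interpretation that carries the city and a Context that carries the date would leave only the time to be prompted for.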

??The Dialog Manager makes use of the NLG to generate audio data to be rendered on the IPA Client
	This should be "generate text" I think

The Dialog Manager MAY provide commands to be executed by the IPA Client or the External Services

3.2.5 Context
The Context MAY make use of the Local Service Registry to include external knowledge from Local Data Providers
The Context MAY make use of the Provider Selection Service to include external knowledge from Data Providers
The Context MAY provide external knowledge temporarily to the Knowledge Graph to be considered in reasoning.

3.2.5.1 History
The Dialog History MAY store the past dialog events per user.


3.3 APIs/Data Layer

The Provider Selection Service MAY receive input from the Dialog Manager to query data from Data Providers.
The Provider Selection Service MAY receive input from the Dialog Manager to execute External Services.
If the Provider Selection Service is called with a preselected identifier of an IPA provider, it MUST use the preselected provider
If the Provider Selection Service is not called with a preselected identifier of an IPA provider, the Provider Selection Service
MUST follow a Provider Selection Strategy to determine those IPA Providers that are best suited to answer the request.
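
The two conditional requirements above amount to a simple branch, sketched here in Python. The function names, the shape of the providers mapping, and the capability-matching strategy are all assumptions; the behavior for an unknown preselected identifier is not covered by the requirements and is shown as an error purely for illustration.

```python
from typing import Callable, Dict, List, Optional, Set

Strategy = Callable[[Dict[str, Set[str]], dict], List[str]]


def select_providers(
    providers: Dict[str, Set[str]],
    request: dict,
    preselected_id: Optional[str] = None,
    strategy: Optional[Strategy] = None,
) -> List[str]:
    if preselected_id is not None:
        # A preselected provider MUST be used as-is; no strategy is consulted.
        if preselected_id not in providers:
            raise KeyError(f"unknown IPA Provider: {preselected_id}")  # assumption
        return [preselected_id]
    if strategy is None:
        raise ValueError("a Provider Selection Strategy is required when no provider is preselected")
    # Otherwise the Provider Selection Strategy decides.
    return strategy(providers, request)


def by_capability(providers: Dict[str, Set[str]], request: dict) -> List[str]:
    """Example strategy (an assumption): keep providers that declare the requested intent."""
    return [pid for pid, intents in providers.items() if request["intent"] in intents]
```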

Received on Monday, 22 August 2022 20:32:34 UTC