[mmiws] Summary of the Workshop on W3C's Multimodal Architecture and Interfaces

The following is a summary of the MMI Architecture Workshop held
on November 16-17, 2007 in Japan.

---
Summary of the Workshop on W3C's Multimodal Architecture and
Interfaces

On 16-17 November the W3C Multimodal Interaction Working Group held a
Workshop on W3C's Multimodal Architecture and Interfaces in Fujisawa,
Japan, hosted by W3C/Keio.

The minutes of the workshop are available on the W3C Web server:
http://www.w3.org/2007/08/mmi-arch/minutes.html

There were 25 attendees from the following organizations:
* ACCESS
* Conversational Technologies
* Deutsche Telekom Laboratories
* T-Systems
* IBM
* INRIA
* KDDI R&D Laboratories
* Kyoto Institute of Technology
* Hewlett-Packard Labs India
* Intervoice Inc.
* Microsoft Windows Division
* Openstream Inc
* Opera Software
* Toyohashi University of Technology
* Polytechnic University
* University of Tampere
* W3C

The motivation of the W3C Multimodal Interaction Working Group for
holding the MMI Architecture Workshop included:

* There is a great need for multimodal input and output modes these
 days, especially on hand-held portable devices with small displays
 and small or nonexistent keypads.

* Accessibility to the Web must be extended so that users can
 dynamically select the most appropriate modes of interaction
 depending on:
   1. their condition
   2. their environment
   3. their modality preferences

* A general and flexible framework should be provided to guarantee
 application authors interoperability among modality-specific
 components from different vendors - for example, speech recognition
 from vendor A and handwriting recognition from vendor B.

This workshop was narrowly focused on identifying and prioritizing
requirements for extensions and additions to the MMI Architecture to
better support speech, GUI, Ink and other Modality Components. Topics
discussed during the Workshop included:

* Multimodal application authoring (Modality Component specific
 grammar, standard authoring approaches, synchronizing multiple
 modalities, error handling, etc.)

* Architecture design (latency of communication, integration with Web
 browsers, fusion/fission of data, device capability, higher level
 control language, etc.)

* User experience (accessibility, user information, application
 context, multiple users, etc.)

* Topics that need further clarification (role of Interaction Manager,
 application specific management, direct communication between
 modality components, etc.)

Through the workshop we generated a list of issues and requirements
concerning the current MMI Architecture:
http://www.w3.org/2007/08/mmi-arch/topics.html

The major "takeaways" are:

* Multimodal applications use various modalities including GUI,
 speech, handwriting, etc. Some Modality Components, e.g. those
 handling kinesthetic sensor input on mobile devices, need
 modality-specific grammars for converting user input into concrete
 events.

* Because all data must be communicated between the Interaction
 Manager and the Modality Components, communication latency may be
 problematic for real-time applications (illustrated in the sketch
 after this list).

* Integration of multimodal applications with ordinary Web browsers
 is a key question. Integrating them as a plug-in application might
 be a quick first option.

* The capabilities of each handset and the user's preferences should
 be available to the Interaction Manager so that the application can
 be adapted to both the handset and the user.
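
As a rough illustration of the first two takeaways, the short Python
sketch below (purely illustrative; the class names, event names and
grammar are hypothetical and not taken from the MMI Architecture
specification) shows a Modality Component applying a modality-specific
grammar to turn raw sensor input into a concrete event, and an
Interaction Manager through which every such event is routed - that
extra routing hop is where the latency concern arises.

  # Illustrative sketch only; all names here are hypothetical and are
  # not defined by the MMI Architecture specification.
  import time

  class InteractionManager:
      """Routes every event between Modality Components; this extra
      hop is where communication latency can become an issue."""
      def __init__(self):
          self.handlers = []

      def register(self, handler):
          self.handlers.append(handler)

      def dispatch(self, event):
          for handler in self.handlers:
              handler(event)

  class TiltModalityComponent:
      """Converts raw kinesthetic (tilt sensor) input into concrete
      events using a simple modality-specific 'grammar' (here, just a
      rule table)."""
      GRAMMAR = {"tilt-left": "PREVIOUS_PAGE", "tilt-right": "NEXT_PAGE"}

      def __init__(self, interaction_manager):
          self.im = interaction_manager

      def on_raw_input(self, raw):
          event_name = self.GRAMMAR.get(raw)
          if event_name is not None:
              # All data goes through the Interaction Manager, never
              # directly to another Modality Component.
              self.im.dispatch({"source": "tilt-sensor",
                                "event": event_name,
                                "timestamp": time.time()})

  if __name__ == "__main__":
      im = InteractionManager()
      im.register(lambda e: print("GUI component received:", e))
      sensor = TiltModalityComponent(im)
      sensor.on_raw_input("tilt-right")  # GUI component receives NEXT_PAGE

In a real deployment the dispatch would be a network message rather
than an in-process call, which is exactly why latency matters for
real-time modalities.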

The W3C Multimodal Interaction Working Group will review these new
topics and use the list as a guideline for future enhancements to the
MMI Architecture.

Deborah Dahl and Kazuyuki Ashimura, Workshop Co-chairs

-- 
Kazuyuki Ashimura / W3C Multimodal & Voice Activity Lead
mailto: ashimura@w3.org
voice: +81.466.49.1170 / fax: +81.466.49.1171
