- From: Al Gilman <Alfred.S.Gilman@IEEE.org>
- Date: Wed, 14 Dec 2005 10:25:53 -0500
- To: www-multimodal@w3.org
- Cc: wai-liaison@w3.org
<comments class="lastCall fromGroup">

<note class="inTransmittal">
Thank you for the opportunity to participate in the Last Call review for the EMMA specification. The following comments have reached rough consensus in the Protocols and Formats Working Group. We look forward to working with you where these comments are unclear or where you feel they expose areas that require further analysis.

Al
/chair, PFWG
</note>

1. We are concerned that, in an approach that focuses on input and output modalities that are "widely used today", Assistive Technology devices might be left out in practice. Although it seems theoretically possible to apply EMMA to all types of input and output devices (modalities), including Assistive Technology, the important question is "Who is going to write the device-specific code for Assistive Technology devices?" If this is outside the scope of EMMA, please let us know whom we should address with this question.

2. Adaptation to delivery context

2.1 System and environment

Composite input should provide environmental information. Since input is used to define a response, the system response should take into account environmental conditions, and those conditions should be captured at input time. Here are some examples:

- Signal-to-Noise Ratio (SNR)
- Lighting conditions
- Power changes (which may invalidate input or prompt the user to re-enter information)

In the case of a low SNR you might want to change the volume or pitch, or, if the system provides it, switch on captioning. Sustained SNR problems may call for noise cancellation to improve voice recognition. This information should be carried in EMMA structural elements. Some of these issues could be reflected in the confidence score, but a confidence factor alone provides no information about why the confidence level is low or how the system should adapt.
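To make this concrete, here is a rough sketch of the kind of annotation we have in mind. It is illustrative only: the env:* namespace and element names are invented for this example and are not proposed syntax, and we assume only the general shape of an EMMA interpretation plus some extensibility container (emma:info or similar) for carrying the extra data.

    <emma:emma version="1.0"
        xmlns:emma="http://www.w3.org/2003/04/emma"
        xmlns:env="http://example.org/2005/environment">
      <emma:interpretation id="interp1" emma:confidence="0.4">
        <destination>Boston</destination>
        <!-- illustrative only: environmental conditions captured at input time -->
        <emma:info>
          <env:snr unit="dB">6</env:snr>
          <env:lighting>low</env:lighting>
          <env:power-event>battery-low</env:power-event>
        </emma:info>
      </emma:interpretation>
    </emma:emma>

With something along these lines, a low emma:confidence value could be traced back to a low SNR, and the interaction manager could respond by raising output volume, switching on captioning, or applying noise cancellation, rather than simply re-prompting.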
2.2 User factors

How does the EMMA group plan to address user capabilities: at the EMMA input level, or somewhere else in the system? Example: I may have a hearing impairment, which changes the situation for me compared with another user. If multiple people are accessing a system, it may be important to identify the user and their specific capabilities for an adaptive response.

3. Settling time

How does this technology address settling time and multiple keys being hit? People with mobility impairments may press more than one key, inadvertently hit specific keys, or experience tremors, so the input needs to be smoothed. This may or may not affect confidence factors, but again the "why" question comes up. This information may need to be processed in the drivers.

4. Directional information

Should we have emma:directional information? Examples are right, left, up, down, end, top, north, south, east, west, next, previous. These could be used to navigate a menu with arrow keys, voice recognition, etc. They could also be used to navigate a map. This supports device independence and helps with intent-based events. "Into" and "out of" should be included to address navigation up and down the hierarchy of a document, as in DAISY. The device used to generate this information should be irrelevant. Start, stop, and reduce speed may also be worth adding; these higher-level navigation commands could be used to control a media player independently of the device.

5. Zoom

What about zoom out?

6. Device independence and keyboard equivalents

For the laptop/desktop class of client devices, there has been a "safe haven" input channel provided by the keyboard interface. Users who cannot control other input methods have assistive technologies that at least emulate the keyboard, and so full command of applications is required from the keyboard. Compare with Checkpoints 1.1 and 1.2 of the User Agent Accessibility Guidelines 1.0 [UAAG10].

[UAAG10] http://www.w3.org/TR/UAAG10-TECHS/guidelines.html#gl-device-independence

How does this MMI Framework support having the User Agent supply the user with alternate input bindings for unsupported modalities expected by the application? How will applications developed in this MMI Framework (EMMA applications) meet the "full functionality from keyboard" requirement, or what equivalent facilitation is supported?

7. Use cases

To make things more concrete, we have compiled the following use cases to be investigated by the MMI group as Assistive Technology use cases which might bear requirements beyond the typical mainstream use cases. We are willing to discuss these with you in more detail, with the goal of coming to a joint conclusion about their feasibility in EMMA.

(a) Input by switch. The user is using an on-screen keyboard and inputs each character by scanning over the rows and columns of the keys and hitting the switch for row and column selection. This takes significantly more time than the average user would take to type in the characters. Would this switch-based input be treated like any keyboard input (keyboard emulation)? If yes, could the author impose time constraints that would be a barrier to the switch user? Or, alternatively, would this use case require device-specific (switch-specific) code?

(b) Word prediction. Is there a way for word prediction programs to communicate with the interaction manager (or other pertinent components of the framework) in order to find out what input is expected from the user? For example, could a grammar that is used for parsing be passed on to a word prediction program in the front end?

(c) User overrides default output parameters. For example, voice output could be described in an application with EMMA and SSML. Can the user override (slow down or speed up) the speech rate of the speech output?

(d) WordAloud (http://www.wordaloud.co.uk/). This is a program that displays text one word at a time, in large letters on the screen, with accompanying speech output. How could this special output modality be accommodated with EMMA?

(e) Aspire Reader (http://www.aequustechnologies.com/). This is a DAISY reader and browser that also supports speech output, word highlighting, enhanced navigation, extra textual and auditory descriptions that explain the page outline and content as you go, alternative renderings such as stepping through key points of the content, and game-controller-style navigation. The alternative texts are aimed at the struggling student (for example, a new immigrant).

</comments>
Received on Wednesday, 14 December 2005 15:53:47 UTC