
RE: ETSI standards and guides referred to in today's COGA TF call

From: Deborah Dahl <dahl@conversational-technologies.com>
Date: Tue, 26 Apr 2016 11:57:27 -0400
To: "'Michael Pluke'" <Mike.Pluke@castle-consult.com>, "'public-cognitive-a11y-tf'" <public-cognitive-a11y-tf@w3.org>
Message-ID: <106a01d19fd4$55b5b570$01212050$@conversational-technologies.com>
Hi Mike, 

I didn't know about ETSI ETR 096. There are some very good and still
relevant ideas there (always provide help, describe a function before the
digit that invokes it, etc.), even though the document is from 1993. Thanks
for finding this. We should refer to this document in the voice issue paper.

I agree that there's no harm in recommending that systems implement ETSI ES
202 076, or at least the basic and digit subsets. However, I don't think
that's sufficient. In many cases, like the ones you point out, I'm sure
users will be able to guess the correct command. But I still maintain that
no one will consciously set out to learn them, especially not all 74 of them.
In real systems, it's much more user-friendly to implement the responses
that show up in actual user testing.  These could, of course, be added to
the ETSI ones (which only covered 85% of the responses they received in
their tests). So then, when someone says "that's correct" instead of "yes",
"quit" instead of "stop" or "end call" instead of "exit" they will be
understood. 
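
As a rough sketch of that idea, observed phrasings could simply be mapped
onto the standard commands. Everything named below is an illustrative
assumption: the three-command base set, the synonym table, and the function
names are hypothetical and are not taken from ES 202 076 or from any real
system.

```python
# Illustrative sketch only. The names and the tiny command set are
# hypothetical; they are not the ES 202 076 vocabulary.

# A small stand-in for the standard commands a system already accepts.
BASE_COMMANDS = {"yes", "stop", "exit"}

# Extra phrasings that (hypothetically) showed up in user testing,
# each mapped to the base command the user meant.
OBSERVED_SYNONYMS = {
    "that's correct": "yes",
    "quit": "stop",
    "end call": "exit",
}


def normalize(utterance):
    """Lower-case the recognized text and collapse repeated whitespace."""
    return " ".join(utterance.lower().split())


def resolve_command(utterance):
    """Return the base command for an utterance, or None if not understood."""
    text = normalize(utterance)
    if text in BASE_COMMANDS:
        return text
    return OBSERVED_SYNONYMS.get(text)
```

A result of None is the point where a user-friendly system would offer help
or a human operator rather than just repeating the prompt.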

I was unable to find any studies validating the ETSI command set in real
applications, which would be very interesting to see. However, in my
experience, in real, large-scale applications there will always be
legitimate user inputs that the developers didn't expect, and they need to
be accommodated in user-friendly systems.

So, basically, in my opinion, there's no harm in asking developers to
implement at least the basic commands, but it is unrealistic to think that
this alone would automatically result in usable systems, especially for
users with cognitive disabilities. I think it's much more important to
provide access to human backup.

I will also check with some professional voice user interface designers and
see what their experience has been with this standard.

Best,

Debbie

 

From: Michael Pluke [mailto:Mike.Pluke@castle-consult.com] 
Sent: Monday, April 25, 2016 7:39 PM
To: public-cognitive-a11y-tf
Subject: ETSI standards and guides referred to in today's COGA TF call

 

After much hunting through fading memory cells I managed to locate the
relevant ETSI document that I referred to in today's COGA TF call. It is:

 

ETSI ETR 096 "Human Factors (HF); Phone Based Interfaces (PBI); Human
factors guidelines for the design of minimum phone based user interface to
computer services":
http://www.etsi.org/deliver/etsi_etr/001_099/096/01_60/etr_096e01p.pdf

 

It is only a Technical Report and not a standard, but it was developed with
the involvement of the principal North American provider of such services in
those (long ago) days. It does not in general associate digits with
functions, but it does identify "0" as the preferred way to reach an
operator and this is widely, but by no means universally, implemented.

 

The other document that lists potential voice commands in multiple
languages, that I think Debbie is already familiar with, is:

 

ETSI ES 202 076 "Human Factors (HF); User Interfaces; Generic spoken command
vocabulary for ICT devices and services":
http://www.etsi.org/deliver/etsi_es/202000_202099/202076/02.01.01_60/es_202076v020101p.pdf

 

I've always been a little puzzled by Debbie's suggestion that ES 202 076
contains a mass of commands that users won't be able to learn. The commands
were the words that large samples of people said they would use to elicit a
particular function (for each of the 30 languages covered in the standard).
The user requirements section said that "a spoken command vocabulary should
be intuitive, easy to learn, memorable, natural, and unambiguous" and this
set (which I was not involved in developing) seems to me to largely meet
that goal. 

 

So we have such difficult to remember commands as "yes" (or alternatively
"confirm"), "no", the digits 0 to 9 (with, for example, the alternatives of
"zero" or "oh" being acceptable English commands for 0); "record" to record
something, "stop" to stop something, "start" to start something, "help" if
you are after help, "goodbye" or "exit" to exit a service (hanging up the
phone also works here :-)), etc. These seem to be the things that most people
would naturally say first and every time, but even if they perversely said
something else they would probably say these commands on subsequent
attempts.

 

There are some commands for telephony functions that I suspect might be
problematic, but that is largely because most people have no idea how
telephone networks work and therefore do not understand the underlying
concepts (like diverting and forwarding functions) that are translated into
commands. I feel that there are a few other commands that might be less
intuitive, so these might have to be learnt, but I feel that these are only
a small minority.

 

This standard is now seven years old and modern systems like Siri, Cortana
and Google Now offer a much more robust understanding of user input.
However, I'd be pretty certain that these systems work best when they hear
clear and unambiguous commands like those in ES 202 076, and it would do no
harm to require all systems to recognise and appropriately respond to these
commands in the way described in ES 202 076!

 

Best regards

 

Mike
Received on Tuesday, 26 April 2016 15:57:27 UTC
