Re: issues with global attribute "accesskey" as-is in HTML5 from Charles McCathieNevile on 2010-07-29 (public-html-a11y@w3.org from July 2010)

From: Charles McCathieNevile <chaals@opera.com>
Date: Thu, 29 Jul 2010 14:10:08 +0200
To: public-html-a11y@w3.org, "Gregory J. Rosmaita" <oedipus@hicom.net>
Message-ID: <op.vglte6dywxe0ny@widsith.local>
On Thu, 22 Jul 2010 15:35:19 +0200, Gregory J. Rosmaita  
<oedipus@hicom.net> wrote:

...
> so this post is a discussion of the global attribute "accesskey" as it
> currently appears in the HTML5 draft:
>
> http://dev.w3.org/html5/spec/editing.html#the-accesskey-attribute
>
> ISSUE 1. is accesskey irretrievably broken as an attribute?
>
> is an attribute (@accesskey) sufficient to provide the needed
> functionality or should the element approach outlined in the Access
> Module/Element (which has the advantage of having been vetted and
> modified due to feedback provided by the WAI, UAAG and SVG WGs) be
> championed by the A11y TF?

An attribute, and useful JavaScript interfaces, and decent user agent
implementation, are sufficient. While an access element could be a useful  
shorthand, the current version has several real problems, and doesn't add  
much functionality while adding increased complexity for implementation.  
As well as building on something that has been around for a decade, is
established in toolchains and documentation, and is already well
understood, the attribute can be modified in backwards-compatible ways to
allow for vastly improved functionality.

In any event, most of the thinking that needs to be done here is not at  
the level of the markup language, but in how to make sensible  
implementations in user agents. HTML4, HTML5, and the access element  
drafts all parrot the same "use the key value plus modifiers like alt or  
command" as essentially their only advice on implementation. This was  
clearly the biggest mistake in the HTML4 specification of accesskey, and  
following it was what led to people suggesting that the whole thing should  
be dropped. Which makes me believe that much of the necessary thinking  
still remains to be done. (In my infinite modesty, I provide below a guide  
to what the necessary thinking will produce. Based on the well-known  
principle that all right-thinking people agree with me ;) ).

> further discussion at:
> * http://www.w3.org/WAI/PF/HTML/wiki/Talk:Access/access_key_requirements
>
> ISSUE 2. Valid Accesskey Values
>
> in reference to accesskey values, the current draft of HTML5 states:
>
> QUOTE
> value must be an ordered set of unique space-separated tokens, each of
> which must be exactly one Unicode code point in length.
> UNQUOTE
>
> PROBLEM 2.1. accesskeys MUST be drawn from the charset used to render
> the natural language of the document in which they appear; an accesskey
> MUST be defined as a single character one Unicode code point in length
> from the document character set.

I don't think this requirement makes sense. I read a fair amount of
Russian, but my keyboard is by default in Latin. Likewise, some of my
colleagues read a lot of English, with a default Norwegian or Spanish or
Russian keyboard. Being able to suggest a Russian and a Latin letter as
proposed keys is IMHO a valuable feature.

By contrast, I don't see the value in requiring a single character value.  
In particular for voice-driven systems, it makes sense to be able to  
suggest an entire word. Given that the values of accesskey are only a  
suggestion from the author, I see no reason not to allow this.
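
To make the quoted rule concrete, here is a minimal sketch of the check the
current draft implies: tokens separated by whitespace, no repeats, each
exactly one code point long. The function names are invented for this
illustration, and splitting on any whitespace is a simplification of the
draft's definition of space characters.

```python
def parse_accesskey(value):
    """Split an accesskey value into tokens, keeping order and dropping repeats."""
    tokens = []
    for token in value.split():
        if token not in tokens:
            tokens.append(token)
    return tokens


def valid_in_current_draft(value):
    """True if the tokens are unique and each is exactly one code point long."""
    raw = value.split()
    # len() on a Python 3 string counts Unicode code points.
    return len(set(raw)) == len(raw) and all(len(token) == 1 for token in raw)


# A Latin "s" plus a Cyrillic "с" are two different code points, so an author
# can suggest one key per keyboard layout.
print(valid_in_current_draft("s с"))
# A whole word is longer than one code point, so the current draft rejects it.
print(valid_in_current_draft("s search"))
```

Under this reading a mixed Latin/Cyrillic suggestion is already allowed,
while a word-length token is not.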

> ISSUE 3. Psuedo-Cascade of Multiple Accesskeys Definable for an  
> Individual Element
...
> PROBLEM 3.1. cascade order is a very "weak" rather than a strong binding
> -- how does the user know what accesskey to use when multiple accesskeys
> are assigned to an individual element?

The author cannot know what activation behaviour (be it a key combination,  
a mouse gesture, or something else) will actually be assigned by the user  
agent.

In the HTML5 draft, the user agent assigns a JavaScript attribute
"accessKeyLabel" which can be used by a script to explain what is actually  
required. Otherwise the user agent must make it clear what the activations  
are. Precisely how this happens naturally depends on the user agent - it  
makes no sense to try and specify how a voice-interaction, a large  
multiple-pointer touch screen and a small-screen browser with limited  
keyboard implement their user interfaces, beyond requirements from UAAG  
that they actually be accessible.

> PROBLEM 3.2. "limited group of characters" -- there are a very finite
> number of characters that one can use as an accesskey; is the cascade of
> keys set using a space delimited list global?  (that is, does every first
> item listed belong to accesskey-scheme A, the second to
> accesskey-scheme-B, etc.

According to the current draft (you have to go through the algorithm to  
figure out what is supposed to happen :( ), there is no concept of a  
scheme - for each individual accesskey the user agent assigns the first  
value in the list that is feasible. See below for further discussion of  
this...
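
A rough sketch of that per-element behaviour, assuming a hypothetical
device on which only the digit keys are free for page shortcuts. The
function name, the "available" set and the sample page are all made up for
illustration, and the sketch ignores collisions between elements.

```python
def assign_key(suggestions, available):
    """Return the first suggested token the user agent can use, else None."""
    for token in suggestions.split():
        if token in available:
            return token
    return None  # the user agent would fall back to a choice of its own


# Hypothetical device: only digit keys are usable as shortcuts.
available = set("0123456789")

# Hypothetical page: each entry is an element and its accesskey value.
page = {"search": "s 1", "home": "0 h", "help": "? x"}

# Each element is resolved on its own; list position carries no shared meaning.
assigned = {name: assign_key(value, available) for name, value in page.items()}
print(assigned)
```

Here "search" ends up with its second suggestion and "home" with its
first, which is why the position in the list cannot be read as belonging
to a scheme.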

> ISSUE 4. HTML5 hard-binds "Action" to accesskey key-press:
>
> QUOTE  
> src="http://dev.w3.org/html5/spec/commands.html#command-facet-action"
>
> When the user presses the key combination corresponding to the assigned
> access key for an element, if the element defines a command, and the
> command's Hidden State facet is false (visible), and the command's
> Disabled State facet is also false (enabled), then the user agent must
> trigger the Action of the command.
> UNQUOTE
>
> PROBLEM 4.1. this is at variance with the Access Module/Element's
> architecture, which provides a boolean attribute "activate" which allows
> an author to set the AccessKeyPress to either move focus to the object
> for which the accesskey is defined (activate="no") or whether the
> AccessKeyPress results in the activation of the element to which the
> accesskey has been assigned (activate="yes")

Sure. I don't see that the author should be specifying whether to focus or  
activate anyway. Among other things, it becomes too easy to introduce
click-jacking type problems in most of the existing implementations.

> PROBLEM 4.2. the Access Module/Element's "activate" attribute also
> provides the following, which is lacking in the HTML5 draft:
>
> QUOTE src="http://www.w3.org/TR/xhtml-access/#sec_3.1.1."
> User agents MUST provide mechanisms for overriding the author setting
> with user-specified settings in order to ensure that the act of moving
> content focus does not cause the user agent to take any further action
> (as per Checkpoint 9.5 of UAAG 1.0)
> UNQUOTE

Indeed. This is a user agent requirement, and belongs in UAAG. The HTML 5  
draft could refer informatively to that advice, and/or repeat it. But the  
decision to activate or focus should be made in the user agent  
implementation, with user-override, not in the HTML specification and not  
by the content author.
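
As a sketch of that division of responsibility: the user agent carries its
own defaults, the user can override them, and nothing comes from the
markup. The element kinds and default behaviours below are invented for
the example and are not taken from UAAG or the HTML5 draft.

```python
# Hypothetical user agent defaults: follow links and press buttons, but only
# move focus to anything that takes input.
UA_DEFAULTS = {"link": "activate", "button": "activate", "text-input": "focus"}


def access_key_behaviour(kind, user_overrides=None):
    """Return "focus" or "activate" for an element kind; the user's setting wins."""
    overrides = user_overrides or {}
    if kind in overrides:
        return overrides[kind]
    # Unknown kinds only receive focus, the more conservative choice.
    return UA_DEFAULTS.get(kind, "focus")


print(access_key_behaviour("button"))
print(access_key_behaviour("button", {"button": "focus"}))
```

Note that the function takes no author-supplied argument at all, which is
the point being argued.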

There are some more basic issues with the current draft:

The HTML5 draft effectively legitimises existing usage of the accesskey  
attribute (which is reasonable), and the "dominant paradigm" of  
implementation - use the value plus some arbitrary modifiers - which is  
somewhere between unreasonable and stupid, depending on the user agent  
design.

It also completely fails to anticipate the real range of devices that  
people use.

Gestures, rather than keyboard interactions, are an increasingly common way
to interact with a browser, covering hundreds of millions of users today.
Opera's desktop browser has had this functionality for about a zillion
years, there is a Firefox extension, the Wii and the iPhone don't have a
viable alternative, and Apple laptops are starting to introduce it.

Likewise voice interaction, while less common, is hardly unknown and has  
been around for years.

Instead of assuming a keyboard-based interaction, and simply giving up if  
it doesn't exist, the draft should recognise the presence of an accesskey  
attribute as an indicator that a particular element has (according to the  
author) relatively high priority for user interaction. The key function of  
the attribute is to note particular elements, not to suggest a particular  
interaction. The current draft doesn't offer anything much that is useful  
for voice- or gesture-based interfaces, and that really is a problem.

Instead of being primarily about assigning some key combination, HTML5  
should specify accesskey primarily as denoting an element which requires a  
more rapid access than normally available - be that through a shortcut  
key, a distinct gesture, or a modal interaction such as a menu.

The question of precisely what interaction is assigned is secondary. But  
there is a lot of value in making it predictable for authors, although it  
must eventually be controlled by the user agent and controllable by the  
user, e.g. to allow for a compound user agent such as a browser combined
with some assistive technology, or a configuration adapted to a particular  
user whose "q" and "a" keys stopped working when they spilt a beer in the  
keyboard, or whatever.

The accesskey attribute does not exist in a conceptual vacuum. User agents  
already define interaction behaviours for various types of element, and  
for elements with particular attributes or attribute values: role,  
tabindex, onclick, rel are some such attributes. The important difference  
is that there is a predefined set of elements/attributes with agreed
meanings, and it makes sense for the user agent to simply design an  
interface that allows the user to interact with them. The development of  
such interfaces is a key point of differentiation between user agents, and  
we should not attempt to prescribe it too closely (lest we repeat the  
mistakes of early HTML4 accesskey implementations on desktop, if for no  
other reason).

Authors should use accesskey in cases where there is no predefined and  
agreed semantic label for the interaction they want to create, or where  
they want to explicitly prioritise some kinds of interaction higher than  
user agents normally do. Use cases include navigational landmarks (e.g.  
key sections of a form), game controls (which have probably never been  
done because accesskey implementations are almost universally horrible  
except on mobile phone microbrowsers incapable of running a game in the  
first place), and adapting content to a relatively small "viewing space",  
whether a small screen or window, a linear representation such as given by  
a screen reader or voice browser, or something I haven't thought of yet.

Authors should suggest a memorable interaction behaviour. If something is  
high priority for interaction, it makes sense that the interaction is  
relatively easy to achieve, which includes remembering how to do it. Since  
the things which should get accesskeys are not common enough to be agreed  
already, presumably the author has an idea of what they are and how to  
communicate them which cannot be automatically replicated by some  
algorithm or heuristic.

The HTML5 spec should provide a clear idea of what user agents might do  
with an accesskey attribute. Right now it does that to some extent, for  
keyboard interaction. It should be generalised, to be relevant to  
non-keyboard interactions - something that it sort of does with the  
fallback step in the algorithm, and in allowing for "other ways" of  
exposing the element. I believe that we can do better.

One concrete improvement would be to allow for words as well as single  
characters. These are most likely to be used in voice interactions (where  
a single letter is pretty limiting), but could also be used in a menu, to  
suggest gestures, or keys such as the soft-keys on a phone, the function  
keys on a standard keyboard, or combinations including both shift-4 and  
emacs-style ctrl-f,ctrl-x sequences, bearing in mind that in all cases  
these are *suggestions* from the author, and the user agent is not bound  
to apply any one of them.
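
One way a user agent might use such mixed suggestions, as a sketch only:
prefer word-length tokens for voice and single characters otherwise,
falling back to whatever the author did provide. The modality names and
the preference order are assumptions made for this example; the current
draft allows only single-character tokens.

```python
def pick_suggestion(value, modality):
    """Pick the author suggestion best suited to a modality, or None if empty."""
    tokens = value.split()
    words = [token for token in tokens if len(token) > 1]
    chars = [token for token in tokens if len(token) == 1]
    # Voice input favours whole words; everything else favours single keys.
    ordered = (words, chars) if modality == "voice" else (chars, words)
    for group in ordered:
        if group:
            return group[0]
    return None  # nothing suggested; the user agent chooses for itself


print(pick_suggestion("s search", "voice"))
print(pick_suggestion("s search", "keyboard"))
```

The result is still only a starting point: the user agent remains free to
ignore it.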

User agents should provide *some* form of rapid access to things with an  
accesskey - whether that be through a special menu, a key combination, or
some other interaction. Likewise, user agents should ensure that the  
interaction doesn't interfere with the existing interface, and that users  
can discover what it is.

One potential benefit to getting this right would be dealing with the  
games use case, and more generally the zillions of stupid JavaScript
keystroke sniffers, which are terrible for device independence,
localisation, and so on. This would involve making it possible to use
accesskeys continually (something that iCab sort of did, and many
phone-oriented microbrowsers did) which would make them an attractive
mechanism to developers.

[sending this because I am allegedly on vacation, and have limited  
connectivity. I am sure there is more to add, but it will have to wait  
until next week]

cheers

Chaals

-- 
Charles McCathieNevile  Opera Software, Standards Group
     je parle français -- hablo español -- jeg lærer norsk
http://my.opera.com/chaals       Try Opera: http://www.opera.com
Received on Thursday, 29 July 2010 12:11:17 UTC