A user agent is a set of modules that retrieves Web resources, renders
them, allows control of those processes, and communicates with other software.
An assistive technology is a user agent that (1) relies on another user agent
to provide some services and (2) provides services beyond those offered by the
"host user agent" to meet the requirements of a target
audience. This document examines the User Agent Guidelines
Working Group's rationale for establishing which user agent functionalities
must be supported natively by general purpose user agents and which are
expected to be supported by assistive technologies.
Status of this Document
This document does not represent consensus of the User Agent
Accessibility Guidelines Working Group. As of the date at the top of the
document, it only represents the musings of Ian Jacobs, who hopes it will serve as a
support document for the User Agent Accessibility Guidelines as they advance on
the Recommendation track.
The User Agent Accessibility
Guidelines include two types of requirements for general purpose user agents:
- Requirements for native implementation of some functionalities (i.e., the
user agent implements them without requiring additional software other than the
- Communication through Application Programming Interfaces (APIs). The
Guidelines require user agents to allow read and write ?audio and video
too? access to both content and user interface controls.
The second set of requirements allows assistive technologies (ATs) to offer
missing functionalities not offered natively. Since the Guidelines require that
ATs have access to both content and UI controls, in theory, general-purpose
user agents have to implement natively few functionalities related to
accessibility since ATs can fill in the gaps. They might even do a better job
since developers of specialized tools know their target audience well.
These specialized ATs are commonly referred to as plug-ins.
The Working Group has decided that general-purpose user agents must
implement some important functionalities natively rather than relying on
assistive technologies to shoulder the load. One important reason for this
decision is that some users do not have access to or cannot afford specialized
browsers, so general-purpose user agents must themselves be accessible.
This document explains which requirements the User Agent
Guidelines Working Group has chosen for general purpose user agents to
implement natively and why.
A user agent is a set of modules that retrieves Web resources, renders them,
allows the user to control those processes, and communicates
with other software. For instance, a graphical desktop browser might consist of:
- A parser and a tree processing engine
- One or more rendering engines that, given a tree and style parameters,
create rendering structures.
- A user interface for providing access to content. This includes:
- Navigation and search mechanisms, which allow the user to access content
other than sequentially.
- Orientation mechanisms such as proportional scroll bars, highlighting of
viewports, selection and focus, etc.
- A user interface for configuring the browser, including parameters of the
rendering engine(s), processing information such as natural language
preferences, and the user interface itself (e.g., position of buttons in the
- A user interface for facilitating browsing (history mechanism, bookmark
- Other user interface features (e.g., refresh the screen, reload the
document, send to the printer, etc.)
- Interpreters for scripting languages.
- APIs for communication with plug-ins.
- Interfaces (e.g., for HTTP, for DOM, for file I/O including document
loading and printing, communication with the operating system, etc.)
Note that there are areas where content and user interface mingle, including:
- Form controls
- Links and their styling
- Keyboard and other input configurations provided by the author or
overridden by the user
- Adoption of markup languages for implementation of UI controls (as is done,
e.g., by IE using HTML and as is done by Mozilla by using XML and the DOM for
For simplicity, I will consider for now that the UI refers to the UA's
components, not those contributed by Web content.
An assistive technology (AT) is a user agent that (1) relies on
another ?is it always the host, or can it be another AT? user
agent to provide some services and (2) provides services beyond those offered
by the "host user agent" to meet the requirements of a target
audience. Additional services might include:
- Read access to the document tree would allow application of different
rendering engines (e.g., speech output).
- Write access to the document tree would allow completion of forms through,
say, voice input ?do you expect voice writing there, or writing after
- Read access to the UI would allow an assistive technology to know which
viewport the user has selected, user agent configuration settings, etc.
- Write access to the UI would allow users to navigate viewports (i.e.,
change the current viewport) through speech input ?commands from a
limited vocabulary, and possibly including speaker voice training, speech
recognition, and speech-to-text for narrative "do what I mean"
free-form command issuing?.
- Content transformation tools ?such as XSLT, ACSS, or
- Additional navigation mechanisms
- Additional orientation mechanisms
An assistive technology may not ?need to, or is prohibited
from? parse document source, for example, but probably has to include
tree processing ?(DOM, or proprietary to user agent)? capabilities
in order to offer additional functionalities.
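As a rough illustration of the read and write access described above, the
following sketch uses Python's standard xml.dom.minidom module as a stand-in
for the document tree that a host user agent might expose. The sample markup
and the helper names text_of, speak_links, and fill_field are assumptions made
for this example; nothing here is prescribed by the Guidelines.

```python
from xml.dom.minidom import parseString

# Hypothetical markup standing in for the tree a host user agent would expose.
SOURCE = """<html><body>
<p>See the <a href="guidelines.html">Guidelines</a> and
<a href="techniques.html">Techniques</a>.</p>
<form><input name="email" value=""/></form>
</body></html>"""


def text_of(node):
    """Concatenate the text content found below a DOM node."""
    parts = []
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            parts.append(child.data)
        else:
            parts.append(text_of(child))
    return "".join(parts)


def speak_links(document):
    """Read access: build the phrases a speech renderer might say."""
    return [
        f"link {number}: {text_of(anchor).strip()}"
        for number, anchor in enumerate(document.getElementsByTagName("a"), start=1)
    ]


def fill_field(document, name, value):
    """Write access: set a form control's value, e.g. from voice input."""
    for control in document.getElementsByTagName("input"):
        if control.getAttribute("name") == name:
            control.setAttribute("value", value)
            return True
    return False  # no control with that name


document = parseString(SOURCE)
phrases = speak_links(document)
filled = fill_field(document, "email", "user@example.org")
```

Note that the assistive technology in this sketch never parses document
source itself; it only processes the tree handed to it, which is the division
of labor suggested above.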
Suggest that the AT be able to establish, access, augment, and modify
a separate profile for each user that contains user preferences, including user
interface controls, keyboard bindings, presentation rendering choices, and
possibly style sheets.
The following sections describe some of the factors that have affected the
decision about which functionalities should be supported natively by general
purpose user agents.
Some general-purpose user agents already provide useful functionalities such
as allowing users to navigate links through the keyboard. Assistive
technologies read the focus and speak or render as braille the link text. One
might argue that links are so fundamental to browsing the Web that it makes
sense to require navigation of these links to be a native functionality. I
believe, however, that the current requirement is in part a perpetuation of existing
practice. Lynx offers direct access to links by number, not just sequential
access, and this has shown itself to be a useful time-saving functionality.
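The difference between sequential access and direct access by number can be
sketched as follows. The LinkNavigator class and its method names are
hypothetical and are not taken from Lynx or from any other user agent; the
sketch only assumes that links are kept in document order and numbered from 1.

```python
class LinkNavigator:
    """Hypothetical focus model over the links of a page, in document order."""

    def __init__(self, links):
        self.links = list(links)
        self.index = -1  # no link has the focus yet

    def next(self):
        """Sequential access: move the focus to the following link, wrapping."""
        if not self.links:
            return None
        self.index = (self.index + 1) % len(self.links)
        return self.links[self.index]

    def go_to(self, number):
        """Direct access: move the focus to the link with this 1-based number."""
        if not 1 <= number <= len(self.links):
            raise IndexError(f"no link numbered {number}")
        self.index = number - 1
        return self.links[self.index]


navigator = LinkNavigator(["Home", "Guidelines", "Techniques", "Contact"])
first = navigator.next()      # one step of sequential access
last = navigator.go_to(4)     # a single command instead of three more steps
wrapped = navigator.next()    # sequential access wraps to the first link
```

On a page with many links, go_to replaces a long run of next commands, which
is the time saving mentioned above.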
The existence of a platform- and programming-language independent way to
access content means that it's more understandable to ask AT developers to
provide some functionalities. Note, however, that the lack of a standard for
exposing the DOM may hinder adoption. Also, since assistive technology products
are usually designed to work with software other than user agents, requiring
them to implement a UA-specific interface may be considered burdensome.
Finally, there is not yet a platform-independent API for accessing user
No minimal functional requirement obvious
The WG attempted to identify "minimal requirements" a user agent
would have to satisfy to be considered accessible (at times, the bar is quite
high in fact). For some functionalities, minimal requirements were difficult or
impossible to identify, and therefore the WG chose either:
- To make a general requirement and leave the specifics to the Techniques
- To make no requirement and leave the job to assistive technologies.
One example is table navigation. Access to table cells and
cell context (headers, neighboring cells, etc.) is very important to users and
there is a Priority 1 requirement that such information be made available to
users. However, navigation of table cells is just one (admittedly useful) means
to achieve the goals of access to content and orientation. Two problems present themselves:
- Requiring navigation through two-dimensional space (up, down, left, right)
frames the functionality in terms of the graphical artifact. For non-sighted
users or users with motor disabilities, requiring navigation through
two-dimensional space may not be an efficient way to access the information.
- Which navigation methods suffice? Sequential access alone? For large tables
(say 500x500 cells), this would surely be tedious and therefore direct access
(by row/column position) would be more effective. In addition, it would be
useful to be able to shrink or expand parts of the table, search on table
contents, identify all cells (possibly in N-dimensional space) under a
particular header, and all headers for a particular cell, etc. In short, the WG
recognizes the importance of navigation as a technique for making tables
accessible, but has not been able to identify a minimal requirement for
general-purpose user agents.
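To make the range of candidate table functionalities more concrete, here is a
minimal sketch of direct access, header lookup, and search over a small table.
The Table class, its method names, and the convention that the first row holds
column headers and the first column holds row headers are all assumptions made
for this example; real tables may be far less regular.

```python
class Table:
    """Hypothetical table model: row 0 holds column headers, column 0 row headers."""

    def __init__(self, rows):
        self.rows = rows

    def cell(self, row, col):
        """Direct access by row/column position."""
        return self.rows[row][col]

    def headers_for(self, row, col):
        """All headers for a particular cell: (row header, column header)."""
        return (self.rows[row][0], self.rows[0][col])

    def cells_under(self, column_header):
        """All data cells under a particular column header."""
        col = self.rows[0].index(column_header)
        return [r[col] for r in self.rows[1:]]

    def search(self, text):
        """Positions of every cell whose content contains the text."""
        return [
            (r, c)
            for r, row in enumerate(self.rows)
            for c, value in enumerate(row)
            if text in value
        ]


table = Table([
    ["City", "1998", "1999"],
    ["Boston", "12", "15"],
    ["Paris", "20", "22"],
])
value = table.cell(2, 1)
context = table.headers_for(2, 1)
column = table.cells_under("1999")
found = table.search("22")
```

Each of these four operations is useful, and none is obviously the minimum,
which is the difficulty the WG encountered.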
Likelihood of implementation
The requirements of the Guidelines are not independent of considerations of
implementability or cost. The Techniques Document represents the WG's efforts
at showing how each requirement may be implemented. However, the WG may have
chosen not to make certain requirements either because it seemed
"unreasonable" to ask desktop browsers to implement the functionality
or because the likelihood of implementation and conformance seemed low.
The Working Group has endeavored to incorporate feedback from users with
disabilities and experts in different fields related to accessibility about
important requirements for these guidelines.
The following review is based on the
20 December 1999
In order to provide rationale for requiring native support by general
purpose user agents of certain functionalities, I've grouped them by theme.
This grouping makes it relatively easy to understand why most of the
checkpoints require native support in general purpose user agents for the
functionalities in question. The themes are:
- The requirements of these checkpoints are applicable to
all user agents.
- The requirements of these checkpoints refer to content rendered natively by the user agent.
- The requirements of these checkpoints pertain to communication with assistive technologies and thus were
designed specifically for general purpose user agents.
- The requirements of these checkpoints are not
readily assignable to a particular class of user agent.
- The requirements of these checkpoints were considered to be
the responsibility of assistive technologies by the Working Group.
All user agents ?why do all user agents need color, per 4.13 and
4.14? should meet these requirements, although how they are met will
depend on the type of user agent. These requirements concern device
independence, the native user interface and "to"should be
Why the order below? If not the original checkpoint order, I propose
reordering, as I have (just bold italic of the checkpoint part).
2.1 Ensure that the user has access to all content, including
alternative equivalent representations for content.
6.1 Implement the accessibility features of supported specifications
(markup languages, style sheet languages, metadata languages, graphics formats,
11.1 Provide a version of the product documentation that conforms to
the Web Content Accessibility Guidelines.
11.2 Document all user agent features that promote accessibility.
11.3 Document the default input configuration (e.g., default keyboard
11.4 In a dedicated section ?of what: help, installation,
alternative to paper form of manual?, document in a
self-sufficient manner all features of the user agent that promote
accessibility. ?Presumably 11.1, 11.2, 11.3 apply in context where the
common use of a feature is described, and 11.4 to a collection of those
features, adequately described to stand alone?
8.4 Provide a mechanism for highlighting and identifying
(through a standard interface where available) the current viewport, selection,
and content focus. ?this seems redundant with the following, which add
user control. Should they be combined?
4.13 Allow the user to control how the selection is highlighted (e.g.,
foreground and background color).
4.14 Allow the user to control how the content focus is highlighted
(e.g., foreground and background color).
5.3 Implement selection, content focus, and user interface focus
mechanisms and make them available to users and through APIs.
1.3 Ensure that the user can interact with all active elements
in a device-independent manner.
7.1 Allow the user to navigate viewports (including frames).
4.15 Allow the user to control user agent-initiated spawned
7.2 For user agents that offer a browsing history mechanism, when the
user returns to a previous viewport, restore the point of regard in the
5.6 Follow operating system conventions and accessibility settings. In
particular, follow conventions for user interface design, default keyboard
configuration, product installation, and documentation.
10.6 Allow the user to configure the user agent in named profiles that
may be shared on systems with distinct user accounts or shared by the
same user portably across systems with the same operating system ?or do we
propose such profiles could be made independent of operating system?.
10.4 Use operating system conventions to indicate the input
10.5 Avoid default input configurations that interfere with operating
8.8 Provide a mechanism for highlighting and identifying (through a
standard interface where available) active elements.
9.5 When loading content (e.g., document, video clip, audio clip,
etc.) indicate what portion of the content has loaded and whether loading has
9.6 Indicate the relative position of the viewport in content (e.g.,
the percentage of an audio or video clip that has been played, the percentage
of a Web page that has been viewed, etc.).
8.9 Maintain consistent user agent behavior and default configurations
between software releases. Consistency is less important than accessibility and
adoption of operating system conventions.
10.7 Provide default input configurations for frequently performed
It makes sense for user agents to provide native support for content
2.2 For presentations that require user interaction within a specified
time interval, allow the user to control the time interval (e.g., by allowing
the user to pause and restart the presentation, to slow it down, etc.).
2.6 Allow the user to specify that captions and auditory descriptions
be rendered at the same time as the associated auditory and visual tracks.
This may require close synchronization, and coordinated pause and backup
as partially described in 4.5.
4.5 Allow the user to slow the presentation rate of audio,
video, and animations.
4.6 Allow the user to start, stop, pause, advance, and rewind
audio, video, and animations.
3.8 Allow the user to turn on and off rendering of images.
3.1 Allow the user to turn on and off rendering of background images.
3.2 Allow the user to turn on and off rendering of background audio.
3.3 Allow the user to turn on and off rendering of video.
3.4 Allow the user to turn on and off rendering of audio.
3.5 Allow the user to turn on and off animated or blinking text.
3.6 Allow the user to turn on and off animations and blinking images.
3.7 Allow the user to turn on and off support for scripts and applets.
4.1 Allow the user to control font family.
4.2 Allow the user to control the size of text.
4.3 Allow the user to control foreground color.
4.4 Allow the user to control background color.
4.8 Allow the user to control the position of captions on graphical
4.7 Allow the user to control the audio volume.
4.9 Allow the user to control synthesized speech playback
4.10 Allow the user to control synthesized speech volume.
4.11 Allow the user to control synthesized speech pitch,
gender, and other articulation characteristics.
4.12 Allow the user to select from available author and user style
sheets or ignore them.
2.5 If more than one alternative equivalent is available for content,
allow the user to choose from among the alternatives. This includes the choice
of viewing no alternatives. ?why "no alternatives"? If that is
appropriate, then how does that apply if there are no alternatives to start
with? Are you suggesting that the alternative equivalents are such as
attributes alt="..." longdesc="..." or the "d"
convention? If so, so state.
2.3 When no text equivalent has been supplied for an object, make
available author-supplied information to help identify the object (e.g., object
type, file name, etc.).
2.4 When a text equivalent for content is explicitly empty (i.e., an
empty string), render nothing.
2.7 For author-identified but unsupported natural languages, allow the
user to request notification of language changes in content.
These requirements were designed specifically for general purpose user
agents to ensure interoperability. They may also apply to user agents in
1.4 Ensure that every functionality offered through the user interface
is available through the standard keyboard API.
1.1 Ensure that every functionality offered through the user interface
is available through every input device API used by the user agent. User agents
are not required to reimplement low-level functionalities (e.g., for character
input or pointer motion) that are inherently bound to a particular API and most
naturally accomplished with that API.
1.2 Use the standard input and output device APIs of the operating
1.5 Ensure that all messages to the user (e.g., informational
messages, warnings, errors, etc.) are available through all output device APIs
used by the user agent. Do not bypass the standard output APIs when rendering
information (e.g., for reasons of speed, efficiency, etc.).
5.1 Provide programmatic read and write ?write -- doesn't this
require that every user agent become an authoring tool? access to
content, attribute values, and structure by conforming to W3C
Document Object Model specifications.
5.2 Provide programmatic read and write access to user agent user
interface controls using standard APIs (e.g., platform-independent APIs,
standard APIs for the operating system, and conventions for programming
languages, plug-ins, virtual machine environments, etc.)
5.4 Provide programmatic notification of changes to content and user
interface controls (including selection, content focus, and user interface
9.1 Provide information about user agent-initiated content and
viewport changes through the user interface and through APIs
9.4 Allow the user to configure notification preferences for common
types of content and viewport changes.
9.2 Ensure that when the selection or content focus changes, it is in
a viewport after the change.
9.3 Prompt the user to confirm any form submission triggered
indirectly, that is by any means other than the user activating an explicit
form submit control.
5.5 Ensure that programmatic exchanges proceed in a timely manner.
10.1 Provide information directly to the user and through APIs about
current user preferences for input configurations (e.g., keyboard or voice
10.2 Provide information directly to the user and through APIs about
current author-specified input configurations (e.g., keyboard bindings
specified in content such as by "accesskey" in HTML 4.0) and
which ones are overridden by user preferences.
These checkpoints are not readily assignable to a particular class of user agent.
The Working Group has generally considered navigation a technique
for providing access to content and context. People would probably agree that
without adequate navigation, access to content and context may be so slow as to
make the content unusable. However, there has not been agreement as to what
minimal navigation requirements (if any) should be made of general purpose user
agents. Below, some rationale is offered.
Include here definition we use of active elements.
7.3 Allow the user to navigate all active elements.
- Links are so important to the Web that general purpose user agents must
provide native support for navigation of them.
- Links and form controls add user interface to a page, thus it makes sense
that the user agent provide native support for this "imported" user
interface. But why limit "active elements" to links and form controls
and not tables, for example? Why links and form controls a priori?
7.4 Allow the user to navigate just among all active elements.
- This is just a special case of 7.3 and so once 7.3 is settled, this one
7.5 Allow the user to search for rendered text content, including text
equivalents of visual and auditory content.
- Most user agents do this anyway (except for the text equivalents,
i.e., the alt content part). ?Should that be a user preference to
search/not search such text equivalents?
- Searching might be considered a minimal form of navigation.
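A search that covers text equivalents, with the user preference raised in the
comment above, might look like the following sketch. The Item record, the
search function, and the include_equivalents flag are hypothetical names
introduced for this example only.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical unit of rendered content."""
    text: str = ""        # text rendered directly
    equivalent: str = ""  # text equivalent of visual or auditory content


def search(items, query, include_equivalents=True):
    """Return the positions of items matching the query, ignoring case."""
    query = query.lower()
    hits = []
    for position, item in enumerate(items):
        candidates = [item.text]
        if include_equivalents:
            candidates.append(item.equivalent)
        if any(query in candidate.lower() for candidate in candidates):
            hits.append(position)
    return hits


items = [
    Item(text="W3C home page"),
    Item(equivalent="W3C logo"),  # e.g., an image with alt text
    Item(text="Contact"),
]
with_equivalents = search(items, "w3c")
without_equivalents = search(items, "w3c", include_equivalents=False)
```

With the preference turned off, the image in the example can no longer be
found by searching, which suggests why the checkpoint mentions text
equivalents explicitly.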
7.6 Allow the user to navigate according to structure.
- This checkpoint has been included as an umbrella checkpoint
because there are many
navigation possibilities. Rather than list all of them (per element type, by
tree structure, by element content, etc.), the WG put them all into a single,
Priority 2, intentionally ambiguous checkpoint.
7.7 Allow the user to configure structured navigation.
- This one follows 7.6
8.1 Convey the author-specified purpose of each table and the
relationships among the table cells and headers.
8.5 Provide an "outline" view of content, built from
structural elements (e.g., frames, headers, lists, forms, tables, etc.)
8.6 Allow the user to configure the outline view.
- This one follows 8.5.
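As an illustration of 8.5 and 8.6 together, the sketch below builds an outline
from a document tree and lets the caller choose which structural elements
appear in it. It uses Python's standard xml.dom.minidom module; the sample
markup, the outline function, and its names parameter are assumptions made for
this example rather than a technique endorsed by the WG.

```python
from xml.dom.minidom import parseString

# Hypothetical content; a real user agent would work on its own parsed tree.
SOURCE = """<body>
<h1>Guidelines</h1><p>Intro text.</p>
<h2>Checkpoints</h2><table><caption>Priorities</caption></table>
<h2>Conformance</h2><p>More text.</p>
</body>"""


def text_of(node):
    """Concatenate the text content found below a DOM node."""
    parts = []
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            parts.append(child.data)
        else:
            parts.append(text_of(child))
    return "".join(parts)


def outline(document, names=("h1", "h2", "h3", "table")):
    """Collect (element name, label) pairs, in document order, for chosen elements."""
    entries = []

    def visit(node):
        for child in node.childNodes:
            if child.nodeType != child.ELEMENT_NODE:
                continue
            if child.tagName in names:
                label = " ".join(text_of(child).split())  # normalize whitespace
                entries.append((child.tagName, label))
            visit(child)

    visit(document)
    return entries


document = parseString(SOURCE)
full_outline = outline(document)
headings_only = outline(document, names=("h1", "h2"))  # a user-configured view
```

The names parameter plays the role of the configuration in 8.6: the same tree
yields a different outline depending on which structural elements the user
asks for.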
These may apply to all user agents.
10.3 Allow the user to change and control the input configuration.
Allow the user to configure the user agent so that some functionalities may be
activated with a single command (e.g., single key, single voice command, etc.).
10.8 Allow the user to configure the arrangement of graphical user
agent user interface controls.
The Working Group has decided that the following requirements, once
checkpoints, belong to assistive technologies. These requirements are listed
in the Appendix of Assistive Technology Functionalities of the
1999 Techniques Document.
- Allow users to navigate up/down and among the cells of a table (e.g., by
using the focus to designate a selected table cell).
- Indicate the row and column dimensions of a selected table.
- Describe a selected element's position within larger structures (e.g.,
numerical or relative position in a document, table, list, etc.).
- Provide information about form structure and navigation (e.g., groups of
controls, control labels, navigation order, and keyboard configuration).
- Enable announcing of information regarding title, value, grouping, type,
status and position of specific focused elements.