User Agent Responsibilities

2000-01-05

This version (Harvey Bingham's proposed revisions, in bold italics): http://www.tiac.net/users/bingham/accessbl/ur000105.htm

Based on: http://www.w3.org/WAI/UA/1999/12/ua-resp-19991228

Editor: Ian Jacobs, W3C

Abstract

A user agent is a set of modules that retrieves Web resources, renders them, allows control of those processes, and communicates with other software. An assistive technology is a user agent that (1) relies on another user agent to provide some services and (2) provides services beyond those offered by the "host user agent" to meet the requirements of a target audience. This document examines the User Agent Guidelines Working Group's rationale for establishing which user agent functionalities must be supported natively by general purpose user agents and which are expected to be supported by assistive technologies.

Status of this Document

This document does not represent consensus of the User Agent Accessibility Guidelines Working Group. As of the date at the top of the document, it only represents the musings of Ian Jacobs, who hopes it will serve as a support document for the User Agent Accessibility Guidelines as they advance on the Recommendation track.

Introduction
What is a User Agent?
What is an Assistive Technology?
What functionalities must be provided by a general-purpose UA?
Review of specific requirements

Introduction

The User Agent Accessibility Guidelines include two types of requirements for general purpose user agents:

Requirements for native implementation of some functionalities (i.e., the user agent implements them without requiring additional software other than the operating system).
Communication through Application Programming Interfaces (APIs). The Guidelines require user agents to allow read and write ?audio and video too? access to both content and user interface (UI) controls.

The second set of requirements allows assistive technologies (ATs) to offer missing functionalities not offered natively. Since the Guidelines require that ATs have access to both content and UI controls, in theory, general-purpose user agents have to implement natively few functionalities related to accessibility since ATs can fill in the gaps. They might even do a better job since developers of specialized tools know their target audience well. These specialized ATs are commonly referred to as plug-ins.

The Working Group has decided that general-purpose user agents must implement some important functionalities natively rather than relying on assistive technologies to shoulder the load. One important reason for this decision is that some users do not have access to or cannot afford specialized browsers, so general-purpose user agents must themselves be accessible.

This document explains which requirements the User Agent Guidelines Working Group has chosen for general purpose user agents to implement natively and why.

What is a User Agent?

A user agent is a set of modules that retrieves Web resources, renders them, allows the user to control those processes, and communicates with other software. For instance, a graphical desktop browser might consist of:

A parser and a tree processing engine
One or more rendering engines that, given a tree and style parameters, create rendering structures.
A user interface for providing access to content. This includes:
- Viewports
- Navigation and search mechanisms, which allow the user to access content other than sequentially.
- Orientation mechanisms such as proportional scroll bars, highlighting of viewports, selection and focus, etc.
A user interface for configuring the browser, including parameters of the rendering engine(s), processing information such as natural language preferences, and the user interface itself (e.g., position of buttons in the GUI).
A user interface for facilitating browsing (history mechanism, bookmark mechanism, etc.)
Other user interface features (e.g., refresh the screen, reload the document, send to the printer, etc.)
Interpreters for scripting languages.
APIs for communication with plug-ins.
Interfaces (e.g., for HTTP, for DOM, for file i/o including document loading and printing, communication with the operating system, etc.)

Note that there are areas where content and user interface mingle, including:

Form controls
Links and their styling
Keyboard and other input configurations provided by the author or overridden by the user
Adoption of markup languages for implementation of UI controls (as is done, e.g., by IE using HTML and as is done by Mozilla by using XML and the DOM for UI controls).

For simplicity, I will consider for now that the UI refers to the UA's components, not those contributed by Web content.

What is an assistive technology?

An assistive technology (AT) is a user agent that (1) relies on another ?is it always the host, or can it be another AT? user agent to provide some services and (2) provides services beyond those offered by the "host user agent" to meet the requirements of a target audience. Additional services might include:

Read access to the document tree would allow application of different rendering engines (e.g., speech output).
Write access to the document tree would allow completion of forms through, say, voice input ?do you expect voice writing there, or writing after voice-to-text conversion?
Read access to the UI would allow an assistive technology to know which viewport the user has selected, user agent configuration settings, etc.
Write access to the UI would allow users to navigate viewports (i.e., change the current viewport) through speech input ?commands from a limited vocabulary, and possibly including speaker voice training, speech recognition, and speech-to-text for narrative "do what I mean" free-form command issuing?.
Content transformation tools ?such as XSLT, ACSS, or stylesheets?
Additional navigation mechanisms
Additional orientation mechanisms

An assistive technology may not ?need to, or is prohibited from? parse document source, for example, but probably has to include tree processing ?(DOM, or proprietary to user agent)? capabilities in order to offer additional functionalities.
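The read and write access described above can be sketched with the W3C DOM interfaces as exposed by Python's standard xml.dom.minidom module. This is only an illustration under stated assumptions: the sample markup, the control name "city", and the helper names text_of, link_texts, and fill_control are hypothetical, and a real assistive technology would obtain the tree from the host user agent rather than parse source itself.

```python
from xml.dom.minidom import parseString

# Hypothetical document tree, standing in for one exposed by a host user agent.
SOURCE = """<html><body>
<p>See the <a href="ua.html">User Agent Guidelines</a>.</p>
<form><input type="text" name="city" value=""/></form>
</body></html>"""

document = parseString(SOURCE)


def text_of(node):
    """Concatenate the text nodes found under a node."""
    parts = []
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            parts.append(child.data)
        elif child.nodeType == child.ELEMENT_NODE:
            parts.append(text_of(child))
    return "".join(parts)


def link_texts(doc):
    """Read access: gather link text that a speech renderer could speak."""
    return [text_of(a) for a in doc.getElementsByTagName("a")]


def fill_control(doc, name, value):
    """Write access: set a form control value, e.g. from recognized speech."""
    for control in doc.getElementsByTagName("input"):
        if control.getAttribute("name") == name:
            control.setAttribute("value", value)
            return True
    return False


spoken_links = link_texts(document)
filled = fill_control(document, "city", "Boston")
```

In this sketch the value written into the tree is already text; whether voice input is converted to text before being written is left open, as in the discussion above.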

Suggest that the AT be able to establish, access, augment, and modify a separate profile for each user that contains user preferences, including user interface controls, keyboard bindings, presentation rendering choices, and possibly style sheets.

What functionalities must be provided by a general-purpose UA?

The following sections describe some of the factors that have affected the decision about which functionalities should be supported natively by general purpose user agents.

Existing practice

Some general-purpose user agents already provide useful functionalities such as allowing users to navigate links through the keyboard. Assistive technologies read the focus and speak or render as braille the link text. One might argue that links are so fundamental to browsing the Web that it makes sense to require navigation of these links to be a native functionality, but I believe that the current requirement is in part a perpetuation of existing practice. Lynx offers direct access to links by number, not just sequential access, and this has shown itself to be a useful time-saving functionality.
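The difference between sequential and direct access to links can be illustrated with a short sketch. It is not a description of how Lynx is implemented; the markup and the names numbered_links and next_link are hypothetical, and links are simply numbered in document order.

```python
from xml.dom.minidom import parseString

# Hypothetical content with three links.
SOURCE = """<body>
<a href="intro.html">Introduction</a>
<a href="ua.html">User agents</a>
<a href="at.html">Assistive technologies</a>
</body>"""


def numbered_links(doc):
    """Direct access: map 1-based link numbers to (href, text) in document order."""
    table = {}
    for number, a in enumerate(doc.getElementsByTagName("a"), start=1):
        text = "".join(n.data for n in a.childNodes if n.nodeType == n.TEXT_NODE)
        table[number] = (a.getAttribute("href"), text)
    return table


def next_link(current, total):
    """Sequential access: step to the following link, wrapping at the end."""
    return current % total + 1


links = numbered_links(parseString(SOURCE))
```

With the number table, reaching the third link takes one lookup (links[3]) instead of repeated calls to next_link.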

The DOM

The existence of a platform- and programming-language-independent way to access content means that it's more understandable to ask AT developers to provide some functionalities. Note, however, that the lack of a standard for exposing the DOM may hinder adoption. Also, since assistive technology products are usually designed to work with software other than user agents, requiring them to implement a UA-specific interface may be considered burdensome. Finally, there is not yet a platform-independent API for accessing user interface controls.

No minimal functional requirement obvious

The WG attempted to identify "minimal requirements" a user agent would have to satisfy to be considered accessible (at times, the bar is quite high in fact). For some functionalities, minimal requirements were difficult or impossible to identify, and therefore the WG chose either:

To make a general requirement and leave the specifics to the Techniques Document, or
To make no requirement and leave the job to assistive technologies.

One example is table navigation. Access to table cells and cell context (headers, neighboring cells, etc.) is very important to users and there is a Priority 1 requirement that such information be made available to users. However, navigation of table cells is just one (admittedly useful) means to achieve the goals of access to content and orientation. Two problems present themselves:

Requiring navigation through N-dimensional space (up, down, left, right) frames the functionality in terms of the graphical artifact. For non-sighted users or users with motor disabilities, requiring navigation through two-dimensional space may not be an efficient way to access the information.
Which navigation methods suffice? Sequential access alone? For large tables (say 500x500 cells), this would surely be tedious and therefore direct access (by row/column position) would be more effective. In addition, it would be useful to be able to shrink or expand parts of the table, search on table contents, identify all cells (possibly in N-dimensional space) under a particular header, and all headers for a particular cell, etc. In short, the WG recognizes the importance of navigation as a technique for making tables accessible, but has not been able to identify a minimal requirement for general-purpose user agents.
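Some of the table access methods mentioned above (direct access by row/column position, all headers for a cell, all cells under a header) can be sketched on a simple two-dimensional model. The table data and the function names cell_at, headers_for, and cells_under are hypothetical; the sketch assumes one column-header row and one row-header column, which real tables need not have.

```python
# Hypothetical table: a row of column headers, then data rows whose
# first cell serves as the row header.
TABLE = [
    ["City", "January", "July"],
    ["Boston", "-2", "23"],
    ["Nice", "9", "24"],
]


def cell_at(table, row, col):
    """Direct access to a cell by 0-based row/column position."""
    return table[row][col]


def headers_for(table, row, col):
    """All headers for a cell: its row header and its column header."""
    return (table[row][0], table[0][col])


def cells_under(table, header):
    """All data cells under a particular column header."""
    col = table[0].index(header)
    return [data_row[col] for data_row in table[1:]]
```

None of these functions depends on moving up, down, left, or right from a current cell, which is one way to avoid framing access in terms of the graphical layout.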

Likelihood of implementation

The requirements of the Guidelines are not independent of considerations of implementability or cost. The Techniques Document represents the WG's efforts at showing how each requirement may be implemented. However, the WG may have chosen not to make certain requirements either because it seemed "unreasonable" to ask desktop browsers to implement the functionality or because the likelihood of implementation and conformance seemed low.

User/Expert Experience

The Working Group has endeavored to incorporate feedback from users with disabilities and experts in different fields related to accessibility about important requirements for these guidelines.

Review of specific requirements

The following review is based on the 20 December 1999 UA Guidelines.

In order to provide rationale for requiring native support by general purpose user agents of certain functionalities, I've grouped them by theme. This grouping makes it relatively easy to understand why most of the checkpoints require native support in general purpose user agents for the functionalities in question. The themes are:

The requirements of these checkpoints are applicable to all user agents.
The requirements of these checkpoints refer to content rendered natively by the user agent.
The requirements of these checkpoints pertain to communication with assistive technologies and thus were designed specifically for general purpose user agents.
The requirements of these checkpoints are not readily assignable to a particular class of user agent.
The requirements of these checkpoints were considered to be the responsibility of assistive technologies by the Working Group.

Requirements applicable to all user agents

All user agents ?why do all user agents need color, per 4.13 and 4.14? should meet these requirements, although how they are met will depend on the type of user agent. These requirements concern device independence, the native user interface, and the documentation.

Why the order below? If not the original checkpoint order, I propose reordering, as I have (just bold italic of the checkpoint part).

Checkpoint 2.1 Ensure that the user has access to all content, including alternative equivalent representations for content.
Checkpoint 6.1 Implement the accessibility features of supported specifications (markup languages, style sheet languages, metadata languages, graphics formats, etc.).
Checkpoint 11.1 Provide a version of the product documentation that conforms to the Web Content Accessibility Guidelines.
Checkpoint 11.2 Document all user agent features that promote accessibility.
Checkpoint 11.3 Document the default input configuration (e.g., default keyboard bindings).
Checkpoint 11.4 In a dedicated section ?of what: help, installation, alternative to paper form of manual?, document in a self-sufficient manner all features of the user agent that promote accessibility. ?Presumably 11.1, 11.2, 11.3 apply in context where the common use of a feature is described, and 11.4 to a collection of those features, adequately described to stand alone?
Checkpoint 8.4 Provide a mechanism for highlighting and identifying (through a standard interface where available) the current viewport, selection, and content focus. ?this seems redundant with the following, which add user control. Should they be combined?
Checkpoint 4.13 Allow the user to control how the selection is highlighted (e.g., foreground and background color).
Checkpoint 4.14 Allow the user to control how the content focus is highlighted (e.g., foreground and background color).
Checkpoint 5.3 Implement selection, content focus, and user interface focus mechanisms and make them available to users and through APIs.
Checkpoint 1.3 Ensure that the user can interact with all active elements in a device-independent manner.
Checkpoint 7.1 Allow the user to navigate viewports (including frames).
Checkpoint 4.15 Allow the user to control user agent-initiated spawned viewports.
Checkpoint 7.2 For user agents that offer a browsing history mechanism, when the user returns to a previous viewport, restore the point of regard in the viewport.
Checkpoint 5.6 Follow operating system conventions and accessibility settings. In particular, follow conventions for user interface design, default keyboard configuration, product installation, and documentation.
Checkpoint 10.6 Allow the user to configure the user agent in named profiles that may be shared on systems with distinct user accounts or shared by the same user portably across systems with the same operating system ?or do we propose such profiles could be made independent of operating system?.
Checkpoint 10.4 Use operating system conventions to indicate the input configuration.
Checkpoint 10.5 Avoid default input configurations that interfere with operating system conventions.
Checkpoint 8.8 Provide a mechanism for highlighting and identifying (through a standard interface where available) active elements.
Checkpoint 9.5 When loading content (e.g., document, video clip, audio clip, etc.) indicate what portion of the content has loaded and whether loading has stalled.
Checkpoint 9.6 Indicate the relative position of the viewport in content (e.g., the percentage of an audio or video clip that has been played, the percentage of a Web page that has been viewed, etc.).
Checkpoint 8.9 Maintain consistent user agent behavior and default configurations between software releases. Consistency is less important than accessibility and adoption of operating system conventions.
Checkpoint 10.7 Provide default input configurations for frequently performed tasks.
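As an illustration of Checkpoint 7.2 in the list above, the following sketch shows a history mechanism that stores a point of regard with each entry and returns it when the user goes back. The History class, its method names, and the use of a numeric offset for the point of regard are assumptions made for the example, not part of the Guidelines.

```python
class History:
    """Browsing history that remembers a point of regard for each entry."""

    def __init__(self):
        self._entries = []  # each entry is [url, point_of_regard]
        self._index = -1

    def visit(self, url):
        """Record a newly visited resource, discarding any forward entries."""
        del self._entries[self._index + 1:]
        self._entries.append([url, 0])
        self._index += 1

    def set_point_of_regard(self, offset):
        """Remember where the user currently is in the current viewport."""
        self._entries[self._index][1] = offset

    def back(self):
        """Return the previous entry's URL and its saved point of regard."""
        if self._index > 0:
            self._index -= 1
        url, offset = self._entries[self._index]
        return url, offset


history = History()
history.visit("intro.html")
history.set_point_of_regard(120)
history.visit("review.html")
restored = history.back()
```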

Requirements for content rendered natively

It makes sense for user agents to provide native support for content rendered natively.

Checkpoint 2.2 For presentations that require user interaction within a specified time interval, allow the user to control the time interval (e.g., by allowing the user to pause and restart the presentation, to slow it down, etc.).
Checkpoint 2.6 Allow the user to specify that captions and auditory descriptions be rendered at the same time as the associated auditory and visual tracks. This may require close synchronization, and coordinated pause and backup as partially described in 4.5.
Checkpoint 4.5 Allow the user to slow the presentation rate of audio, video, and animations.
Checkpoint 4.6 Allow the user to start, stop, pause, advance, and rewind audio, video, and animations.
Checkpoint 3.8 Allow the user to turn on and off rendering of images.
Checkpoint 3.1 Allow the user to turn on and off rendering of background images.
Checkpoint 3.2 Allow the user to turn on and off rendering of background audio.
Checkpoint 3.3 Allow the user to turn on and off rendering of video.
Checkpoint 3.4 Allow the user to turn on and off rendering of audio.
Checkpoint 3.5 Allow the user to turn on and off animated or blinking text.
Checkpoint 3.6 Allow the user to turn on and off animations and blinking images.
Checkpoint 3.7 Allow the user to turn on and off support for scripts and applets.
Checkpoint 4.1 Allow the user to control font family.
Checkpoint 4.2 Allow the user to control the size of text.
Checkpoint 4.3 Allow the user to control foreground color.
Checkpoint 4.4 Allow the user to control background color.
Checkpoint 4.8 Allow the user to control the position of captions on graphical displays.
Checkpoint 4.7 Allow the user to control the audio volume.
Checkpoint 4.9 Allow the user to control synthesized speech playback rate.
Checkpoint 4.10 Allow the user to control synthesized speech volume.
Checkpoint 4.11 Allow the user to control synthesized speech pitch, gender, and other articulation characteristics.
Checkpoint 4.12 Allow the user to select from available author and user style sheets or ignore them.
Checkpoint 2.5 If more than one alternative equivalent is available for content, allow the user to choose from among the alternatives. This includes the choice of viewing no alternatives. ?why "no alternatives"? If that is appropriate, then how does that apply if there are no alternatives to start with? Are you suggesting that the alternative equivalents are such as attributes alt="..." longdesc="..." or the "d" convention? If so, so state.
Checkpoint 2.3 When no text equivalent has been supplied for an object, make available author-supplied information to help identify the object (e.g., object type, file name, etc.).
Checkpoint 2.4 When a text equivalent for content is explicitly empty (i.e., an empty string), render nothing.
Checkpoint 2.7 For author-identified but unsupported natural languages, allow the user to request notification of language changes in content.
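Checkpoints 2.3 and 2.4 in the list above distinguish a missing text equivalent from one that is explicitly empty. A minimal sketch of that distinction for HTML images follows; the function name image_text, the sample markup, and the bracketed fallback format are hypothetical, and the file name is used only as one example of author-supplied information.

```python
from xml.dom.minidom import parseString


def image_text(img):
    """Return the text to render in place of an img element."""
    if img.hasAttribute("alt"):
        # Covers an explicitly empty alt="": the empty string is returned,
        # so nothing is rendered.
        return img.getAttribute("alt")
    # No text equivalent supplied: fall back to the object type and file name.
    filename = img.getAttribute("src").rsplit("/", 1)[-1]
    return "[image: %s]" % filename


# Hypothetical content with a described, a decorative, and an undescribed image.
doc = parseString(
    '<p><img src="a/logo.png" alt="W3C logo"/>'
    '<img src="a/spacer.gif" alt=""/>'
    '<img src="img/chart.png"/></p>'
)
rendered = [image_text(img) for img in doc.getElementsByTagName("img")]
```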

Requirements for communication

These requirements were designed specifically for general purpose user agents to ensure interoperability. They may also apply to user agents in general.

Checkpoint 1.4 Ensure that every functionality offered through the user interface is available through the standard keyboard API.
Checkpoint 1.1 Ensure that every functionality offered through the user interface is available through every input device API used by the user agent. User agents are not required to reimplement low-level functionalities (e.g., for character input or pointer motion) that are inherently bound to a particular API and most naturally accomplished with that API.
Checkpoint 1.2 Use the standard input and output device APIs of the operating system.
Checkpoint 1.5 Ensure that all messages to the user (e.g., informational messages, warnings, errors, etc.) are available through all output device APIs used by the user agent. Do not bypass the standard output APIs when rendering information (e.g., for reasons of speed, efficiency, etc.).
Checkpoint 5.1 Provide programmatic read and write ?write -- doesn't this require that every user agent become an authoring tool? access to content attribute values, and structure by conforming to W3C Document Object Model specifications.
Checkpoint 5.2 Provide programmatic read and write access to user agent user interface controls using standard APIs (e.g., platform-independent APIs, standard APIs for the operating system, and conventions for programming languages, plug-ins, virtual machine environments, etc.)
Checkpoint 5.4 Provide programmatic notification of changes to content and user interface controls (including selection, content focus, and user interface focus).
Checkpoint 9.1 Provide information about user agent-initiated content and viewport changes through the user interface and through APIs.
Checkpoint 9.4 Allow the user to configure notification preferences for common types of content and viewport changes.
Checkpoint 9.2 Ensure that when the selection or content focus changes, it is in a viewport after the change.
Checkpoint 9.3 Prompt the user to confirm any form submission triggered indirectly, that is by any means other than the user activating an explicit form submit control.
Checkpoint 5.5 Ensure that programmatic exchanges proceed in a timely manner.
Checkpoint 10.1 Provide information directly to the user and through APIs about current user preferences for input configurations (e.g., keyboard or voice bindings).
Checkpoint 10.2 Provide information directly to the user and through APIs about current author-specified input configurations (e.g., keyboard bindings specified in content such as by "accesskey" in HTML 4.0) and which ones are overridden by user preferences.
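Checkpoint 10.2 in the list above asks for information about author-specified input configurations and which of them are overridden by user preferences. One way to compute that information is sketched below; the function name effective_bindings and the sample bindings are hypothetical, and bindings are modeled simply as key-to-action mappings.

```python
def effective_bindings(author, user):
    """Return (effective bindings, author keys overridden by the user)."""
    effective = dict(author)
    effective.update(user)  # user preferences take precedence
    overridden = sorted(
        key for key in author if key in user and user[key] != author[key]
    )
    return effective, overridden


# Hypothetical author bindings (e.g., from "accesskey") and user preferences.
author_bindings = {"s": "focus search field", "h": "go to home link"}
user_bindings = {"h": "open history", "p": "print"}

effective, overridden = effective_bindings(author_bindings, user_bindings)
```

The same two results could be shown directly to the user and exposed through an API, as the checkpoint requires.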

Unknown

These checkpoints are not readily assignable to a particular class of user agent.

Navigation checkpoints

The Working Group has generally considered navigation a technique for providing access to content and context. People would probably agree that without adequate navigation, access to content and context may be so slow as to make the content unusable. However, there has not been agreement as to what minimal navigation requirements (if any) should be made of general purpose user agents. Below, some rationale is offered.

Include here definition we use of active elements.

Checkpoint 7.3 Allow the user to navigate all active elements.

Links are so important to the Web that general purpose user agents must provide native support for navigation of them.
Links and form controls add user interface to a page, thus it makes sense that the user agent provide native support for this "imported" user interface. But why limit "active elements" to links and form controls and not tables, for example? Why links and form controls a priori?

Checkpoint 7.4 Allow the user to navigate just among all active elements.

This is just a special case of 7.3 and so once 7.3 is settled, this one will follow.

Checkpoint 7.5 Allow the user to search for rendered text content, including text equivalents of visual and auditory content.

Most user agents do this anyway (except for the text equivalents, i.e., the alt content part). ?Should it be a user preference to search or not search such text equivalents??
Searching might be considered a minimal form of navigation.

Checkpoint 7.6 Allow the user to navigate according to structure.

This checkpoint has been included as an umbrella checkpoint because there are many navigation possibilities. Rather than list all of them (per element type, by tree structure, by element content, etc.), the WG put them all into a single, Priority 2, intentionally ambiguous checkpoint.

Checkpoint 7.7 Allow the user to configure structured navigation.

This one follows 7.6.

Context/orientation checkpoints

Checkpoint 8.1 Convey the author-specified purpose of each table and the relationships among the table cells and headers.
Checkpoint 8.5 Provide an "outline" view of content, built from structural elements (e.g., frames, headers, lists, forms, tables, etc.)
Checkpoint 8.6 Allow the user to configure the outline view. This one follows 8.5.
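An outline view of the kind Checkpoint 8.5 describes can be sketched by walking the document tree and collecting structural elements. The example below uses only HTML headings (h1 to h6) and indents by heading level; the sample markup and the function name outline are hypothetical, and a fuller outline would also cover frames, lists, forms, and tables.

```python
from xml.dom.minidom import parseString

HEADINGS = ("h1", "h2", "h3", "h4", "h5", "h6")

# Hypothetical content.
SOURCE = (
    "<body><h1>User Agent Responsibilities</h1><p>Text.</p>"
    "<div><h2>Introduction</h2></div>"
    "<h2>Review</h2><h3>Navigation</h3></body>"
)


def outline(node, lines=None):
    """Collect heading text in document order, indented by heading level."""
    if lines is None:
        lines = []
    for child in node.childNodes:
        if child.nodeType != child.ELEMENT_NODE:
            continue
        if child.tagName in HEADINGS:
            level = int(child.tagName[1])
            text = "".join(
                n.data for n in child.childNodes if n.nodeType == n.TEXT_NODE
            )
            lines.append("  " * (level - 1) + text.strip())
        else:
            outline(child, lines)
    return lines


outline_view = outline(parseString(SOURCE))
```

Making the HEADINGS tuple a user setting would be one simple way to let the user configure the outline view, as Checkpoint 8.6 asks.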

Configuration checkpoints

These may apply to all user agents.

Checkpoint 10.3 Allow the user to change and control the input configuration. Allow the user to configure the user agent so that some functionalities may be activated with a single command (e.g., single key, single voice command, etc.).
Checkpoint 10.8 Allow the user to configure the arrangement of graphical user agent user interface controls.

Requirements for Assistive Technologies

The Working Group has decided that the following requirements, once checkpoints, belonged to assistive technologies. These requirements are listed in the Appendix of Assistive Technology Functionalities of the 20 December 1999 Techniques Document.

Navigation checkpoints

Allow users to navigate up/down and among the cells of a table (e.g., by using the focus to designate a selected table cell).

Context/Orientation checkpoints

Indicate the row and column dimensions of a selected table.
Describe a selected element's position within larger structures (e.g., numerical or relative position in a document, table, list, etc.).
Provide information about form structure and navigation (e.g., groups of controls, control labels, navigation order, and keyboard configuration).
Enable announcing of information regarding title, value, grouping, type, status and position of specific focused elements.
