Re: 'content' in WAI documents

Al Gilman wrote:
> 
> Gregg Vanderheiden forwarded your question about what 'content' should be
> taken to mean as it is used in the WAI context to the wai-xtech list.  This
> term is certainly worth some more explaining.  The wai-xtech list is a
> quiet corner where we discuss miscellaneous cross-group detailed issues,
> including particularly the alignment of terminology.
> 
> You and/or your implementer friends might want to subscribe to the list
> while this is being discussed there.  You can also follow the discussions
> in the Web archive for the list.

I'd like to say a few words on the topic of "content", which is
defined in the User Agent Accessibility Guidelines 1.0 (UAAG 1.0)
[1] to mean essentially "the document object". These are a few of the
considerations on this topic made by the User Agent Accessibility
Guidelines Working Group (UAWG). Please refer to the glossary of
UAAG 1.0 for more information (and other terms that may be relevant
to this discussion, such as "interactive element", "equivalent",
"text content", etc.).

1) Information v. Markup. The UAWG realized at some point in the
development of UAAG 1.0 that some people were using the term "content"
to refer to information (as in "that Web site has a lot of good
content"), while others were using it to refer to markup, style
sheets, character data, images, etc. The UAWG has chosen not to use
"content" to mean "information" for at least a couple of reasons:

 a) UAAG 1.0 is aimed at user agent developers and
    only makes requirements for what the user agent can "recognize" in 
    markup, etc. For this reason, it makes more sense for UAAG 1.0 to
    use "content" to mean "what's actually in the document object".

 b) The user agent may repair bytes received from the server (e.g.,
    invalid markup). In this case, the resulting document object
    contains not only what the author sent, but also what the user
    agent has added or repaired. Therefore, we don't define content
    simply to be "that which comes over the wire". Except in very few
    cases, UAAG 1.0 requirements start after the construction of the
    document object. The exceptions are a few repair checkpoints that
    are labeled as such.

On the other hand, it may make sense in WCAG to allow the
"information" meaning, as some of the WCAG requirements are about
abstractions, not markup (e.g., "Use the clearest and simplest
language appropriate for a site's content" or "Describe the purpose
of frames"). WCAG tells authors (in part) what kind of information to
encode, while UAAG 1.0 tells developers what to look for in markup
that will be a strong indicator (but not a guarantee) of the author's
intention.
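
To make point 1(b) concrete, here is a minimal sketch in Python of how
a document object can differ from "that which comes over the wire".
Everything in it is hypothetical and for illustration only: the Node
and RepairingBuilder names and the repair rules (implicitly close any
elements left open, drop stray end tags) are assumptions, not rules
taken from UAAG 1.0 or from any real user agent. It also ignores
attributes and void elements such as "br".

```python
from html.parser import HTMLParser


class Node:
    """One element in a toy document object (hypothetical model)."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []  # a mix of Node objects and plain strings

    def serialize(self):
        inner = "".join(
            child if isinstance(child, str) else child.serialize()
            for child in self.children
        )
        if self.tag == "#document":
            return inner
        return f"<{self.tag}>{inner}</{self.tag}>"


class RepairingBuilder(HTMLParser):
    """Builds a tree and repairs two kinds of invalid markup."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        # Look for the nearest open element with this name.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            # Repair: elements still open inside it are closed implicitly.
            self.current = node.parent
        # Otherwise the end tag matches nothing and is dropped.

    def handle_data(self, data):
        self.current.children.append(data)


def build_document_object(wire_text):
    builder = RepairingBuilder()
    builder.feed(wire_text)
    builder.close()
    return builder.root


wire = "<p>Hello <b>world</p></i>"   # what the author sent (invalid)
doc = build_document_object(wire)
print(doc.serialize())               # what the user agent works with
```

In this sketch the serialized document object is
"<p>Hello <b>world</b></p>", which is not what the author sent. A
requirement written against the document object applies to the
repaired form.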

2) Markup v. Natural language information. I believe that there
have been some suggestions to consider "content" to be the text content
between an element's start and end tags, and to consider the rest to
be markup (and not content). The UAWG has not found this distinction
practicable as it does not account for image, video, or audio content, 
effects caused by style sheets and scripts, and more. Instead of 
distinguishing markup from text content, the UAWG has chosen to consider
"rendered content" a subclass of "content". Rendered content is defined
to be:

   "... the part of content that the user agent makes available to
    the user's senses of sight and hearing (and only those senses for
    the purposes of this document). Any content that causes an effect
    that may be perceived through these senses constitutes rendered
    content. This includes text characters, images, style sheets,
    scripts, and anything else in content that, once processed, may
    be perceived through sight and hearing."

UAAG 1.0 makes requirements on content and rendered content, but not
on markup specifically. There are some requirements in UAAG 1.0 for
content types that may be identified in markup, and for style sheets and
other information recognized as style (in markup), and for scripts, etc.

Some content that may not be rendered content:
elements with 'display: none' or 'visibility: hidden' set for
them, unprocessed style rules (or those that don't win in the cascade),
scripts that only perform calculations but do not cause rendering, etc.

Specifically about invisible and silent content, the UAAG 1.0 definition
states:

  In the context of this document, "invisible content" is content that 
  influences graphical rendering of other content but is not rendered 
  itself. Similarly, "silent content" is content that influences audio 
  rendering of other content but is not rendered itself. Neither
  invisible nor silent content is considered rendered content.
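
As a rough illustration of the content / rendered content distinction
in point 2, the following Python sketch walks a toy document object
and separates the text that is in content from the text a graphical
user agent would actually paint. The Element class and both functions
are hypothetical, and the "style" dictionary is assumed to hold each
element's already-computed 'display' and 'visibility' values. This is
a simplification of real style processing, not a description of any
user agent.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """A toy document-object node (hypothetical, for illustration)."""
    name: str
    text: str = ""
    style: dict = field(default_factory=dict)    # assumed computed values
    children: list = field(default_factory=list)


def all_text(element):
    """Every piece of text in content, rendered or not."""
    parts = [element.text] if element.text else []
    for child in element.children:
        parts.extend(all_text(child))
    return parts


def rendered_text(element):
    """Only the text a graphical user agent would paint."""
    if element.style.get("display") == "none":
        return []  # the element and its whole subtree are not rendered
    parts = []
    hidden = element.style.get("visibility") == "hidden"
    if element.text and not hidden:
        parts.append(element.text)
    for child in element.children:
        parts.extend(rendered_text(child))
    return parts


doc = Element("body", children=[
    Element("p", "Visible paragraph"),
    Element("p", "Skipped navigation", style={"display": "none"}),
    Element("span", "Hidden label", style={"visibility": "hidden"}),
])

print(all_text(doc))       # all three strings are content
print(rendered_text(doc))  # only "Visible paragraph" is rendered content
```

All three strings remain part of content in the UAAG 1.0 sense, so
requirements on "content" still apply to them even though only one of
them is rendered content.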

3) Content v. User Interface. UAAG 1.0 has requirements that apply to
content (the document object), to user agent user interface features
(i.e., those delivered with the software), or to both. Each checkpoint
includes a label to indicate the scope of the requirement (for
content, user agent, or both). Why is this important? Some content
(e.g., form controls) contributes to the user interface. To users,
there is only one user interface. But in UAAG 1.0, there is a
distinction between user interface components that come from content
versus those from the developer because the UAWG considers that UA
developers have total control over the native user interface, and may
only have partial control over what the author provides. Of course,
UA developers have total control in one sense as they are writing the
software. But authors may encode their knowledge imperfectly or
incorrectly, and therefore some UAAG 1.0 requirements may differ
depending on the origin of the user interface.

4) Conditional content. Because not all content is rendered at all
times or by default (e.g., "alt" attribute values in HTML), and
because WCAG may instruct authors to provide accessibility information
via elements or attributes that are not rendered all the time or by
default, UAAG 1.0 includes requirements to provide access to this
"conditional content", which is defined to be:

  "content that, by specification, should be made available to users
   through the user interface, generally under certain conditions
   (e.g., based on user preferences or operating environment
   limitations)."

It is important to note the "by specification" part. UAAG 1.0 does not
require user agent developers to guess the author's intention. UAAG 1.0
requirements stop at what can be recognized by specification from
markup, style sheets, scripts, and any other parts of the document
object.
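
The "by specification" idea in point 4 can be sketched as a lookup
rather than a guess. In the Python example below, the
CONDITIONAL_ATTRIBUTES table and the function names are hypothetical.
The table lists only the HTML "alt" attribute on "img" because that is
the example given above; a real user agent would derive its table from
the specifications of the formats it supports.

```python
from html.parser import HTMLParser

# Hypothetical, deliberately tiny table of (element, attribute) pairs
# that a markup specification identifies as conditional content.
CONDITIONAL_ATTRIBUTES = {("img", "alt")}


class ConditionalContentFinder(HTMLParser):
    """Collects attribute values recognized as conditional content."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # No guessing: only pairs listed in the table are recognized.
            if (tag, name) in CONDITIONAL_ATTRIBUTES and value is not None:
                self.found.append((tag, name, value))


def find_conditional_content(markup):
    finder = ConditionalContentFinder()
    finder.feed(markup)
    finder.close()
    return finder.found


markup = (
    '<p>Sales <img src="chart.png" alt="Sales rose in May">'
    ' <img src="spacer.gif"></p>'
)
print(find_conditional_content(markup))
```

The second image has no "alt" attribute, so nothing is reported for
it. The sketch does not try to infer what the author might have meant,
which matches the point that requirements stop at what can be
recognized by specification.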

5) Content meant for humans v. for machines. UAAG 1.0 requirements 
do not distinguish content meant for humans (text, images, etc.) 
from content meant for machines (e.g., scripts), as the UAWG was unable
to draw the line. All bits and bytes are meant for machines, and some
are meant for humans after rendering. The UAWG was unable to define
"what is meant for humans", and doesn't really care since some people
may find useful information by reading the source of a script. To
UAAG 1.0, all content must be available to humans, either rendered
according to specification, or through other means required by UAAG
(e.g., a text source view for text formats). UAAG 1.0 does, however,
lean in the direction of "what is meant for humans" in its
definitions of rendered content and conditional content.

6) Before rendering v. after rendering. Consider the term "non-text
element" as used in WCAG 1.0, checkpoint 1.1. Does non-text mean
that the element is not composed of text in the document object,
or that after rendering the result is not text? An SVG image is
built using text characters, but when processed according to at
least one algorithm in the SVG specification, the result is an
image. So does the checkpoint 1.1 requirement mean that authors
must provide text equivalents for content that is non-text in
the source, or non-text after rendering, or both?

UAAG 1.0 uses "content" to mean prior to rendering (the document
object) and "rendered content" to mean that part of content that
is made available through a viewport.

7) Primary v. Alternative content. UAAG 1.0 does not distinguish
between "primary" and "alternative" content in the following sense:
that "primary" content is intended by authors for users without a
disability and "alternative" content is intended by authors for users
with a disability. Though authors may in fact intend some content for
users with a disability, UAAG 1.0 makes no correlation between the
author's intent and the potential for content to be accessible. UAAG
1.0 doesn't have to, since user agents are required to make all
content available (through a number of mechanisms).

I hope these notes help. For some other issues that the UAWG has
wrestled with, refer to "Hurdles of UAAG 1.0" [2]. Please note,
however, that this document is not being maintained actively.

Thank you,

 - Ian

[1] http://www.w3.org/TR/UAAG10/
[2] http://www.w3.org/WAI/UA/2000/10/hurdles
-- 
Ian Jacobs (ij@w3.org)   http://www.w3.org/People/Jacobs
Tel:                     +1 831 457-2842
Cell:                    +1 917 450-8783

Received on Monday, 25 June 2001 10:33:50 UTC