WCAG 2 Restructuring Proposal comments

The document is *quite* an improvement and is the first set of 
proposals for WCAG 2.0 that does not contain clauses that render me 
apoplectic. I was just waiting to see how far y'all would go with the 
write-simply and illustrate-everything requirements before I went 
medieval on your arses. Apparently good sense is winning out.



>Guideline 1 - Perceivable.
>Ensure that all content can be presented in form(s) that can be 
>perceived by any user - except those aspects of the content that 
>cannot be expressed in words.

Well, everything can be "expressed" in words. You can always say 
*something*. Perhaps better terminology would be:

	except content components with no equivalent in words
	except content components that cannot be epitomized in words



>A text equivalent
>* serves the same function as the non-text content.
>  * communicates the same information as the non-text content.

No, this is a serious deficiency. Pictures (e.g.) by definition do 
not communicate "the same" information as text. That's why _The 
Shipping News_ the book and _The Shipping News_ the movie are 
different things (and why _Hard Core Logo_ the book, movie, and 
graphic novel are three separate things).

The issue here is that the words express as much of the same 
information *as words reasonably can*.

Let's also use "text" to mean "words" as infrequently as possible.


>Non-text content includes but is not limited to images, text in 
>raster images, image map regions, animations (e.g., animated GIFs), 
>applets and programmatic objects, ASCII art, scripts that present 
>content, images used as list bullets,

Images used as list bullets have *no* text equivalents in HTML *at all*.

	list-style: url(bullet.gif)

has no alt text or anything like it. There are ways to provide 
fallback mechanisms [list-style: url(bullet.gif) disc] but that is 
not the same as providing an alt text for the bullet, *which you 
cannot do*.

It *is* possible to list a bunch of paragraphs with <img> elements 
that, when assembled, resemble a list, but they won't be coded as 
<ol> or <ul>.
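
To spell it out, here is a minimal sketch of both halves of the 
argument (the class and file names are invented):

	<style type="text/css">
	/* Image bullet with a fallback marker: still no alt mechanism anywhere */
	ul.features { list-style: url(bullet.gif) disc; }
	</style>

	<!-- The workaround: plain paragraphs carrying <img> bullets. Alt text
	     becomes possible, but the result is not coded as an <ol> or <ul>. -->
	<p><img src="bullet.gif" alt=""> First item</p>
	<p><img src="bullet.gif" alt=""> Second item</p>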



>spacers, graphical buttons, sounds (played with or without user 
>interaction), stand-alone audio files, audio tracks of video, and 
>video.

These need to be reordered. "Audio tracks of video, and video" has 
never been easy to understand and is clumsy as hell. Instead, write 
all the examples in alphabetical order *as a list*. (Eat your own 
vegetarian dogfood.)


>Benefits (informative)
>
>Individuals who are blind, have low vision, have cognitive 
>disabilities or have trouble reading text for any reason can have 
>the text read aloud to them. Individuals who are deaf, are hard of 
>hearing or who are having trouble understanding the audio 
>information for any reason can read the text presentation or have it 
>translated and presented as sign language.

Well, I don't know where you're going with that last part. If you're 
imagining a video file with audio and captions, are you imagining 
some interpreter somewhere who is translating the *audio* into sign 
language? (They won't be translating the captions. Now, a 
foreign-language subtitle is something different.)

Where is this interpreter? In the frame? Outside of the frame? 
Sitting in the room with you?


>A bar chart compares how many widgets were sold in June, July, and 
>August. The short label says, "Graph of the numbers of widgets sold 
>in June, July, and August." The longer explanation provides the data 
>presented in the chart.

This has always been a terrible example, one of many in WCAG that 
show that WAI members do not work in the real world.

We use charts and graphs because it is inconvenient or impossible to 
make an issue understandable in words. Or charts and graphs are 
simply *better*. Like packing, unpacking, and repacking a suitcase, 
not everything that originates in graphical form can be re-expressed 
in words *and still be understood as well*.

Moreover, proponents of this example have not spent enough time 
interpreting scientific charts and graphs, some of which accumulate 
hundreds or thousands of data points. Are you proposing to include 
the hundreds or thousands of data points in the longdesc?

You need to find a better example, one *actually encountered* on a 
*real* Web site and not a half-arsed fabrication.


>* Example 3: a short label and a longer explanation of animation.
>An animation shows how to tie a knot. The short label says, "An 
>animation showing how to tie a square knot."

No, that's too meta. The text equivalent should be simply "How to tie 
a square knot." I just had Jukka Korpela upbraiding me on this very 
same point. Next you're going to want simple <img> alt texts to say 
something like "Text of image of photo of representation of man 
sitting in chair" rather than just "Man sitting in chair."


>Example 5: a label for content that cannot be described in words.
>An audio file is embedded in a Web page. The short label says, 
>"Beethoven's 5th Symphony performed by the Chicago Symphony 
>Orchestra."

You just *have* described it in words. *Everything* can be described 
in words. The issue is that Beethoven's _Fifth_ (write the numbers in 
words in <cite> to render symphony names) cannot be rendered or 
epitomized in or translated into words.
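
In markup, I mean something along these lines (my wording, not the draft's):

	<p><cite>Beethoven's Fifth Symphony</cite>, performed by the
	Chicago Symphony Orchestra.</p>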


>You will have successfully provided synchronized media equivalents 
>for time-dependent presentations if:
>1. an audio description of all significant visual information in
>scenes, actions and events is provided to the extent possible given 
>the constraints posed by the existing audio track and constraints on 
>freezing the image to insert additional auditory description.


You don't need the bit after "extent possible."


>[Issue: this set does not yet require provision of media-equivalents 
>for live broadcast. The reason is that it is not clear how to handle 
>problems with webcams, etc. If someone points a webcam out the 
>window, or at the coffeee pot -- do they need to hire someone to 
>provide a running commentarey? We should discuss and decide to 
>require or not. UNDER WHAT CIRCUMSTANCES WOULD WE REQUIRE 
>EQUIVALENTS == AND WHEN WOULD WE NOT - AND STILL ALLOW COMPLIANCE 
>WITH THIS CHECKPOINT]

"Media that update slowly or that are not intended to be followed 
continuously (e.g., Webcams) may be summarized with simple text 
equivalents." A Webcam is merely a delivery mechanism for individual 
still images. Treat them accordingly.


>Media equivalents present essential audio information visually 
>(captions) and essential video information auditorily (audio 
>descriptions).

Thank you for retiring the malapropism "auditory description."


>* captions are text equivalents of auditory information from speech, 
>sound effects, and ambient sounds that are synchronized with the 
>multimedia presentation.
>* audio descriptions are equivalents of visual information from 
>actions, body language, graphics, and scene changes that are voiced 
>(either by a human or a speech synthesizer)

I would like the "(either by a human or a speech synthesizer)" bit to 
disappear. Nobody is using speech synthesis for audio description 
yet, and frankly, I want to discourage the practice. It may 
eventually be necessary, but we aren't there yet. It is vapourware.


>People without disabilities also benefit from the media equivalents. 
>People in noisy environments or with muted sound often use captions. 
>Captions are used by many to develop language and reading skills. 
>Audio descriptions also provide visual information for people who 
>are temporarily looking away from the video presentation such as 
>when following an instructional video and looking at their hands.

BTW, WGBH has research suggesting that learning-disabled kids benefit 
in a very modest way from audio description. There's also a CSUN 
paper:

<http://www.csun.edu/cod/conf2002/proceedings/48.htm>


>Note:Time-dependant presentations that require dual, simultaneous 
>attention with a single sense can present significant barriers to 
>some users.

Very, very few users!


>Depending on the nature of the of presentation, it may be possible 
>to avoid scenarios where, for example, a deaf user would be required 
>to watch an action on the screen and read the captions at the same 
>time.

That's not how captioning works!

Yet again we see a lack of real-world experience. The top two 
complaints of caption-naive hearing people are:

1. The captions and the utterances don't match! (Even if they differ 
by a single word or a missed "um," hearing people go nuts.)

2. They're "distracting"! I can't follow the picture! (Research shows 
that even caption-naive people dramatically improve their ability to 
balance caption- and image-watching after only two hours of captioned 
TV.)

This is a *nonissue* for everyone who does not have a learning 
disability (or, I suppose, a peripheral-vision defect). Multimedia 
captions *do* let you pause the video and keep the caption up (an 
improvement over TV closed captioning). It is better to emphasize 
that advantage than to come this close to requiring people to caption 
in a way that never has been done before on TV, on video, or at the 
movies. As written, *no* captioning on TV or home video or in cinemas 
would qualify.

I don't think the authors of this draft have enough lived experience 
watching captioned shows or enough familiarity with the research.


>Where possible, provide content so that it does not require dual, 
>simultaneous attention or so that it gives the user the ability to 
>effectively control/pause different media signals.

Excellent.


>* Example 1: a movie clip with audio description and captions.
>A clip from a movie is published on a Web site. In the clip, a child 
>is trying to lure an alien to the child's bedroom by laying a trail 
>of candy. The child mumbles inaudibly to himself as he lays the 
>trail. When not watching the video, it is not obvious that he is 
>laying a trail of candy since all you hear is the mumbling. The 
>audio description that is interspersed with the child's mumbling 
>says "Charlie lays a piece of candy on each stair leading to his 
>room." The caption that appears as he mumbles is, "[inaudible 
>mumbling]."

You might as well go see the current rerelease of _E.T._ with 
captions and descriptions and transcribe them accurately. Let's not 
kid ourselves as to what's being documented here.


>An animation shows a clown slipping on a banana and falling down. 
>There is no audio track for this animation. No captions or audio 
>description are required. Instead, provide a text equivalent as 
>described in checkpoint 1.1.

Actually, one would write [No audio throughout] or something similar. 
Silent films can be and are captioned. I've watched them myself.


>Content is the information or meaning and function.

An incomprehensible definition.

Graphic design can be content. <http://www.contenu.nu/article.htm?id=1040>

Content must be given a more ecumenical definition than this.


>You will have successfully keyboard access to all functionality of 
>the content if:
>1. all functions of the content can be operated from a standard 
>keyboard without requiring simultaneous activation of multiple 
>character keys.

Inadvisable, otherwise accesskey goes out the window. Among the many 
failings of accesskey is that a, A, ä, Å, and other variants of the 
same underlying character are all *technically* different accesskeys. 
Only iCab even attempts to honour this distinction, but in any event, 
accesskey="A" is completely legal at present but would be illegal 
under this plan because you have to press Shift.
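
For the record, here is the sort of markup that would suddenly become 
nonconforming, even though it is valid today (the link itself is invented):

	<!-- Legal HTML 4.01 now, but Shift is needed to type "A" on most
	     keyboards, so it would fail a no-simultaneous-keys rule -->
	<a href="search.html" accesskey="A">Search this site</a>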


>Individuals who are blind (and cannot use pointing devices) can have 
>access to the functionality of the product.

Keyboard arrow keys are equivalent to a pointing device. This is not 
how screen-reader users interpret Web pages.


>Individuals with severe physical disabilities can use speech input 
>(coupled with a keyboard emulator which injects the spoken text or 
>commands as if they were typed on the keyboard) to enter data or 
>control the device.

Don't forget "switch access," i.e., controlling a keyboard with a 
single switch or a very small number of them.


>Real-time events
>
>Competitive activity: an activity where timing is an essential part 
>of the design of the activity. Removal of the time element would 
>change the performance of the participants. None time based 
>activities might be preferred and may be required for some venues 
>but this would require a complete redesign of the activity or test 
>and would therefore fall under guidelines

It must be pointed out that time-based activities (it's hyphenated, 
BTW) are often altered for disabled people here in the real world. It 
is quite possible to take a two-hour test in, say, four hours if your 
disability requires the extra time.

We should at least mention the idea of accommodating the disabled 
user by increasing the time allotment. That would require a higher 
level of disability planning and awareness than we usually find, of 
course.


>  * Example 2: a news site that is updated regularly.
>
>A new site causes its front page to be updated every 1/2 hour. The 
>front page contains minimal text and primarily consists of links to 
>content. A user who does not wish the page to update selects a 
>checkbox.

Except that refreshing every 30 minutes is intrinsically accessible. 
According to the guideline, if 30 minutes is the "average" refresh 
time, then the site should suggest 300 minutes (five full hours) as 
the refresh time for the disabled visitor. Twice-an-hour refreshing 
is virtually unnoticeable. I would think a better example is, say, 
ChickClick affiliate sites that load ads in a frame and cycle through 
them every couple of minutes.
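
For reference, a half-hour refresh amounts to nothing more than this 
in the page's <head> (the interval is given in seconds):

	<!-- Reload the front page every half-hour -->
	<meta http-equiv="refresh" content="1800">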


>2. if flicker is unavoidable - user is warned of the flicker before 
>they go to the page , and a version of the content without flicker 
>is provided.

If you can provide an unflickering page, then the flicker is not "unavoidable."


>This one has always generated complaints because authors don't know 
>how to measure the flicker rates. Can we suggest how they would know 
>whether their flicker falls in this range?

GIF animation and Flash animations need a measuring stick here. Also, 
many browsers let you turn off GIF animation altogether. Isn't that 
part of the User Agent Accessibility Guidelines? At a certain point, 
users have to take responsibility for their own equipment.


>NOTE: It is very difficult to determine what makes writing clear and 
>simple for all topics. Some content is derived from other sources 
>and is copyrighted so it cannot be altered. Some materials or topics 
>cannot be communicated accurately in simple language. Also, since 
>some people cannot understand the content no matter how simply it is 
>written, it is not possible to make any content accessible to 
>everyone. Specific objective criteria that could be applied across 
>all types of content are therefore not possible.

Yes!

>Checkpoint 4.2 [3.4] Supplement text with non-text content.
>
>NOTE: Supplementing text with non-text (e.g. graphics, sound, smell, 
>etc) is useful for all users. However there are no clear guidelines 
>as it relates to disability. Specific objective criteria that could 
>be applied across all types of content are therefore not possible.

Yes!

>1. Provide a definition (with the first occurance)

Occurrence, you mean?

>  of phrases, words, acronyms, and abbreviations specific to a 
>particular community.

A debased term, yes? How about "readership" or "audience"?

The problem with defining only the first use is that many people Join 
Us Late: They end up in the middle of the page. Besides, we have 
search-and-replace on our computers now (though apparently not on WAI 
computers during the WCAG 1.0 era) and can easily change all 
occurrEnces of an abbreviation or acronym in one fell swoop.

>  2. Provide a summary for relationships that may not be obvious from 
>analyzing the structure of the table but that may be apparent in a 
>visual rendering of the table.

We make tables because their visual rendering *is* the structure and 
clarifies itself. The problem is that tables are hard to understand 
*without* visual rendering. That's why accessibility advocates have 
been going off in (counterproductive, meaningless, outdated) high 
dudgeon against tables for years.

Can we not rewrite reality here, please?

>Definitions (informative)
>
>Content is considered complex if the relationships between pieces of 
>information are not easy to figure out.

You're pretty much defining the term with itself here ("complex" and 
"not easy to figure out"). Also, easy for whom, and who says?

>  If the presentation of the information is intended to highlight 
>trends or relationships between concepts, these should be explicitly 
>stated in the summary.

No, that's why we draw graphs.

I want WAI to get over its antipictorial bias once and for all. 
Words, though they are a primitive of the Web, are not better than 
pictures *in every case*; the converse is also true. WAI authors 
would perhaps like to eliminate graphics except inasmuch as they 
would also like to force everyone on the planet to use them.


>Examples of complex information:
>* data tables, [ ALWAYS?]
>* concepts that are esoteric or difficult to understand,
>* content that involves several layers.
>
>Content might be unfamiliar if you are using terms specific to a 
>particular community. For example, many of the terms used in this 
>document are specific to the disability community.

Community, community, community. In any event, some concepts *are* 
esoteric or difficult to understand, except for those in the audience 
(the readership) who find them workaday and straightforward. A lot of 
people find accessibility esoteric and difficult to understand, but 
we on this list find it pretty easy.

There's making the Web easier for people with learning disabilities 
to use and there's the reality that not everyone understands every 
topic, or *could ever* understand every topic. The requirement as 
written sounds like tall-poppy syndrome: Intelligent writers and 
experts are actively penalized for expressing their intelligence and 
expertise for an audience that can handle them.


>Summarizing information that is difficult to understand helps people 
>who do not read well.

Then how can they understand the summary?


>  Defining key terms and specialized language will help people who 
>are not familiar with the topic. Providing the expansion of 
>abbreviations and acronyms not only helps people who are not 
>familiar with the abbreviation or acronym but can clarify which 
>meaning of an abbreviation or acronym is appropriate to use. For 
>example, the acronym "ADA" stands for both the American with 
>Disabilities Act


That's American*s*, BTW.

>Checkpoint 5.1 [4.2] Use technologies according to specification.
>
>Issue: should there be a qualification or exception for 
>backward-compatibility? If so, under what circumstances should it 
>apply? Alternatively, if an implementor decides to use invalid 
>markup for backward-compatibility reasons, shouldn't they be 
>"honest" and indicate that they haven't satisfied this checkpoint?

<embed> vs. <object> is the canonical example here.
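
By way of illustration, the usual backward-compatible nesting looks 
something like this (the Flash file name is invented):

	<!-- <object> is the valid HTML 4.01 element; the nested <embed> is
	     invalid but is what older plug-in browsers actually understand -->
	<object type="application/x-shockwave-flash" data="movie.swf"
	        width="320" height="240">
	  <embed src="movie.swf" type="application/x-shockwave-flash"
	         width="320" height="240">
	</object>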


>Checkpoint 5.2 [4.4] Ensure that content remains usable when 
>technologies that modify default user agent processing or behavior 
>are turned off or not supported.

Well, what about PDF and Flash, which are variously accessible but 
whose files are not necessarily loaded and displayed by default?

It is not wrong to publish in PDF or Flash, at least from an 
accessibility perspective (with attendant provisos about what PDF and 
Flash can and cannot do). This guideline would essentially forbid 
people from publishing even fully accessible tagged PDFs and 
annotated Flash movies.


>In determining the extent to which older technologies should be 
>supported, keep in mind that
>* assistive hardware and software are often slow to adapt to 
>technical advances.

That makes it sound like "assistive hardware and software" are 
people. In reality, it took until version 4.01 of Jaws for simple 
HTML accessibility tags like longdesc to be supported. It is not the 
content author's problem that third-party vendors are too slow to 
update their products.
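
For clarity, the markup at issue is this simple (file names are invented):

	<!-- The long description lives on a separate page named in longdesc;
	     the alt text stays short -->
	<img src="chart.gif" alt="Widget sales, June through August"
	     longdesc="widget-sales.html">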


>* for significant groups of users, it may not be possible to obtain 
>the latest software or the hardware required to operate it.

That isn't a disability issue. Please don't nag about the fact that 
disabled people are poor; we know they often are. But time marches 
on. Authors should be able to use accessible authoring methods 
(longdesc, for example, or PDF or Flash), and if a person's 
out-of-date equipment cannot read the annotations, well, them's the 
breaks. WordPerfect 5.1 for DOS can't read a Web page, either.



In general, a very strong reformulation. It can only get better. But 
what happened to Kynn's modularization idea?
-- 

     Joe Clark | joeclark@joeclark.org
     Accessibility <http://joeclark.org/access/>
     Weblogs and articles <http://joeclark.org/weblogs/>
     <http://joeclark.org/writing/> | <http://fawny.org/>

Received on Monday, 25 March 2002 14:26:03 UTC