Re: Alt verbosity [was: Example canvas element use - accessibility concerns] from David Poehlman on 2009-04-29 (wai-xtech@w3.org from April 2009)

From: David Poehlman <david.poehlman@handsontechnologeyes.com>
Date: Wed, 29 Apr 2009 08:48:27 -0400
To: "John Foliot - WATS.ca" <foliot@wats.ca>
Cc: "'Benjamin Hawkes-Lewis'" <bhawkeslewis@googlemail.com>, "'W3C WAI-XTECH'" <wai-xtech@w3.org>, "'Ian Hickson'" <ian@hixie.ch>
Message-Id: <FBA06774-C9B2-4361-8ABA-AF145E7C48E1@handsontechnologeyes.com>
I have been hanging back here, but I need to state strongly that alt is
replacement, not description, or at least was originally intended to be
so.  This is why we have longdesc and other mechanisms.  As a blind
person, blind from birth, I need a short meaning for an image and, where
appropriate, all the info I can get about an image.  Making general
statements about a particular target population is dangerous if not
fallacious.  I know there are a lot of folk out there who would say that
"photo of jimmy" is fine, but part of that is because that is what they
have come to expect from what is out there in the wild.  What users
really need is consistency and a clear distinction between description
and replacement, something which, despite the best efforts of many,
continues to be blurred.  In other words, "the White House" is not only
sufficient as alt, it is an exact replacement for the image as an image.
When someone looks at the photo, they think "White House", and that is
exactly what we need to convey.  If they look at it for more detail,
they begin to *describe* it in their minds.

If we are going to move forward in any meaningful way in solving the
alt vs. description problem, we must stand on the fact that alt is
replacement, tooltips notwithstanding.  We have alt, title, caption and
some description mechanism we can use.  It's appropriate then that alt
be the eye catcher, meaning that I know what the image means from
grokking the alt.  I also want the ability to hear the other data on
the image at my choosing, and to configure whether that happens
automatically or at the press of a key.
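
The split between a short replacement and optional extra detail can be
illustrated with a minimal Python sketch.  It is not how any real screen
reader works: the announce() function, the verbose flag, the sample
title text and the file names are all assumptions made up for
illustration.  It only shows alt being spoken on its own by default,
with title and longdesc added when the listener asks for more.

```python
from html.parser import HTMLParser


class ImageTextCollector(HTMLParser):
    """Collects the text alternatives attached to each <img> element."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attributes = dict(attrs)
            self.images.append({
                "alt": attributes.get("alt") or "",
                "title": attributes.get("title") or "",
                "longdesc": attributes.get("longdesc") or "",
            })


def announce(image, verbose=False):
    """Return what a hypothetical reader would speak for one image.

    Terse mode speaks only the alt replacement text.  Verbose mode
    appends the title and points at the long description, if any.
    """
    parts = [image["alt"]]
    if verbose:
        if image["title"] and image["title"] != image["alt"]:
            parts.append(image["title"])
        if image["longdesc"]:
            parts.append("Long description at " + image["longdesc"])
    return ". ".join(part for part in parts if part)


# Hypothetical markup: the title text and file names are invented.
markup = (
    '<img src="wh.jpg" alt="The White House" '
    'title="West Wing at dusk" longdesc="wh-description.html">'
)
collector = ImageTextCollector()
collector.feed(markup)
image = collector.images[0]
print(announce(image))                # terse: the replacement only
print(announce(image, verbose=True))  # verbose: replacement plus detail
```

Whether verbose is switched on automatically or by a key press is left
to the caller, which is the configurability asked for above.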

I know this is not the real topic of this discussion, but it seems to
have drifted in that direction, so I felt that it was important to
comment here based on my and many others' experiences and attempts to
put this across for all these years.

For research on the topic of alt, see:
http://htmlhelp.com/feature/art3.htm

On Apr 29, 2009, at 2:04 AM, John Foliot - WATS.ca wrote:

Benjamin Hawkes-Lewis wrote:
>> John Foliot wrote:
>>
>> Specifically I point to your verbose @alt values, both in this
>> example as well as examples in the current draft specification.
>> However countless others have noted on numerous occasions that when
>> it comes to images, many (most? all?) non-sighted users wish to
>> choose between a verbose or terse description, similar to 'glancing
>> at an image' vs. 'studying an image'.  This is an important
>> distinction which seems to be lost here.

Whether or not the survey is an accurate barometer of the question at
hand I will concede, although it is *some* data, and is more than is
being offered for the counter argument for verbose @alt text.  However
the main point I am attempting to make here is that often an overly
verbose @alt value is detrimental to the end user.  Consider:

...alt="The White House"
...alt="Photo of the White House"
...alt="A Black and White photo of The White House in Washington DC
taken by Ian Hickson on February 31st, 2010. It shows the West Wing of
the White House taken at approximately 7:30 PM and you can clearly see
the lights being left on in the various rooms."

Now, in some instances, those particular details might be important to
some users.  Being a sighted user, it will likely be evident to you
that some lights are being left on.  You might take that information in
'en passant' and process that cognitive bit as significant, or not.
You have the 'luxury' of deciding whether to study that photo for every
nuance and detail (which rooms have lights left on?  Whose rooms might
they be?  How many rooms are lit versus how many are not, etc.).  You
as a user (sighted or not) may or may not care about the date and time
of the photo's creation: should it then be part of the @alt text?  (As
an aside, there has been some collateral discussion around including
EXIF data as part of the 'description' data of photos, and how that
might be exposed and processed by user agents - it's an interesting and
useful idea IMHO.)

What I am suggesting, however, is that on first pass this amount of
detail, while certainly useful (and oh, how I wish it were always made
available to the non-sighted), is likely too much.  I am asserting that
alt="Photo of the White House" is likely enough for the majority of
non-sighted users most of the time, and I have asked informally of
others reading this thread to comment - so far none have taken the
opportunity to do so (sadly).

>> While not 'conclusive', WebAIM's survey results (and interpretations)
>> of screen reader users state:
>> "The tendency toward the briefer alternative [text] also increased
>> slightly with screen reader proficiency"
>> [http://webaim.org/projects/screenreadersurvey/#images]
>>
>> Given those findings, I will suggest my 'opinion' is based upon more
>> than my personal preference, and seems to directly contradict your
>> opinion.
>
> John, I doubt the WebAIM survey results are worth citing as evidence
> on this particular topic.
>


If the WebAIM survey is neither precise enough nor broad enough to
drive home this conclusion, then perhaps we need another survey.  If
this is the case, then either my friends at Utah State might run
another survey, or I can sponsor one here at my work.  What other
specific questions do we need to be asking?  (I have some ideas, but am
quite willing to open this up to other suggestions - in fact value
them.)  Thing is, it appears that hard data is hard to come by, which
again leaves us with the even thornier problem of having HTML5's
'opinion' guide the specification, rather than informed decision
making.


>
> I doubt there's an advantage for short text that leaves out useful
> detail when you're just reading through a page lineally. Why summarize
> an "img" element but not a "p" element?

Really?  How do you arrive at this conclusion?  Are you a daily user of
screen reading technology?  Have you asked daily users how much detail
they want for every image?  Do you not think that having a choice
between a brief description and a detailed description is a better
offering than imposing a long-winded description every time?  If you
answer yes, why?

Leaving aside the "thorny question" of the value of WebAIM's survey
results and their validity, what research do you or the HTML5 working
group have that establishes the certainty that a verbose @alt value is
a better solution?


> Most of the time, you're going
> to need to expand both to understand the page, in which case you've
> just given the user more to process.

This is a bold statement.  Most of the time the user will need to
expand an image's description to better understand it?  Please give a
concrete example: I provided 3 examples at the start of this response,
2 short and one verbose.  I wonder aloud how many non-sighted users
would need an expansion on "Photo of the White House"?

I am not disagreeing that there should be a mechanism to expand upon a
visual asset's initial description - hell, I'm still arguing for
@longdesc - but to presume that all non-sighted users should drink from
the fire hose is (Mr. Hickson) frankly very rude - it presumes that you
as the author know better how and what the end user will be doing with
your imagery - that all users are a monolithic mass.  It's a 'we know
best' attitude that pervades many of the accessibility pronouncements
from the WHAT working group.

There has been so much electronic ink spilled over @alt in the past 18+
months that finding previous references and quotes is a monumental
task, however I do recall that in at least one thread, a now blind user
(was it Leonie from Nomensa?) commented that users who gradually became
blind valued a fuller contextual explanation of the image than those
users who have never seen - that it helped the first group better
process information based upon their past experience.  However, it was
also stated that blind-from-birth users did not value the same depth of
detail, as 'visualization' for them is a very different concept.  I
again reference the difference between glancing at a photo versus
studying a photo - they both involve processing visual information, but
under most circumstances sighted users will glance at an image before
deciding to further study an image.  If this is indeed a truism, then
why would a non-sighted user not want to process data in a similar
fashion?


>
> However, I can imagine moving from "img" to "img" trying to find the
> image you want, and that short titles for those images could be useful
> so you didn't need to listen to a full alternative.

A good use case, yes.

> But is "alt" -
> intended for alternatives - really the best attribute to use for such
> short titles? Might the "title" attribute be a better way to provide
> them, or "aria-label" and "aria-labelledby" from ARIA?

Well, let's return to paved cow-paths, shall we?  Most images on the
web today have brief descriptors for @alt, and so both users and
authors have become accustomed to using @alt in this fashion.  Sadly,
there are probably so many documents out there where @title and @alt
are the same value (if for no other reason than to ensure "tooltips" in
the major browsers) that using @title alone in the fashion you suggest
would be a missed opportunity (so let's learn from past mistakes, shall
we?).
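
How often @title merely repeats @alt on a given page can be checked
mechanically.  The following is a rough Python sketch under stated
assumptions: the DuplicateTitleFinder class and the sample page are
hypothetical, and a case-insensitive comparison of trimmed values is
just one possible definition of "the same value".

```python
from html.parser import HTMLParser


class DuplicateTitleFinder(HTMLParser):
    """Records the src of every <img> whose title repeats its alt."""

    def __init__(self):
        super().__init__()
        self.duplicates = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        alt = (attributes.get("alt") or "").strip()
        title = (attributes.get("title") or "").strip()
        # Assumption: "same value" means equal once case is ignored.
        if alt and alt.lower() == title.lower():
            self.duplicates.append(attributes.get("src") or "")


# Hypothetical page fragment, invented for illustration.
page = (
    '<img src="a.jpg" alt="The White House" title="The White House">'
    '<img src="b.jpg" alt="Photo of Jimmy" title="Jimmy at the beach">'
    '<img src="c.jpg" alt="Company logo">'
)
finder = DuplicateTitleFinder()
finder.feed(page)
print(finder.duplicates)
```

Only the first image is reported here, since it is the only one whose
title adds nothing beyond its alt.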

While @longdesc, maligned and abused (or simply under-appreciated), is
being contemplated for dismissal, it is in fact a good mechanism for
the described action/use-case.  ARIA's "aria-describedby" is for the
most part a re-creation of the feature-set, but if a fresh coat of
paint and a new name gets things on track, hell, call it a
"thingamabob-arama" and just start using it.  But to try and re-cast
@alt as a carrier of more verbose data now is IMHO a wrong decision,
and goes counter to the paved cow-path design pattern that the WHAT WG
holds so sacred, and I believe causes as much pain as it does gain.  Is
it worth it?  (Methinks no)

JF
Received on Wednesday, 29 April 2009 12:49:16 UTC