Re: some reflections on @alt usage from Ian Hickson on 2008-08-20 (wai-xtech@w3.org from August 2008)

From: Ian Hickson <ian@hixie.ch>
Date: Wed, 20 Aug 2008 22:45:05 +0000 (UTC)
To: Al Gilman <Alfred.S.Gilman@ieee.org>
Cc: W3C WAI-XTECH <wai-xtech@w3.org>
Message-ID: <Pine.LNX.4.62.0808201812240.19930@hixie.dreamhostps.com>

On Wed, 20 Aug 2008, Al Gilman wrote:
> On 19 Aug 2008, at 5:35 PM, Ian Hickson wrote:
> > On Tue, 19 Aug 2008, Al Gilman wrote:
> > 
> > > In the Rorschach test WCAG spells out what you should provide as a 
> > > text alternative.
> > 
> > Could you elaborate? I haven't been able to find where in the WCAG 
> > documents it is made clear what could be said that would actually help 
> > accessibility here.
>
># [...] text alternatives at least provide descriptive identification of 
># the non-text content.

There are two problems with this. One is that the descriptive 
identification of the non-text content would almost certainly be provided 
anyway, e.g. as the image caption, for all users, and thus including it in 
the alt="" attribute would be redundant, leading to "stuttering" (that is, 
content repetition). What does WCAG say the alternative text should be 
when the image is a key image, such as a Rorschach test, but the 
descriptive identification is already provided elsewhere?

Secondly, a descriptive identification of an image is _not_ a textual 
replacement of the image.

The following:

   <p>What do you see in the following image?</p>
   <p><img src=test.jpeg alt="A colour blindness test."></p>

...is semantically equivalent, for users without images, to the following:

   <p>What do you see in the following image?</p>
   <p>A colour blindness test.</p>

And that is not at all equivalent to the original. The point of the image 
here is the image itself; the text "A colour blindness test" is not a 
replacement, it's not equivalent, and treating it as such makes the page 
highly confusing and even more inaccessible to users than necessary.

In cases like these we need a mechanism that explicitly says "I know this 
is an image, don't replace the image with some text and pretend the image 
wasn't there".


> > > > > 2. Don't say that this markup advice is for *important* images 
> > > > > where you don't know what to provide as a text alternate.  The 
> > > > > 'important' restriction is not appropriate.  The same markup 
> > > > > approach should apply for unimportant images where you don't 
> > > > > know that they are unimportant.
> > > > 
> > > > Could you provide an example? When would there be an unimportant 
> > > > image for which alternative text is required (i.e. it's not 
> > > > decorative) and for which the alternative text isn't available?
> > > 
> > > In the batch-upload scenario, the site wrapping the uploaded photos 
> > > doesn't know which files are key moments from the vacation and which 
> > > are useless blurs containing only a fuzzy swipe of the user's foot.  
> > > You don't know it's important until you know what is in the image 
> > > and what it contributes to the story it is embedded in.
> > 
> > In the context of an HTML document, though, those pages are 
> > objectively important regardless of whether they may be unimportant in 
> > the context of the wider photoset. They are the "content" of the page, 
> > which is important by definition.
> 
> No.  There is no objective importance.  Importance is for the user to 
> revise as they browse the page.

I disagree. Regardless of whether the photo is in focus or not, anyone 
going to a photo's flickr page is always intending to look at that page's 
photo. There's never a case where Flickr would remove the photo from its 
own photo page.


> > > > > if we are going to try to address this as a common case, 
> > > > > unknown-to-be-decorative images should be included and not just 
> > > > > other unknown-what-to-say images thought to be 'important.'
> > > > 
> > > > How can the author not know if the image is decorative or not?
> > > 
> > > See discussion of photo-upload scenario above.
> > 
> > None of the photos are decorative in that example, and you know this 
> > ahead of time, before the photos have even been uploaded.
> 
> No, the photos were bulk uploaded.  They are all the images the camera 
> retained.  You don't know how important they are because the person who 
> shot them has not reviewed them before they went into the online system.

You may not know how important the user considers them, but you certainly 
_do_ know that in the context of the Flickr page, none of them are 
decorative.


> When they haven't looked at the image, only pressed the shutter button 
> and uploaded the camera contents after three weeks of pressing the 
> shutter button.  It is the bulk-upload case.  It's your use case.
> 
> The 'not possible' case has been justified on the basis of "we have to 
> tell the site pasting the user's undocumented image into a pro-forma 
> page what to do with the attribute."  This is what I mean by you are 
> splitting the author.  The original author is the vacationer pressing 
> the shutter button.  But they don't look at the image or add text to it.  
> The site puts it into some HTML pages.
> 
> Do these have to be conforming?  To what?

Conformance is how we say what is allowed and what isn't. We want to allow 
people to upload their images to the Web without looking at them. 
Therefore it has to be possible to mark up such a page in a conforming 
manner. That's just what conformance is.


> > > > > And make it clear that the "human didn't bother" case is 
> > > > > included.
> > > > 
> > > > According to HTML5, if the human didn't bother, the page isn't 
> > > > compliant.
> > > 
> > > This statement is at variance with the attempt to cover the photo 
> > > upload case. I don't agree with this interpretation of the draft as 
> > > posted.
> > 
> > The spec says:
> > 
> > # When it is possible for alternative text to be provided, for example if
> > # the image is part of a series of screenshots in a magazine review, or
> > # part of a comic strip, or is a photograph in a blog entry about that
> > # photograph, text that can serve as a substitute for the image
> > # must be given as the contents of the alt attribute.
> >  -- http://www.whatwg.org/specs/web-apps/current-work/#a-key
> > 
> > That seems pretty cut and dried to me.
> 
> It has a gratuitous "when it is possible..."  So far all the 'not 
> possible' examples offered are either contradicted by what WCAG says 
> should be in the text alternative, or a closet 'didn't bother' as when 
> you say "We have to tell the site how to format it when the uploading 
> user didn't provide anything."
>
> Don't get me wrong.  My personal opinion is that we should still look at 
> what is the best non-conforming-to-WCAG thing to do in repair cases such 
> as the bulk upload.  And there is a live question as to whether this 
> belongs in HTML specification or WCAG techniques.  But we should take it 
> on.  However with a note that this is a "close, but no cigar" situation 
> as far as meeting accessibility standards.

HTML5 will define how authors are to mark up their HTML5 pages. We're not 
talking about how to conform to accessibility standards, it's a given that 
if you haven't described the photo textually then there is no way for a 
text-only user to get an "accessible" experience. We're talking about what 
HTML5 should require of such authors.


So far there have been three proposals for indicating that an image is 
important but has no replacement text available:

 * Saying that the alt="" attribute being omitted is that indication. This 
   causes problems because so many pages omit the alt="" attribute on many 
   unimportant images that user agents will likely want to use heuristics 
   on those images instead of always honouring the new semantic.

 * Saying that special syntax in the alt="" attribute is the indication. 
   This has the problem that in some cases, people will use the syntax 
   unintentionally (e.g. with the {} proposal and LaTeX alt text).

 * Introducing a new attribute just for this. This has a number of 
   problems: introducing attributes for such rare use cases isn't good 
   language design; the attribute would almost certainly be copied around 
   unintentionally by authors leading to it being at least as unreliable 
   as the special syntax if not more; it introduces a whole class of 
   extra conformance errors and complications, such as what to do when it 
   is used with or without the alt="" attribute; and, possibly most 
   important, nobody has come up with an obviously clear and unambiguous 
   name for this attribute.

I'm leaning towards going back to the first of these.
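
To make the comparison concrete, here is a rough sketch of how each of the 
three might look in markup. The file name is made up, the second example 
assumes that the {} proposal means wrapping the text in braces, and the 
attribute name in the third example is purely hypothetical, since no name 
has been agreed on:

   <!-- 1. alt="" omitted: an important image with no replacement text -->
   <p><img src=photo.jpeg></p>

   <!-- 2. special syntax inside alt="" (assumed form of the {} proposal) -->
   <p><img src=photo.jpeg alt="{photo}"></p>

   <!-- 3. a new attribute; "noalt" is only a placeholder name -->
   <p><img src=photo.jpeg noalt></p>

Under that assumption about the brace syntax, legitimate LaTeX alt text 
such as alt="{a \over b}" would be hard to tell apart from the second 
form, which is the kind of accidental use I mean.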


> > Could you cite the part of the spec that says that "not bothering" is 
> > a valid reason to not include alternative text? If there is anything 
> > that can be read that way, it should be fixed.
> 
> "When it is possible..."

Whether it is possible to do something is unrelated to whether the site 
author could be bothered to do it.


> > > > > In particular, most accessibility experts will not agree that 
> > > > > the photo upload use case is one where the authoring tool could 
> > > > > not come up with something that is better than nothing.
> > > > 
> > > > While this is clearly true (people calling themselves 
> > > > accessibility experts have stated they do not agree that a site 
> > > > accepting uploaded photos may not know what the image represents), 
> > > > I do not intend to pander to accessibility experts. My goal is to 
> > > > make the spec actually improve accessibility.
> > > 
> > > in your own judgement, it would sound like. Your judgement in these 
> > > matters would be more accurate if you listened more attentively to 
> > > the institution of the WAI.
> > 
> > With all due respect, I would rather base my judgements on verifiable 
> > research and on logical arguments than on blanket assertions that seem 
> > counter-intuitive.
> 
> Well, the assertion that the un-screened image is 'objectively important 
> in the context of HTML' while in fact its importance is unknown in human 
> terms is more 'blanket assertion that seems counter-intuitive' than 
> 'verifiable research' or 'logical argument.'

It seems pretty logical to me that in the context of a page designed to 
showcase an image, the image in question is important.


> > > You seem to be assuming that the user can associate this information 
> > > correctly with the image.  The world in which the speech-only user 
> > > experiences a web page is smaller than that.  If the relationship of 
> > > the "also on the page" text to the image is machine-interpretable, 
> > > as for example 'legend' on 'figure' then the AT can help the user 
> > > make the association.  In the absence of such a formal relationship, 
> > > the redundancy can be more positive than negative.
> > 
> > This is so diametrically opposed to my own experiences that I would 
> > need significantly more evidence of this to be convinced of it. Do you 
> > have any research you could show to demonstrate this remarkable 
> > assertion?
> 
> This sounds as though your intuition is conditioned by your own 
> experience, and not from listening to blind or dyslexic users, or the 
> teachers of severely learning disabled people.
> 
> Ask Joshue <joshue.oconnor@cfit.ie>.  He has spent far more time with 
> PWD using computers than you or I.

Joshue has provided extremely useful usability studies that we have used 
to help guide the design of HTML5. However, with all due respect to 
Joshue, sometimes even his opinions contradict the evidence he provides. 
That is why it is more important to base our decisions on actual objective 
research. (I say this without meaning any ill will to Joshue at all, I am 
very thankful for all the research he has done for us so far.)

Do you have or know of any research you could show to demonstrate this 
remarkable assertion that duplicating information improves accessibility? 
Asking me to speak to other experts is still an argument based on 
authority and not an argument based on research. Surely for such a major 
finding, there has been extensive documented research.


> > > Something that "appears somewhere else on the page" doesn't meet the 
> > > technical requirements for a text alternative, as the user's working 
> > > memory of what is on the page is limited.
> > 
> > I think this dramatically undersells the user. If a page contains a 
> > title, and a comment, and an undescribed image categorised as a photo, 
> > users would have no trouble associating the description comments and 
> > the title with the photo.
> 
> Yes, but the rule is not limited to such simple pages.
> 
> It would apply to an image that appears in a typically cluttered start 
> page.  And it is for that latter worst case that we should frame the 
> rule.

Could you give an example of such a page where one would have an image 
that cannot be described when the page is written but where the image 
nonetheless has enough associated data that the user would get confused?

The typical Flickr page is a good counterexample, since everything on the 
page is basically about this one image. Similarly, a start page with a 
Webcam would be a counterexample because it is highly unlikely that the 
Webcam image will have much metadata in parts of the page far away from 
the image itself.

A concrete example would help demonstrate this case.


> > > My personal preference would be to get that info in the DOM in formally
> > > specified relationship to the image, and then see what makes sense to
> > > put with the image itself that is terse, and serves two functions:
> > > inform the user about the image and distinguish this image from others.
> > 
> > Merely having the image on the page links the information to the image.
> > Why is this not sufficient? I do not understand what user interface you
> > are imagining that would make this more usable than what we have now.
> > Could you elaborate? Why would a sighted user have less trouble
> > associating disparate information in a Web page with an image on the Web
> > page than a user without image support?
> 
> [snip description of screen readers with which I am familiar]

This doesn't answer the question. Extrapolating from what you refer to, I 
assume you are proposing that one could, while an image has been selected, 
press some key and have the screen reader offer links to other parts of 
the page, or even just read other parts of the page. But it's not clear 
that this is going to help. In the Flickr page example, for instance, the 
user already knows full well that the page is just a photo's page, and 
that all the information on the page is about the photo. He doesn't have 
to find the image to ask the screen reader to read the rest of the page.
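
As a sketch of the kind of page I have in mind (the structure, file name, 
and text are invented for illustration and are not taken from Flickr's 
actual markup; the image is shown with alt="" omitted, as in the first 
proposal above):

   <h1>Sunset over the bay</h1>
   <p><img src=photo.jpeg></p>
   <p>Uploaded on 19 August 2008.</p>
   <h2>Comments</h2>
   <p>Lovely colours!</p>

Everything on such a page, from the title to the comments, is about the 
one image, so reading the page from top to bottom already gives the user 
all the text that is associated with it.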

Indeed, I would posit that in any case where an image represents something 
for which there is no suitable replacement text but for which there is 
additional information available elsewhere on the page, the image will be 
the only such image on the page, and the page will be primarily about that 
image. Are there cases that contradict this?

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Wednesday, 20 August 2008 22:45:30 UTC