Re: some reflections on @alt usage (and summary of research so far)

On Thu, 21 Aug 2008, Steven Faulkner wrote:
> > 
> >There are two problems with this. One is that the descriptive 
> >identification of the non-text content would almost certainly be 
> >provided anyway, e.g. as the image caption, for all users, and thus 
> >including it in the alt="" attribute would be redundant, leading to 
> >"stuttering" (that is, content repetition).
> 
> Can you provide the research to back up the claim that descriptive 
> identification would almost certainly be provided elsewhere?

Sure.

Using Google Sets to come up with some photo upload sites:

   http://labs.google.com/sets?hl=en&q1=flickr&q2=picasa&q3=photobucket&q4=imageshack&q5=&btn=Large+Set

...I visited a bunch of photo pages, picking a cute image from each site's 
front page:

   Flickr
   http://www.flickr.com/photos/moonflower5/2788138892/

   Smugmug
   http://cmac.smugmug.com/gallery/5363890_2awbk/#334290495_zqEkV

   Photobucket
   http://media.photobucket.com/image/motivational%20posters%20or%20motivation/RukiTemaFan21/Motivational%20Posters/chibi.jpg?o=98

   Picasa Web Albums
   http://picasaweb.google.com/lescagoules/IMELOOS05072006?feat=featured#5228949483608819442

   Zooomr
   http://www.zooomr.com/photos/reveiled/2931890/

   Fotki
   http://public.fotki.com/dwbrant/pets/7-21-2005/img_2890.html

   webshots
   http://outdoors.webshots.com/photo/2849914900101838694PfFJEB

   Share on Ovi
   http://share.ovi.com/media/PangeaDay.film/CHAPLIN.10150

   Fotolog
   http://www.fotolog.com/emilylovesponcho

   glowfoto
   http://www.glowfoto.com/random.php

   zoto
   http://www.zoto.com/site/#USR.vapileh::PAG.detail::c14394d0e212c26fcbd2bba81f1f31d1

In all but one case, the photo had descriptive information elsewhere on 
the page. The exception was glowfoto, where it appears images don't have 
any descriptive information at all other than the user's name (which is on 
the page) and the category or album name (which is not on the page in the 
case of the random.php script, but which appears when you look at photos 
on a per-photo basis instead of randomly).



> The caption for such an image does not necessarily suffice as the text 
> alternative

Indeed, it almost never does.


> much more descriptive information can be provided for disabled users who 
> cannot view the image or have a vision impairment that renders the image 
> blurred.

Indeed. That's required by HTML5 wherever possible.


> Who are you to decide that this information is not worthy of inclusion 
> by mandating that the alt must be of the form {}.

The information is worthy of inclusion. It _must_ be included if at all 
possible. The {...} form (or the omission of alt="" altogether, if we go 
back to that) is only allowed in the case where the tool generating the 
HTML simply doesn't have alternative text available (as, for instance, 
in all the cases listed above).


> In the example below the alt provides descriptive information that may 
> be useful to visually impaired users. It provides information that would 
> be obvious to a sighted user but could not be ascertained by a vision 
> impaired user unless they asked someone to describe it to them. That is 
> the point of a textual alternative, to provide information to a disabled 
> user that cannot access the information themselves.
> 
> <figure>
> <img src="/commons/a/a7/Rorschach1.jpg" alt="a black vertically
> symmetrical shape, which contains 4 small white areas and 9 smaller
> black blotches separated from the main body of the shape.">
> <legend>A black outline of the first of the ten cards
> in the Rorschach inkblot test.</legend>
> </figure>

Is that really what you see when you look at Rorschach1? Personally I see 
a butterfly with moth-eaten wings, and two little claw hands around the 
head. Of course, the whole point of a Rorschach test is that describing it 
supposedly shows insight into your personality, so the very act of writing 
alternative text for a Rorschach inkblot is a bit like collapsing a 
wavefunction.


> > Could you give an example of such a page where one would have an image 
> > that cannot be described when the page is written but where the image 
> > nonetheless has enough associated data that the user would get 
> > confused?
> 
> examples:
> http://www.flickr.com/photos/
> http://www.flickr.com/photos/tags/australia/
> http://www.flickr.com/explore/interesting/7days/

I couldn't find any images on those pages that, per the current HTML5 
text, are allowed to have the alt="{...}" text. (They are all inside links 
and therefore all require some useable alternative text.)

In fact, in all those cases, one wouldn't _need_ to come up with 
descriptive alternative text. On the first, for example, the image caption 
or author, or some of the image's tags, would work fine as alternative 
text, since the image is merely a thumbnail that leads to the image itself.


> What Josh provided was one set of videos, this in no way can be 
> considered a basis for any decisions, the study is in no way 
> 'scientific' or objective

It's a whole heck of a lot more useful than nothing, which is what we have 
backing up the "always require alt even if it means the site has to choose 
between compliance and going out of business" arguments that have been put 
forward here.


> > That is why it is more important to base our decisions on actual 
> > objective research.
> 
> All the decisions appear to be made by you, so the use of 'our' seems 
> incorrect here. where is this 'objective research'??

Well, the tentative decision to use {...} was based on this research:

 * it's objectively verified that there are pages that have images that 
   have no alternative text available where the generators of the HTML 
   are not able to obtain that data. See, for example, all the pages 
   listed at the top of this e-mail.

 * it's objectively verified that requiring alt="" attributes does not 
   lead to image sharing sites requesting alternative text from their 
   users. Evidence: HTML4 requires alt="" attributes, Flickr doesn't 
   require users to enter alternative text.

 * it's objectively verified that such sites are major sites and are an 
   important part of the Web ecosystem. For example, Photobucket is #26 on 
   the Alexa top 500, and Flickr is #39:
   http://www.alexa.com/site/ds/top_sites?ts_mode=global&lang=none
   (Other ranking sites lead to similar conclusions.)

These three points lead to this decision:

=> We should allow sites to be written that include images from users even 
   if the sites do not have suitable replacement text.

Now, to design the mechanism to allow this, several ideas were considered:

 * Omitting alt="" altogether for these cases.

 * Having a special syntax inside alt="" for these cases.

 * Having a new attribute for these cases.
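
To make the three candidates concrete, here is a rough Python sketch of 
the markup a generator might emit under each one. It is only an 
illustration: the function name, the "photo" label inside the braces, and 
the "noalt" attribute name are all hypothetical and are not taken from the 
spec text. Real alternative text, when the tool has it, always wins.

```python
from html import escape


def img_markup(src, alt_text=None, mechanism="omit"):
    """Return an <img> tag; the fallback applies only when alt_text is None."""
    src_attr = 'src="%s"' % escape(src, quote=True)
    if alt_text is not None:
        # Alternative text is available, so it must be included.
        return '<img %s alt="%s">' % (src_attr, escape(alt_text, quote=True))
    if mechanism == "omit":
        # Idea 1: leave the alt attribute out altogether.
        return "<img %s>" % src_attr
    if mechanism == "braces":
        # Idea 2: special syntax inside alt. "photo" is an illustrative label.
        return '<img %s alt="{photo}">' % src_attr
    if mechanism == "attribute":
        # Idea 3: a new attribute. "noalt" is a hypothetical name.
        return "<img %s noalt>" % src_attr
    raise ValueError("unknown mechanism: %r" % mechanism)
```

For instance, img_markup("a.jpg", mechanism="braces") gives 
<img src="a.jpg" alt="{photo}">, while passing alt_text="A cat" gives an 
ordinary alt="A cat" whichever mechanism is selected.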

To decide between these, Philip and I both did studies examining alt="" 
values. Some of the research is described here (sorry for not making it 
available in more convenient forms):

 * http://philip.html5.org/data/alt-in-braces.txt
 * http://damowmow.com/temp/alt-in-braces.txt
 * http://krijnhoetmer.nl/irc-logs/whatwg/20080603#l-65
 * http://krijnhoetmer.nl/irc-logs/whatwg/20080802#l-142
 * http://krijnhoetmer.nl/irc-logs/whatwg/20080803#l-3

It indicates that alt="{...}" is pretty rare today. It also indicates that 
omitting alt="" altogether is pretty common today (despite HTML4 requiring 
it, though that's another story).
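
For anyone wanting to run a similar count on their own pages, a minimal 
sketch follows. This is not the tooling used for the studies linked 
above; it is a stand-in built on Python's standard html.parser, and the 
category names and the "starts with { and ends with }" test are 
assumptions about how one might bucket the values.

```python
from collections import Counter
from html.parser import HTMLParser


class AltSurvey(HTMLParser):
    """Tally <img> elements by the shape of their alt attribute."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        if "alt" not in attr_map:
            self.counts["omitted"] += 1
            return
        # A bare "alt" with no value is reported as None by the parser.
        alt = attr_map["alt"] or ""
        if alt == "":
            self.counts["empty"] += 1
        elif len(alt) >= 2 and alt.startswith("{") and alt.endswith("}"):
            self.counts["braces"] += 1
        else:
            self.counts["text"] += 1


def survey(html_text):
    """Return a dict mapping category name to number of <img> elements."""
    parser = AltSurvey()
    parser.feed(html_text)
    parser.close()
    return dict(parser.counts)
```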

The following arguments impacted the decision, though they have nothing 
beyond logic, anecdotal evidence, or conclusions derived from unpublished 
studies to back them up:

 * Introducing attributes for features that are supposed to be an 
   indicator of a problem (lack of alt text, in this case) isn't good 
   language design, as it gives the problem too much prominence.

 * An attribute would almost certainly be copied around unintentionally 
   by authors, making it at least as unreliable as the special syntax, 
   if not more so.

 * An attribute introduces a whole class of extra conformance errors and 
   complications, such as what to do when it is used with or without the 
   alt="" attribute.

All the above led to what the spec says today, which is the alt="{...}" 
idea.

Since then, however, it has been pointed out that the alt="{...}" idea has 
a problem that wasn't previously considered: it makes life harder for 
systems that _are_ trying to automate alternative text creation, since 
they now have to avoid accidentally producing the {...} syntax. This 
dramatically lowers the attractiveness of the {...} idea, leaving pretty 
much only omitting alt="" for these cases as the least-bad option.
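
The collision is easy to demonstrate. The sketch below assumes the 
{...} form is recognised purely by a leading "{" and a trailing "}", and 
shows the kind of guard an automated captioning tool would then need. Both 
function names and the "append a full stop" workaround are hypothetical, 
just one way such a tool might cope.

```python
def looks_like_placeholder(alt):
    """True if alt would be read as the {...} 'no text available' form."""
    return len(alt) >= 2 and alt.startswith("{") and alt.endswith("}")


def guard_generated_alt(text):
    """Adjust generated text that would otherwise collide with {...}."""
    text = text.strip()
    if looks_like_placeholder(text):
        # Hypothetical workaround: end with a full stop so the value no
        # longer finishes with "}".
        return text + "."
    return text
```

A perfectly genuine generated description such as 
"{x} plotted against {y}" trips the check, which is exactly the burden 
being placed on such tools.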

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Friday, 22 August 2008 22:45:47 UTC