Re: Investigating the proposed alt attribute recommendations in HTML 5 from Charles McCathieNevile on 2007-09-13 (wai-xtech@w3.org from September 2007)

From: Charles McCathieNevile <chaals@opera.com>
Date: Thu, 13 Sep 2007 20:24:31 +0200
To: "Henri Sivonen" <hsivonen@iki.fi>
Cc: "Steven Faulkner" <faulkner.steve@gmail.com>, HTMLWG <public-html@w3.org>, wai-xtech@w3.org
Message-ID: <op.tyluq5juwxe0ny@widsith.local>
On Tue, 11 Sep 2007 12:21:03 +0200, Henri Sivonen <hsivonen@iki.fi> wrote:

> On Sep 11, 2007, at 12:16, Charles McCathieNevile wrote:
>
>> I think the situation today is similar to that of seven years ago,  
>> where "we" were advocating for alt as a required attribute, but at the  
>> same time insisting, for accessibility reasons, that authoring tools  
>> don't just auto-generate it.
>
> Any HTML spec that makes unattended markup generators non-conforming  
> will be disrespected in one way or another, because people want to write  
> unattended generators.

This is true. I would rather have non-conformant documents that are more  
useful, than conformant documents that are less useful. Unattended  
generation will, I believe, generally lead to awful alternative content.  
Involving a human, while it will sometimes be ignored or lead to bad  
results, will lead to an improvement - the point where you can say that a  
person claimed something, and hopefully claims that are more right than  
wrong. (We are not solving the problem here - we are talking about the  
relative merits of partial solutions).

>> The problem doesn't really come down to JAWS, but to authoring  
>> tools/environments.
>
> I don't think JAWS can be excused. Reading out the file name without  
> subjecting it to a heuristic readability test first can lead to awful  
> usability as demonstrated by Steven Faulkner's testing. File names  
> aren't intended to be read out loud by AT in the general case. Reading  
> them is a placeholder generation heuristic. Failing to check first that  
> the string to be read consists of something readable such as words in a  
> dictionary or even sequences of letters that are long enough to look  
> like words (as opposed to just punctuation and digits where a-f counts  
> as potential hex digits) is just awful.
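
The readability test described in the quoted paragraph could look roughly
like the sketch below. It is only an illustration under stated assumptions:
the function name looks_readable, the four-letter threshold and the sample
file names are hypothetical, and it does not reflect what JAWS or any other
AT actually does. It skips tokens that mix digits with a-f (likely hex
identifiers) and accepts a name only if some run of letters is long enough
to look like a word. No dictionary lookup is attempted.

```python
import re


def looks_readable(filename, min_letters=4):
    """Guess whether a file name is worth reading aloud.

    Hypothetical heuristic: min_letters is an arbitrary threshold,
    not a value taken from any real screen reader.
    """
    # Drop the extension, if any ("photo.jpg" -> "photo").
    stem = filename.rsplit(".", 1)[0]

    # Split on punctuation and whitespace, keeping letter/digit tokens.
    for token in re.split(r"[^A-Za-z0-9]+", stem):
        if not token:
            continue

        # A token containing digits and otherwise only a-f is treated
        # as a potential hex identifier, not as a word.
        has_digit = any(ch.isdigit() for ch in token)
        if has_digit and re.fullmatch(r"[0-9A-Fa-f]+", token):
            continue

        # Accept if any run of letters is long enough to look like a word.
        runs = re.findall(r"[A-Za-z]+", token)
        if any(len(run) >= min_letters for run in runs):
            return True

    return False


if __name__ == "__main__":
    for name in ["sunset-over-lake.jpg", "IMG_2041.JPG", "3fa9c2d1.png"]:
        print(name, looks_readable(name))
```

A threshold of four letters rejects camera prefixes such as IMG or DSC, but
it also rejects short real words such as "cat", so a real implementation
would presumably need a dictionary check as well.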

Well, it depends what you are comparing it to.

>> Flickr deciding to put null alt, or something that had the image's tags  
>> in it, would be a half step forward.
>
> On the photo page, at least, that would duplicate content.

Null alt would not. Given that the image has tags, and more so with a  
description, and the image's page has a title (which might be a  
meaningless identifier, but you can bookmark it all the same) maybe null  
alt is reasonable in this case. Being able to describe it in a way people  
don't have to read would be nice - not everyone would, but sometimes  
people would, and that would be an improvement for users.

>> Flickr, as a photo album, is an odd case - if you get a description  
>> then it generally makes sense not to have more text in an image.
>
> As I understand it, the whole point of keeping mentioning Flickr is that  
> from the point of view of Web development in general, it isn't odd but a  
> showcase Web 2.0 app.

I don't think that's relevant. Flickr is mentioned because, unlike on most  
pages, the image is central to the meaning of a Flickr page. There are  
plenty of Web 1.0 tools that do the same thing - and the same issues apply.

> Conformance requirements should be inclusive enough to allow common apps  
> to be written as conforming.

Indeed. But conformance requirements shouldn't be written just so that any  
common app is considered conformant, or conformance becomes meaningless,  
since it no longer suggests there is any user value in conforming.

>> But where a CMS or authoring tool is used, it *should* have the ability  
>> to find out something about the image, or draw something only  
>> moderately bad, or propose the descriptions that have been used before.
>
> My understanding is that the fundamental assumption behind leaving the  
> alternative text generation heuristic to the client side is that there  
> is a handful of aural Web clients but a multitude of software generating  
> HTML and it makes more sense to solve the problem a handful of times as  
> opposed to a multitude of times. Moreover, developers of aural browsing  
> software are expected to be in a better position to have the expertise  
> and incentive to develop heuristics that meet the needs of their users.
>
> Steven Faulkner's testing shows that this assumption fails miserably  
> when it comes to the current state of JAWS. The assumption may have to  
> be revised if there's a fundamental reason why JAWS and others will  
> continue to fail to provide better heuristics on the client side to such  
> extent that even hasty non-expert server-side concoctions are likely to  
> be better throughout the expected lifetime of the spec.

Client-side heuristics are likely to be dreadful forever. Hasty non-expert  
server-side concoctions are likely to be just pretty bad most of the time.  
And involving the people creating the content is likely to lead to stuff  
that is patchy - some pretty bad, most not very good, and some really good.

> One way of revising the assumption would be acknowledging that the  
> client side heuristic isn't a point of competition for the clients and  
> writing a de jure algorithm in the spec saving the client developers the  
> trouble of designing one.

I don't think there are good algorithms that can be implemented for  
unattended generation. There are good ways of reusing information that is  
available about a given image, and there are ways of asking authors what  
an image does. (There is a special exception for navigation icons, where  
there are other heuristics that can be hacked together authoring/server  
side to find intelligent things to say).

>> Without active participation from the author, all heuristics are pretty  
>> awful.
>
> But the whole point of allowing alt to be omitted is to cover the case  
> of what tool developers are to do when the author just isn't  
> participating. And insisting that a human should always participate in  
> the generation step of HTML documents just isn't going to work when it  
> is so clear that people want to have unattended software generating  
> stuff.

When humans don't participate, you generally end up with rubbish results  
(for some better or worse version of rubbish). Anything that doesn't  
recognise that is fighting reality. Recognising it won't change it, or  
change the reasons why people do it. But it will at least make it clear  
when you can expect things to be always awful, and when you can expect  
them to rise up to sometimes pretty reasonable.

>> I would expect more authoring systems to have developed the ability for  
>> J. Adminassistant to provide something half-useful when adding an  
>> image, and more authors to realise that they *can* do this easily  
>> enough and just remember.
>
> It may well work for some authoring use cases, but not for all authoring  
> use cases, which means that such an expectation shouldn't be baked into  
> the notion of document conformance.

Of course.

> I upload so many photos to Flickr (5414 photos uploaded in the last 14  
> months), that I don't take the time to type nice titles for my sighted  
> friends to whom I advertise my Flickr sets. I think expecting me to bear  
> the opportunity cost of authoring even more elaborate additional data  
> (something that my camera doesn't output) for people to whom I  
> don't advertise my relatively inane vacation photo sets is  
> unrealistic, even though I fully agree that it would be great for a  
> blind user who comes across my photo sets to have alternative text for  
> the photos.

Consider the other case - Gregory's album, which, since he can't see it,  
had to be described by others. How do you know, when you get there, if  
there are real descriptions or just snide comments about other stuff? And  
currently, the answer is that you don't until you read the entire thing  
and then guess by context.

The idea is to improve on that, as well as on what can be done with your  
flickr photos, in a way that is most useful as much of the time as  
possible. I suspect that adds up to the minority of the web in 2015 too,  
but 40% is still better than 4%...

cheers

chaals

-- 
   Charles McCathieNevile, Opera Software: Standards Group
   hablo español  -  je parle français  -  jeg lærer norsk
chaals@opera.com   http://snapshot.opera.com - Kestrel (9.5α1)
Received on Thursday, 13 September 2007 18:24:45 UTC