alt and authoring practices

Anne van Kesteren wrote:

> Given that authors make mistakes there are nine
> possibilities of authoring images:

>   1. <img alt="..."> - available -> Correct usage
>   2. <img alt=""> - available -> Incorrect usage
>   3. <img> - available -> Incorrect usage
>   4. <img alt="..."> - missing -> Incorrect usage
>   5. <img alt=""> - missing -> Incorrect usage
>   6. <img> - missing -> Correct usage
>   7. <img alt="..."> - empty -> Incorrect usage
>   8. <img alt=""> - empty -> Correct usage
>   9. <img> - empty -> Incorrect usage

> It seems your assumption is that on average 9
> is more common than 3 and 6 ... It seems the
> assumption from the editor is that on average
> all incorrect usage is about as likely

We know they aren't equally likely, but even if they were, likelihood
alone wouldn't settle the question; we also have to consider the cost
of being wrong.
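
As a rough sketch of why likelihood alone doesn't decide things, one
can weight each case by what it costs when it happens.  The function
name and every number below are placeholders invented for
illustration, not measurements; the only point is that two equally
likely errors can still contribute very differently.

```python
def expected_cost(likelihood, cost):
    """Sum likelihood * cost over every case in the likelihood mapping."""
    return sum(likelihood[case] * cost[case] for case in likelihood)

# Placeholder numbers, purely illustrative: two errors assumed to be
# equally likely but not equally harmful.
likelihood = {2: 0.5, 9: 0.5}
cost = {2: 10.0, 9: 1.0}  # assumption: case 2 hides real information, case 9 hides none

# Contribution of each case to the total.
per_case = {case: likelihood[case] * cost[case] for case in likelihood}
```

With these made-up inputs, case 2 dominates the total even though it
is no more likely than case 9.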

One of the baseline assumptions is that cases 4-6 (image important,
but no alt available) should be quite rare, at least in content that
cares about validity.  In practice, they aren't as rare as they should
be.

I don't have exact statistics on how likely an image is to be
decorative, but the vast majority of images on most pages that I have
checked certainly are.  (It is less true of large images near the
center of the page.)

Going in the other direction, it is easy to get a bogus "" or a bogus
missing attribute, by just not doing anything.  It is much harder to
get a bogus filled-in attribute.  One of the biggest arguments against
magic tokens is that some tools might make those tokens the default,
so that they would show up when they shouldn't.  (And legacy tools
would still need to learn about the magic values, so there would be
cost regardless of whether or not there was benefit.)

So to recategorize your nine cases:

                 real alt   alt=""   omitted alt
information         1         2          3
not available       4         5          6
decorative          7         8          9
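
The same grid can be written down as a small lookup, which makes the
numbering easy to check.  This is only a sketch; the names ROWS, COLS,
markup_column and case_number are made up for this example, and None
stands for an omitted alt attribute.

```python
# Rows: what is actually true about the image's alternate text.
ROWS = ("information", "not available", "decorative")
# Columns: what ended up in the markup.
COLS = ("real alt", 'alt=""', "omitted alt")

# Cases marked "Correct usage" in the quoted list.
CORRECT = {1, 6, 8}

def markup_column(alt):
    """Map an alt value (None means the attribute is omitted) to a column."""
    if alt is None:
        return "omitted alt"
    if alt == "":
        return 'alt=""'
    return "real alt"

def case_number(row, alt):
    """Return the case number (1-9) for a row name and an alt value."""
    return ROWS.index(row) * 3 + COLS.index(markup_column(alt)) + 1
```

For example, case_number("decorative", "") gives 8 and
case_number("information", "") gives 2, even though the markup is
identical in both.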

Case 1 is proper alt usage.  It isn't as common as we would like, but
it is the main goal of alt.

Cases 2 and 3 (image important, but alt omitted or empty) are
unfortunately still pretty common.  And since missing the data here
matters, assistive technology (AT) goes to heroic lengths to recover.
Unless it thinks the image was actually a (correct) case 6 or 8...

Using an explicit _notsupplied as the authoring tool's default would
let the tool say "not my fault", and tell users that it isn't a
(correct and common) case 8.

Case 4 is bad, but unlikely to happen by chance.

When it happens because of cut-and-paste, there really isn't anything
we could do to prevent it.

When it happens because of a misguided tool, then allowing an explicit
_notsupplied (so the tool can say "not my fault") may help prevent the
problem.

Case 5 is also rare, but only because the lack of information is rare.
Note that the wrongness comes from tools defaulting to alt="" so that
they output valid (if wrong) HTML.  The current draft suggests omitting
the alt; it would probably be more useful (in the long run) to let
tools default to _notsupplied, so that the images can be flagged for
attention (instead of falsely assuming case 8).

>   6. <img> - missing -> Correct usage

(Though not correct under the HTML 4 spec.)

This is sufficiently rare that, in practice, AT rightly assumes case 3
or 9, and then has to guess which error was made (and weigh the cost
of guessing wrong) when deciding how heroic to be about finding a
replacement on its own.

Case 7 is similar to case 4, but a bit more likely in practice --
authoring tools may insert alt text as part of the skins, which can
lead to doubled words.  Having an explicit _decorative option would at
least reduce the temptation to write alt text like "red bullet".

Case 8 is correct -- but looks exactly like case 2.  Case 8 is more
common, but case 2 is more important.  Allowing _decorative would make
it explicit that this is case 8.

Case 9 is wrong, but defensible.  If there isn't information, why
supply an alt?  alt=_decorative is explicit enough to be worth
writing.
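
To make the difference concrete, here is a minimal sketch of what a
consumer such as AT could do with each alt value if the two tokens
existed.  The tokens are only proposals from this thread and are not
part of any spec; interpret_alt and its return strings are
hypothetical names chosen for the example.

```python
# Hypothetical token values proposed in this thread; not in any spec.
NOT_SUPPLIED = "_notsupplied"
DECORATIVE = "_decorative"

def interpret_alt(alt):
    """Return what a consumer could do with an img's alt value.

    alt is None when the attribute is omitted.
    """
    if alt is None:
        return "guess"    # case 3, 6 or 9: the markup cannot say which
    if alt == "":
        return "guess"    # case 2, 5 or 8: the markup cannot say which
    if alt == DECORATIVE:
        return "skip"     # author says the image carries no information
    if alt == NOT_SUPPLIED:
        return "recover"  # author says text is needed but was not provided
    return "read"         # ordinary alt text
```

Without the tokens, both the empty and the omitted forms leave the
consumer guessing; with them, "skip" and "recover" are stated
outright.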

-jJ

Received on Wednesday, 16 April 2008 20:25:46 UTC