[whatwg] A revamp of the alt="" attribute on <img> elements

There was a lot of feedback regarding alt="" on <img> elements. I've taken 
it all into account and revamped the relevant section of the spec:

   http://www.whatwg.org/specs/web-apps/current-work/multipage/section-embedded.html#the-img

Don't all scream at once; there are some controversial changes in there 
but I think the overall effect improves matters for the imageless. I 
couldn't make everyone happy, so there are bound to be things you disagree 
with. Read on for the reasoning in response to the e-mails that had been 
sent on this issue over the last two and a half years.

On Wed, 5 Jan 2005, Matthew Raymond wrote:
>
> I would like to see the |alt| attribute for elements <img> and <area> 
> changed from a required attribute to an implied one. In other words, 
> markup that looks like this...
> 
> <img src="image.png" alt="">
> 
> ...should look like this...
> 
> <img src="image.png">

The spec as it stands now makes a distinction between alt="" and no alt. 
The former is a decorative image; the latter is an image that is missing 
alternative text (either because it's a server-generated page and the 
server has no idea what the alt text should be, or because the image 
really is a core part of the page and the amount of alt text that would be 
needed to convey the same as the image is far more than the author will be 
willing to write).

I think that's more useful than simply implying alt="". What do you think?
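
To make the three cases concrete, here is a minimal Python sketch of how a 
tool could tell them apart. The class name, function name and category 
labels are hypothetical, chosen only for illustration; this is not an 
algorithm taken from the spec.

```python
from html.parser import HTMLParser


class AltClassifier(HTMLParser):
    """Collects (src, category) pairs for every <img> start tag."""

    def __init__(self):
        super().__init__()
        self.results = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        src = attr_map.get("src")
        if "alt" not in attr_map:
            # No alt attribute at all: alternative text is missing.
            category = "missing"
        elif not attr_map["alt"]:
            # alt="" (or a bare alt attribute): decorative image.
            category = "decorative"
        else:
            # Non-empty alt: the author supplied replacement text.
            category = "has-text"
        self.results.append((src, category))


def classify_images(markup):
    """Return a list of (src, category) tuples for the <img> tags in markup."""
    parser = AltClassifier()
    parser.feed(markup)
    parser.close()
    return parser.results
```

Feeding it markup containing one image with alt="", one with no alt and 
one with real alt text yields one entry per category. Treating a missing 
attribute as alt="" would collapse the first two into one.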


On Thu, 6 Jan 2005, Matthew Thomas wrote:
> 
> For perhaps 95 percent of the images on the Web, the most appropriate 
> alternate text is nothing at all. (In 2003 I did a survey of images in 
> Wikipedia articles, where images aren't even used for decoration, and 
> still found that alt="" would be the most sensible choice for 77 percent 
> of them.) So for that 95 percent, assuming that no alt is alt="" would 
> improve the user experience.
>
> Unfortunately, the other 5 percent would ruin the idea. When 
> screenreaders are wading through inaccessibly-written pages, sometimes 
> images are used for navigation (graphical menus, for example), so the 
> user needs an indication that an image is there (whereupon they can 
> guess its function by its URI). Assuming that all these images had 
> alt="" would make such pages completely unnavigable.

I tried to make the spec convey this in the requirements on alt text.


On Thu, 19 Jan 2006, Matthew Paul Thomas wrote:
>
> For a while I've been honing a clearer definition of the alt= attribute, 
> one that tries to curtail the worst misuses of the attribute without 
> being horribly wordy. Since alt= is not yet defined in the Web 
> Applications 1.0 draft, my text may be useful.

I ended up with a far more wordy version. Let me know if you think it's 
too wordy.

I think verbosity here is something worth trying. We've tried being brief 
before (e.g. HTML4) and it hasn't worked.


> In HTML 4 alt= is an attribute for <img>, <applet>, and <input>. I can think
> of no reason for <input alt= to exist

It's for <input type=image>.


> <p>Since the two representations are alternate, not supplementary, a 
> user agent should render either an image or its alternate text but not 
> both at once. For example, a <abbr>UA</abbr> should not show alternate 
> text in a tooltip. Authors who wish to provide supplementary text for an 
> image may use the <code>title</code> attribute instead.</p>

I've noted that we'll have to say this in the rendering section.


> <p>Specifying alternate text helps readers without graphic display 
> terminals, visually impaired people, others who prefer listening to 
> documents rather than viewing them, people viewing documents offline 
> when an image is not available, and so on. To produce sensible alternate 
> text, authors should follow these guidelines:</p>
> 
> <ul>
> <li><p>Do not describe the image. For example, do not write <code class="bad
> example">&lt;img src="logo.png" alt="ExampleCorp logo"&gt;</code>, or <code
> class="bad example">&lt;img src="logo.png" alt="logo.png (3890
> bytes)"&gt;</code>. Instead, write text that fulfils the image&rsquo;s
> purpose; for example, <code class="good example">&lt;h1&gt;&lt;img
> src="logo.png" alt="Welcome to ExampleCorp"&gt;&lt;/h1&gt;</code>. A
> description is appropriate only if the image itself is discussed but not
> elsewhere described in the document. For example: <code class="good
> example">&lt;p&gt;I managed to snap a photo of the animal. &lt;img
> src="animal.jpg" class="photo" alt="It's a bit blurry, but it shows a large
> brown creature running through the forest."&gt; At last, evidence of the
> moa!&lt;/p&gt;</code></p></li>
> 
> <li><p>Do not provide alternate text for an image when it is used for
> formatting, decoration, illustration, or linking to a solely graphical
> resource. Instead, use <code>alt=""</code>. For example, a portrait of someone
> should usually have <code>alt=""</code>, unless either their physical
> appearance or the artwork itself is highly relevant and not described
> elsewhere in the document.</p></li>
> </ul>

The above is basically covered by the new text. Let me know if there's 
anything you think we should add.


On Thu, 19 Jan 2006, Alexey Feldgendler wrote:
> 
> I wonder why alt is a required attribute for IMG in HTML while an empty 
> value is allowed.

The new spec distinguishes between these two cases, as described above.


> There are several arguments for making it optional:
>
> 1. Many authors still don't specify alt or specify alt="" just to make 
> the page validate. There's not much sense in requiring an alt when there 
> is a way to not specify it (alt=""), though it is a spec violation.

Note that although a validator tool might not catch it, giving the wrong 
alt text is now a violation of a MUST requirement in the spec.


> 2. Empty attributes aren't very XPath friendly (actually, XPath isn't 
> well equipped to work with empty attributes).

That wasn't a concern. :-)


> 3. If other elements, such as APPLET, also get the alt attribute, it 
> would have to be optional to maintain backward compatibility. It would 
> be inconsistent to require alt for IMG and have it optional for APPLET.

<applet> is gone, in favour of <embed> and <object>. (There doesn't seem 
to be any reason to have an element specific to Java when there's no 
element specific to Silverlight or any of the other proprietary or 
semi-proprietary languages that you can use on the Web.)


On Thu, 19 Jan 2006, Anne van Kesteren wrote:
> 
> Because an empty value means that there is no alternate text and no 
> attribute at all means that alternate text is missing. (Which is clearly 
> not what you want.)

That's basically what we're at now, yes, though the spec now makes 
allowances for having the text be missing so as not to force people to 
include bogus alt text.


On Thu, 19 Jan 2006, Alexey Feldgendler wrote:
>
> The same could be said about title="", for example:
> 
> "An empty value means that there is no title, and no attribute at all 
> means that the title is missing." But HTML doesn't declare the title 
> attribute as required.

title="" is different because title="" means there's no title but the lack 
of an attribute means the parent's title applies.


On Thu, 19 Jan 2006, Alexey Feldgendler wrote:
> 
> The alt attribute should be made optional, and when it's omitted, the UA 
> should try to obtain some useful information from the file name or by 
> other means.

That's sort of what has now been done.


On Fri, 20 Jan 2006, Matthew Raymond wrote:
>
> I'm not sure I agree. If you look at what you might use <img> for, it's 
> almost always presentational, and could therefore be done with CSS. The 
> more semantic the image, the more necessary alternate content becomes, 
> thus making the |alt| attribute necessary for a truly semantic <img> 
> element. If you find yourself using <img alt=""> a lot, it's probably 
> because you're not making proper use of CSS, or because you're using 
> <img> elements to achieve a presentational effect that is currently not 
> possible with just CSS 2.1 (yet may likely be possible in CSS 3).

Agreed, the spec now suggests using CSS where appropriate.


On Sat, 21 Jan 2006, Anne van Kesteren wrote:
> 
> HTML5 is not about making the world valid.

Indeed! If it were, our job would be much easier.


On Sat, 21 Jan 2006, Matthew Raymond wrote:
> 
> If an <img> element is being used in a "certainly presentational" way, 
> should it not be done away with in favor of CSS?

There are decorative ways to use <img> that are still page-specific 
"content", and thus wouldn't really apply in CSS. The spec gives some 
examples now.


> Hmm... Is <img> ever non-presentational? Radical thought: Deprecate 
> <img>.

Too radical. :-) It's one of the most-used elements!



On Sat, 21 Jan 2006, Matthew Raymond wrote:
> 
> However, is there any situation where <object> couldn't be used instead 
> of <img>?

<object> has a number of legacy implementation issues, and <img> is simpler 
to use. I think we want to keep <img> unless we have really good reasons not 
to (which I don't think we do).


On Sat, 21 Jan 2006, Alexey Feldgendler wrote:
> 
> Maybe instead deprecate <img> for presentational images, leaving it only 
> for semantic images (with non-empty alt required).

That's basically what the spec says now.


On Sat, 21 Jan 2006, Matthew Paul Thomas wrote:
>
> On 20 Jan, 2006, at 1:18 AM, Alexey Feldgendler wrote:
> > ...
> > The alt attribute should be made optional, and when it's omitted, the UA
> > should try to obtain some useful information from the file name or by other
> > means.
> > ...
> 
> Gecko used to do that 
> <https://bugzilla.mozilla.org/show_bug.cgi?id=5764>, but no longer does 
> because it didn't work for the many cases of <img src="spacer.gif"> and 
> the like. (I can't find where the latter decision was made, but IIRC Ian 
> Hickson was the one who made it.)

It wasn't just me, but yeah, basically we were forced to not do it because 
of what it did to actual sites.


On Fri, 20 Jan 2006, Henri Sivonen wrote:
> 
> Suppose there is an authoring tool that has a design goal of always 
> outputting conforming (to the extent conformance is machine-assessable) 
> documents. This tool allows the user to insert images.
> 
> Allowing images to be inserted without prompting for more information 
> and also enforcing the presence of a human-supplied alt attribute would 
> mean that the tool would have to refuse to save the document until the 
> alt texts have been supplied. Refusing to save is not good. Therefore, 
> the tool would have to present a document-modal dialog prompting for the 
> alt text upon inserting the image.
> 
> Sure, some people might even enter some text, but people who just want 
> to get on with it would hit return with an empty text box. 
> Alternatively, the tool makers could give up the requirement of 
> human-supplied alt text and just generate an empty alt text by default 
> without asking. (Considering that the tool itself--not just the author 
> using it--will be judged by seeing if the output passes an automated 
> conformance check, it is likely that the requirement of correct output 
> will not be dropped because of the alt issue.)

As the draft stands now, you might be able to get away with outputting 
a document that a validator can't find problems with, but that won't make 
it conforming.


> The bottom line is that requiring the presence of the alt attribute 
> leads to a situation where UAs cannot tell whether the alt text is empty 
> because the image is purely decorative or because the author did not 
> bother to think about it.

Indeed, this is the reasoning I followed.


> IMO, this leaves the people who don't see the images worse off compared 
> to a scenario where an empty alt text signified a purely decorative 
> image and a missing alt attribute signified that the author did not 
> bother to provide a textual alternative.

Right.


On Sat, 21 Jan 2006, James Graham wrote:
> 
> People seem to have passed this point by. The current specification of 
> alt as required makes it almost impossible to design a conforming HTML 
> editor that doesn't mess up the semantics of the attribute. Since many 
> (the majority?) of HTML pages are produced using some form of graphical 
> editor (often implemented using contentEditable or some other HTML+js 
> solution as part of a CMS), the spec should at least consider the needs 
> of editors as well as UAs.

Agreed.


On Sat, 21 Jan 2006, Jonny Axelsson wrote:
> 
> As I have stated before [1], 'spacer' is arguably the element with the 
> most semantic information (namely that this element is used for layout 
> hacks only and can be ignored for every other purposes), losing 
> information when replaced with <img src="./spacer.png" alt=""> because 
> the UA now doesn't *know* that the image is useless, but can assume so 
> based on factors like URL, image dimensions, content, and above all the 
> specified empty 'alt' attribute. Going from 'img' to 'object' loses more 
> information, to be exact the very 'alt' attribute to separate the useful 
> from the useless.

Spacer GIFs are non-conforming now.


On Sun, 22 Jan 2006, Matthew Paul Thomas wrote:
> 
> I don't think that can achieve anything -- ceteris paribus, a graphical 
> editor's ease of use will be inversely proportional to how well it 
> encourages accessible and semantic use of images, no matter how they're 
> represented in markup. At one end of the scale, you have software where 
> an image is inserted by dragging and dropping, and there is no interface 
> for alt= text at all (such as most graphical mail clients). At the other 
> end, you have a two-paned editor where the top pane shows the normal 
> WYSIroughlyWIG presentation, and the bottom shows the page with no CSS 
> and with editable inline alt text instead of images (nice for a 
> dedicated Web author, but utterly unreasonable for rich-text e-mail or 
> Web applications). In the middle, you have an "Alt text" field buried in 
> a dialog somewhere, with long-running disputes over how insistent it 
> should be (as in Mozilla Composer).
> 
> For those using text editors, however, there is a way of encouraging 
> suitable fallback content: encourage use of <object> and discourage use 
> of <img>. It is much more obvious that <object></object> should perhaps 
> have something inside it than that <img src="foo"> is missing an alt= 
> attribute. And for those few who read the spec, you can define <img> 
> tongue-in-cheek as "a piece of text with an alternate graphical 
> representation", as Ian has already done; and provide guidance on the 
> use of alt=, as I did at the start of this thread.

Note that the spec no longer defines it as a piece of text, sadly.


> Bizarre but serious conclusion: alt= should be optional for <img> in 
> documents where a <meta name="generator"...> element is present.

This unfortunately is highly unpopular! (I tried it with the WYSIWYG stuff 
for <font style=""> and boy has that been shot down.)


On Mon, 23 Jan 2006, dolphinling wrote:
>
> How about "Authoring tools MUST only provide alternate text that the 
> author explicitly requests, and especially MUST NOT provide alt="" 
> unless the author specifically says that the alternate content is empty. 
> Authoring tools SHOULD make it obvious to the author what the meaning of 
> alt= is, for example with the string "What text should be used if the 
> image cannot be displayed?""

I don't really want to disallow hypothetical artificial intelligence based 
editors! But the current text hopefully encourages this attitude in tool 
developers.


> Problems with this approach include the following: First, it could be 
> interpreted as disallowing pseudo-AI. This could be fixed with a note 
> saying "This should not be interpreted as disallowing pseudo-AI in 
> authoring tools, but even a pseudo-intelligent authoring tool MUST NOT 
> assume an empty alt text."

It's not clear to me what that would mean. It sounds like it would allow 
weaseling out. I think the current text is less ambiguous about what is 
allowed and what isn't.


On Wed, 25 Jan 2006, Matthew Paul Thomas wrote:
> >
> > How about "Authoring tools MUST only provide alternate text that the 
> > author explicitly requests,
> 
> That would seem to prevent, for example, Microsoft FrontPage from 
> generating the obvious alt text for an Image Composer image that 
> consisted only of text sprites. (And since Microsoft continue to 
> misimplement the existing spec for alt=, it wouldn't be a good idea to 
> trust them to interpret "explicitly requests" the way you want.)

That too.


> It would be a problem as long as "generates valid HTML" is considered a 
> feature separate from conformance, since software can guarantee the 
> former but not the latter. And I don't think anything in an HTML 5 spec 
> could prevent validity from being seen as a feature. That's why I 
> propose the <meta name="generator"...> exception for compulsory alt=.

Yeah... see above. :-( I liked it too!


On Sat, 21 Jan 2006, Simon Pieters wrote:
> 
> Lynx shows the file name if alt="" is omitted. IIRC, HTML 4.0 
> previously recommended that UA's should use the file name if alt is 
> omitted, so to avoid HTML 4.0 compliant UA's using the file name when 
> we want it to be empty, I think it is reasonable to require alt when src 
> is present (and vice versa).

I don't think this would affect Lynx much, since so many pages already are 
lacking alt text anyway.


> Using the file name when the alt attribute is omitted might make sense 
> in some cases, such as:
> 
>   <a href="/"><img src="home.gif"></a>
> 
> ...but not in others, such as:
> 
>   <img src="spacer.gif">

From my experience I recommend against using the filename.
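
A small Python sketch shows why. The helper name below is hypothetical 
and this is only one plausible way a UA might derive text from the src 
attribute, not what Lynx or Gecko actually did.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def filename_fallback(src):
    """Return the file name of an image URL without its extension."""
    # Only the path component matters; host and query string are ignored.
    return PurePosixPath(urlsplit(src).path).stem
```

For src="home.gif" this gives "home", which happens to be useful, but 
for src="spacer.gif" it gives "spacer", which is noise the user then has 
to sit through for every layout image on the page.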


On Sat, 4 Nov 2006, Alexey Feldgendler wrote:
> 
> <img> is somewhat broken in any case. If I was making it up from 
> scratch, I would treat missing alt same as alt="" and define it to mean 
> "semantically valuable image for which the author did not provide an 
> alternative text". For purely decorative images, if such thing is to 
> exist at all, I would define a separate attribute like "decorative", so 
> that semantic images surely don't end up as decorative by mistake.

That's an interesting idea, though I'd expect people to not bother with 
the attribute since it wouldn't really do anything.


On Wed, 11 Apr 2007, Kristof Zelechovski wrote:
>
> I think the correct fallback for a photograph for its own sake is alt="(Use
> a browser that supports graphic images to view)".

That doesn't help, e.g., blind users, or users who are driving a car and 
can't take their eyes off the road.


> The problem is that such images usually have an independent caption that 
> is visible alongside the image.  Specifying a description of the 
> content of the image as the alternate text would probably duplicate that 
> caption and render a page that is hard to understand in a text-only 
> browser:
> 
> Frisson: White truffle & wild rice
> Frisson: White truffle & wild rice
> 
> And the viewer has no clue what that means.

Yeah, the caption shouldn't be in the alt attribute.


Maciej wrote:
> 
> Mail.app and other mail clients don't put alt attributes on images 
> generated in email. They could add alt="", but there are two reasons it 
> might be better to allow no alt attribute at all, at least for email 
> clients.
> 
> 1) A mail message is often sent to a restricted audience, so the 
> accessibility, media-independence and machine-understandability benefits 
> of alt are not nearly as great. And adding alt="" as a cargo cult 
> talisman does not give these benefits in any case.
> 
> 2) WYSIWYG editors in general can't be expected to enforce proper alt 
> attributes. Users can add images in all sorts of ways (paste, drag and 
> drop) that don't have a natural affordance for entering alternate text. 
> And I doubt WYSIWYG editors that popped up a box for typing in text 
> whenever the user inserts an image would be competitive

The spec now allows this.


> In general, I think the HTML5 definition of <img> is problematic - it 
> says:
> 
> "The img element represents a piece of text with an alternate graphical 
> representation."

Removed.


> And also:
> 
> "When the alt attribute's value is the empty string, the image 
> supplements the surrounding content. In such cases, the image could be 
> omitted without affecting the meaning of the document."

This is still there (in spirit if not in text) but now there is the 
absence of the alt attribute to use as an indicator.


> Let's consider one archetypical use of the <img> tag in the wild, a 
> Flickr photostream. The example below is from my photostream.
> 
> <IMG src="http://farm1.static.flickr.com/178/392969604_a0887f39ce_m.jpg"
> width="240" height="180" alt="">
> 
> I don't think it is right to say that this represents a piece of text 
> with an alternate graphical representation - it represents an image, 
> namely the linked photo. It's also not right to say that the image could 
> be omitted without affecting the meaning of the document.  Although I 
> entered a title of "Frisson: White truffle & wild rice", it would be a 
> strained interpretation to say my photostream page would have the same 
> meaning without any of the photos. Also, Flickr lets me have no title at 
> all, or an ugly title based on the camera-chosen file name like 
> DSC0981.JPG.

Agreed. The spec is now better on this front I think.


> Ian suggested that many uses of images on the web that aren't alternate 
> graphical representations of text should be in the CSS layer. So maybe 
> this should be <div id="photo1"> with a <style>#photo1 { background: 
> url(http://farm1.static.flickr.com/178/392969604_a0887f39ce_m.jpg); 
> width: 240; height: 180; }</style> somewhere. But that doesn't make sense 
> to me - the photos in my photostream aren't presentational and a 
> stylesheet that replaced them with other images would not preserve the 
> image in the page. Further, browsers often do not offer as good 
> interaction with background images as the contents of an <img> element. 
> For instance, you can't drag a background image from the page to your 
> desktop in any browser I know of. So that choice of markup would suck 
> for the user, in addition to having the wrong split of presentation and 
> semantics.

Agreed.


> Although Flickr isn't what most people think of as a WYSIWYG editor, my 
> choice is carefully considered. Embedding photos is a fairly common use 
> for meaningful images in blog posts, and blog editors should support 
> them effectively.
> 
> So, in conclusion, I think the stated meaning of <img> and the 
> requirement of an alt="" attribute need to be reconsidered in light of 
> end-user-generated content.

Please let me know if you like the new text.


[snip several e-mails that reinforce points that I've already agreed to 
above and for which the spec now caters.]


On Wed, 18 Apr 2007, Charles McCathieNevile wrote:
> > 
> > I think it remains the case that for end-user generated content, there 
> > will often be semantically meaningful images that are meaningful in 
> > themselves and cannot be considered alternate representations of some 
> > piece of text.
> 
> Years of work on accessibility, and a particular focus on authoring 
> tools, suggests that while this is certainly true, there are lots of 
> good ways to enable authors to include an alternate. One of the big 
> frustrations I find with the web today is using assorted tools like 
> wikis and blogs to edit content, and not being able to put useful content 
> for alt where appropriate, and mark it explicitly blank for other cases.

Indeed. I'm not sure what we can do about that here though.


On Sat, 21 Apr 2007, Jon Barnett wrote:
> 
> (Still) Images are used on the web (and in email) in 4 ways:
> 1.  Images that represent text.
> <p>Alt text for this image is <img alt="obvious" src="obvious.jpg">.</p>
> <p>It's been a good day! <img alt=":-)" src="smile.jpg">.</p>
> 
> 2. Images that are content but don't represent text (though they may be
> accompanied by a caption - even if the caption could be the alt text, it
> would be redundant with the caption repeated in the markup)
> <p>These are my vacation photos:
> <ul><li><img src="grandcanyon.jpg">My wife and I at the Grand Canyon...</ul>
> 
> 3. Images that are purely decorative, not content, and don't represent text
> <p><img src="plant.jpg" style="float: right">Potted plants are nice and love
> water...
> 
> 4. Images that are purely background images - their size isn't necessarily
> integral to layout, and the image may be repeated in the background -
> there's no ambiguity to this.
> <div style="background: url(tiles)"><p>Lorem ipsum...</div>
> 
> It's obvious that the <img> tag should be used for (1) and obvious that 
> CSS background images should be used for (4).  (2) and (3) are the 
> controversial examples.

The spec now handles all these explicitly. Let me know if it's not clear 
enough on anything.


On Sun, 22 Apr 2007, Jon Barnett wrote:
> 
> The "rel" attribute isn't specified for the img element, but this might 
> be a good use for it - what relationship does this image have with the 
> document.
> 
> Thoughts on that, or something new?

I don't think authors would really use this reliably enough for it to be 
usable and useful.


> If the alt attribute is required, what should it be for (2)?  Blank? A 
> paragraph describing the vista of the Grand Canyon?

The new spec doesn't require it.


On Sun, 22 Apr 2007, Jon Barnett wrote:
> 
> "noalt" is a good idea and leaves no ambiguity.

Again, I'm not sure I see any evidence that authors would use this enough 
to make it useful. It seems better to just define the lack of alt as 
meaning this.

The nohref="" attribute on <area>, which is in HTML4 but not HTML5, is a 
similar attribute. I think our experience with nohref="" is that it's not 
a good way of going about this kind of thing.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Tuesday, 14 August 2007 21:00:51 UTC