Re: Need differentiator between "no alt text provided" and "no alt text necessary"

On Feb 2, 2009, at 8:52 PM, Ian Hickson wrote:

> On Mon, 2 Feb 2009, James Craig wrote:
>>
>> Are you referring to the sentence that starts: "If the image is not
>> available or if the user agent is not configured to display the image,
>> then…"?
>>
>> If so, that sentence does not address the non-visual equivalent of a
>> displayed image. It only addresses the cases of a broken or missing
>> image, or the inability to display an image. Those cases are necessary
>> but, as far as I can tell, irrelevant to this thread.
>
> Isn't the case you are talking about covered by "if the user agent is
> not configured to display the image"?

Not really. Most assistive technology (like screen readers) works in  
tandem with a capable, modern user agent. For example:

<img src="foo.gif">

Assuming the image "foo.gif" is not broken or missing, it will be
displayed in the user agent, so the above case does not apply.
Assuming there is no other markup (like a heading or legend), and
given that no alt attribute was provided, the user agent will not be
able to determine the alternative text from the markup. At this point,
the user agent should be able to make an assumption (hence the request
for clarification) about whether the image is meaningful or
presentational.

The current wording ("the image might be a key part of the content")
is ambiguous enough that a user agent cannot make that assumption. If
the wording explicitly indicated whether or not <img src="…"> is a
meaningful image, the user agent could continue:

1. In the case of a presentational image, no additional recovery is
necessary, and the default semantics of that element (mainly that it
is an image) need not be conveyed to the user.

2. In the case of a meaningful image, there are several ways that
assistive technology (in conjunction with a user agent) may try to
recover. For example, if the file name is "icon_home.gif", the user
agent may attempt to convey "icon home" as a substitute for the
alternative text. This is less likely to be useful if the file name
isn't even close to a real word in the user's native language, like
"X23LP7F.png". If the user agent is capable of optical character
recognition, or has access to an online API that offers OCR as a
service (grid computers will probably always be able to read images
better than a web browser), then it may attempt to read the text
content of the image. There may be other recovery mechanisms to
determine the gist of the content, such as detecting whether or not
the image contains a face, as Google image search now does:
http://images.google.com/images?q=bob&imgtype=face

Anyway, none of this is necessary if the user agent can assume (from
an explicit rule of the language) that an image is presentational.


>> Would you be willing to add this clarification (as previously proposed
>> by Maciej) of how user agents handle images without an alt attribute?
>> Since it uses an RFC 2119 keyword, I assume it would need to go in a
>> normative section.
>>
>> "User agents MAY (or SHOULD) assume images without an alt attribute are
>> a key part of the content lacking a textual equivalent."
>>
>> That would then clearly indicate the following:
>>
>> 1. <img src="…" alt="">		Presentational image.
>> 2. <img src="…" alt="foo">		Meaningful image, alt text provided.
>> 3. <img src="…">			Meaningful image, no alt text provided.
>>
>> Currently case #3 is unclear. It might be a meaningful image with no
>> alt text provided, or it might be something else entirely. I realize
>> that we'll always have authors that do not use alt correctly, but we
>> should have a clear definition of what it means when they do.

I might even go as far as to add an author requirement such as,
"Authors SHOULD NOT omit the alt attribute if the image is known to
have appropriate alternative text. A missing alt attribute indicates
that the image is meaningful, but no alternative text is known by the
author or authoring tool."
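
As an illustration only, the three cases could be told apart like this if the proposed rule were adopted. This is a sketch using Python's standard `html.parser`; the class name `ImgClassifier`, the helper `classify`, and the label strings are hypothetical and describe the proposal, not the current draft. A valueless `alt` attribute is treated here like `alt=""`, which is also an assumption.

```python
from html.parser import HTMLParser


class ImgClassifier(HTMLParser):
    """Collect one label per <img> start tag, per the proposed three cases."""

    def __init__(self):
        super().__init__()
        self.results = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        if "alt" not in attr_map:
            # Case 3: alt attribute omitted entirely.
            self.results.append("meaningful, no alt text provided")
        elif not attr_map["alt"]:
            # Case 1: alt="" (or alt with no value, by assumption).
            self.results.append("presentational")
        else:
            # Case 2: non-empty alt text.
            self.results.append("meaningful, alt text provided")


def classify(markup):
    """Return the list of labels for every img element in the markup."""
    parser = ImgClassifier()
    parser.feed(markup)
    parser.close()
    return parser.results
```

The point of the sketch is that an omitted attribute and an empty attribute are distinguishable in the markup, so the language only needs to say what the omitted case means.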

> I'm glad you agree that cases 1 and 2 are covered. Specifically, they
> are handled by the first two entries in the list after the sentence
> that says "What an img element represents depends on the src attribute
> and the alt attribute".
>
> Case 3 is covered by the next entry in that list, entitled "If the src
> attribute is set and the alt attribute is not".
>
> First, that section defines what this means:
>
> # The image might be a key part of the content, and there is no textual
> # equivalent of the image available.

As I stated above, the wording "might be" is too ambiguous.

> Then, it says what user agents that _do_ support images must do:
>
> # If the image is available, the element represents the image specified
> # by the src attribute.

Unless there is alternative text available in the image file, such as  
in SVG, this is meaningless to assistive technology. I'm not even sure  
what you're trying to say by the phrase, "the [img] element represents  
the image." Of course the img represents the image! Isn't that  
redundant?

> (Extra text in the rendering section will elaborate on this.)
>
> Next, it says what user agents that do _not_ support images must do:

Perhaps this is where the confusion lies. If "user agents that do not  
support images" is intended to include "user agents that do support  
images but are being accessed by users who cannot see the visual  
representation of those images", then this sentence should be rephrased.

> # If the image is not available or if the user agent is not configured
> # to display the image, then the user agent should display some sort of
> # indicator that there is an image that is not being rendered, and may,
> # if requested by the user, or if so configured, or when required to
> # provide contextual information in response to navigation, provide
> # caption information for the image, derived as follows:
> #
> # 1. If the image has a title attribute whose value is not the empty
> #    string, then the value of that attribute is the caption
> #    information; abort these steps.
> #
> # 2. If the image is the child of a figure element that has a child
> #    legend element, then the contents of the first such legend element
> #    are the caption information; abort these steps.
> #
> # 3. Run the algorithm to create the outline for the document.
> #
> # 4. If the img element did not end up associated with a heading in
> #    the outline, or if there are any other images that are lacking an
> #    alt attribute and that are associated with the same heading in the
> #    outline as the img element in question, then there is no caption
> #    information; abort these steps.
> #
> # 5. The caption information is the heading with which the image is
> #    associated according to the outline.
>
> This is a pretty detailed set of requirements already.

You are correct, it is detailed, but it still doesn't define whether  
or not an image should be treated as presentational if no "caption  
information" is found.

> Could you elaborate
> on what your proposal:
>
>> "User agents MAY (or SHOULD) assume images without an alt attribute are
>> a key part of the content lacking a textual equivalent."
>
> ...is intended to require?

Sorry, this should have said "without an alt attribute and without
otherwise associated alternative text." FWIW, I'm using "alternative
text" in the same way you're using "caption information." Perhaps this
should just be the last step in the caption info algorithm:

# 4. If the img element did not end up associated with a heading in
#    the outline, or if there are any other images that are lacking an
#    alt attribute and that are associated with the same heading in the
#    outline as the img element in question, then there is no caption
#    information; abort *steps 4 and 5*.
#
# 5. The caption information is the heading with which the image is
#    associated according to the outline.
#

6. If no caption information can be determined from the previous  
steps, user agents MAY assume the image is a key part of the content  
lacking a textual equivalent in the document, and MAY use other  
recovery mechanisms to attempt to determine the text equivalent.
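
A rough sketch of how the amended algorithm might flow, in Python. Everything here is a simplification for illustration: `ImageContext` is a hypothetical stand-in for what the user agent already knows about the image (including the result of the outline algorithm in step 3), and the return labels are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImageContext:
    """Hypothetical summary of an img element that has no alt attribute."""
    title: Optional[str] = None           # value of the title attribute
    figure_legend: Optional[str] = None   # first legend of a parent figure
    heading: Optional[str] = None         # heading from the outline (step 3)
    other_altless_images_under_heading: int = 0


def caption_information(img):
    """Steps 1 to 5: return the caption information, or None."""
    if img.title:                          # step 1: non-empty title
        return img.title
    if img.figure_legend is not None:      # step 2: figure legend
        return img.figure_legend
    # Step 4: no heading, or the heading is shared with other alt-less images.
    if img.heading is None or img.other_altless_images_under_heading > 0:
        return None
    return img.heading                     # step 5


def resolve(img):
    """Apply steps 1 to 5, then the proposed step 6 as the fallback."""
    caption = caption_information(img)
    if caption is not None:
        return ("caption", caption)
    # Step 6 (proposed): the user agent MAY assume the image is a key part
    # of the content and MAY try other recovery mechanisms.
    return ("assume-meaningful", None)
```

For example, `resolve(ImageContext(title="Logo"))` stops at step 1, while `resolve(ImageContext())` falls through to step 6 instead of ending with no defined result.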

Received on Tuesday, 3 February 2009 08:13:59 UTC