Re: [whatwg] @generator-unable-to-provide-required-alt, figure with figcaption

From: Ian Hickson <ian@hixie.ch>
Date: Fri, 18 Oct 2013 21:03:02 +0000 (UTC)
To: "Jukka K. Korpela" <jkorpela@cs.tut.fi>
Message-ID: <alpine.DEB.2.00.1310182012100.1896@ps20323.dreamhostps.com>
Cc: whatwg@lists.whatwg.org
On Wed, 4 Sep 2013, Jukka K. Korpela wrote:
> 2013-09-04 0:09, Ian Hickson wrote:
> > > To a user, even “(an image)” is better than lack of alt attribute
> > 
> > I disagree. The lack of an alt attribute can be used by user agents to 
> > substitute the string "(an image)", in which case it is the same, or 
> > it can be used to do far more, e.g. image recognition, OCR, etc. This 
> > isn't academic, these technologies exist today.
> 
> There is nothing that makes that impossible, or more difficult, if the 
> element has an alt attribute.

If the element _has_ alternative text, the presumption is that that text 
is the equivalent of the image, and therefore replacing that text with the 
results of OCR is likely a step backwards. That is, in the ideal scenario, 
a user shouldn't have to ever know they were missing images; nor should 
they in any way suffer from missing those images, because their textual 
alternatives would convey the same information seamlessly.

It's only when the alternative text is inadequate that you want to be 
falling back to OCR, etc. And the best way to indicate that it's 
inadequate, is to not include it at all.
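
As a rough sketch of that distinction (illustrative only; the function 
name, the `recognize` callback and the "(an image)" fallback string are 
assumptions, not any browser's actual behaviour), a UA could trust alt="" 
when it is present and only fall back to image analysis when it is absent:

```python
from typing import Callable, Mapping, Optional


def text_for_image(
    attrs: Mapping[str, str],
    recognize: Callable[[str], Optional[str]],
) -> str:
    """Pick replacement text for one <img>, given its attributes.

    `recognize` is a hypothetical stand-in for OCR / image recognition:
    it takes the image URL and returns a description, or None.
    """
    if "alt" in attrs:
        # The author supplied alternative text (possibly empty, for a
        # decorative image): presume it is the equivalent of the image.
        return attrs["alt"]
    # No alt attribute at all: nothing from the author to override, so
    # the UA is free to try its own analysis.
    guess = recognize(attrs.get("src", ""))
    return guess if guess is not None else "(an image)"
```

With this policy, a generic alt="(an image)" would suppress the analysis 
step entirely, whereas an omitted alt leaves it available.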


> If you mean that programs would actually do such things if and only if 
> the alt attribute is absent, then this is very speculative. Let’s worry 
> about that when browsers are actually capable and willing to do such 
> things at all.

They are capable now; like I said, this is not academic, this technology 
exists and is deployed by some of the browser vendors (just not in the 
browser products yet). We are worrying about it now because we put it off 
until now and now is the time to worry about it.


> There is an essential difference between lack of an alt attribute and a 
> more or less generic value used for it, as in alt="(an image)" or in 
> alt="(image: horse5)" (automatically generated e.g. from an image URL 
> that ends with horse5.png) or in alt="(photo of Hixie)". Lack of the alt 
> attribute says absolutely nothing about the image; it might represent a 
> word as an image, or be pure decoration, or be so complicated that 
> writing a textual alternative would be a major challenge in content 
> production.
> 
> Someone who hears, say, “image – horse five” at least gets some idea of 
> what the image is about, and even “an image” as opposed to whatever a 
> speech browser says about <img ... alt=""> is an improvement: the user 
> can know that the author tried to find a textual replacement for the 
> image but couldn't.

I disagree. It's not better to hear "[image indication from AT] an image" 
than it is to hear just "[image indication from AT]". If you want to give 
it a caption, you can put it in the title="", where a competent AT can 
then say things like "[image indication from AT] Image of Hixie", and a 
really competent AT can say something like "[image indication from AT] 
Image of Hixie. A smiling face with brown hair and a beard". But if you 
have an alternative text of "(photo of Hixie)", then a UA that outputs 
that as "[image indication from AT] photo of Hixie. A smiling face with 
brown hair and a beard" would have to make the experience in _well_ 
written cases dramatically worse than ideal. For example, it would make a 
case like "Multiply <img src="pi.gif" alt="π"> by two" turn into 
something like "Multiply [image indication from AT] Pi. A two dimensional 
line art picture of a table. By two" or "Multiply [image indication from 
AT] Pi. Pi. By two", which are both significantly worse than the ideal, 
which would at worst be "Multiply [image indication from AT] pi by two" 
and at best just be "Multiply pi by two".
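
To make the comparison concrete, here is a minimal sketch that builds 
those renderings from the same sentence. The `render` function, its 
parameters and the capitalisation rules are hypothetical, chosen only to 
reproduce the strings above; no real AT is being modelled.

```python
from typing import Optional

# Placeholder for whatever cue the AT uses to signal an image.
AT_MARK = "[image indication from AT]"


def render(
    before: str,
    alt: str,
    after: str,
    announce: bool = False,
    description: Optional[str] = None,
) -> str:
    """Speak 'before <img alt=...> after' under a hypothetical UA policy."""
    if description is not None:
        # UA appends its own analysis after the author's alt text,
        # which breaks the sentence into fragments.
        return (
            f"{before} {AT_MARK} {alt.capitalize()}. "
            f"{description}. {after.capitalize()}"
        )
    if announce:
        # UA flags the image but otherwise reads the alt text inline.
        return f"{before} {AT_MARK} {alt} {after}"
    # Ideal case: the alt text replaces the image seamlessly.
    return f"{before} {alt} {after}"


print(render("Multiply", "pi", "by two"))
print(render("Multiply", "pi", "by two", announce=True))
print(render("Multiply", "pi", "by two",
             description="A two dimensional line art picture of a table"))
```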


> > To the non-validator user agent, that attribute means nothing. It's a 
> > non-conforming attribute with no semantics to any software outside 
> > content generators and conformance checkers.
> 
> It is presented as a non-conforming attribute that can be used to get a 
> clean validation report, i.e. to make a validator report a document as 
> valid, as conforming.

Correct.


> This is grossly illogical and misleading.

Granted.


> Anyone who uses a validator has the right to know whether the document 
> is valid or not, to the extent that this can be programmatically 
> determined. And it is, if the attribute is not valid.

Indeed. That's why validators are allowed to point it out. There's an 
explicit statement to that effect, and there's no explicit requirement to 
not report the missing alt="" in the first place (the requirement is 
phrased the other way around quite carefully).
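
A hedged sketch of what that permission could look like in a checker 
(purely illustrative: `check_img`, the `report_marker` flag and the 
message strings are made up here and are not taken from any real validator 
or from the spec text):

```python
from typing import List, Mapping

UNABLE = "generator-unable-to-provide-required-alt"


def check_img(attrs: Mapping[str, str], report_marker: bool = True) -> List[str]:
    """Return conformance messages for one <img> (hypothetical policy)."""
    if "alt" in attrs:
        # alt is present; this sketch does not judge its quality.
        return []
    if UNABLE in attrs:
        # The marker lets a checker stay quiet about the missing alt,
        # but nothing stops the checker from pointing it out anyway.
        if report_marker:
            return [f"img has no alt; {UNABLE} is present"]
        return []
    return ["img has no alt"]
```

Either setting of the flag is consistent with the reading above; the 
difference is only in how much the checker chooses to tell the author.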


> Here's a proposal:
> 
> The character U+FFFC OBJECT REPLACEMENT CHARACTER, which is “used as 
> placeholder in text for an otherwise unspecified object” [the quote is 
> from the code chart entry in the Unicode Standard] be used as the value 
> of an alt attribute to indicate that it was not possible to write an 
> adequate alternate text for the image. This typically means that the 
> image comes from a source external to the system that generates the HTML 
> document and the system cannot analyze it or otherwise find a suitable 
> text replacement.

This would have the exact same "grossly illogical and misleading" problem 
-- it would have to be non-conforming, while still requiring that it be 
output by over-constrained editors and encouraging validators to keep 
quiet about it -- but it would have the added disadvantage of not at all 
conveying its status to readers of the document.


> – It’s of course possible that people would then use alt="&#xFFFC;" to 
> silence validators even when they could easily write real text there. 
> But they can anyway use alt="" for such purposes if they want to.

They could, but note that for non-decorative images, that'd be 
non-conforming as well. There's just no way for us to detect it. It's 
pretty much the same as the long attribute, except more misleading.

(Note that the point of the attribute is to prevent editors from doing 
exactly this, since it is harder to detect and less likely to be noticed 
by authors.)

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Friday, 18 October 2013 21:03:27 UTC
