Re: [html4all] the alt attribute debate from Henri Sivonen on 2007-09-26 (www-archive@w3.org from September 2007)

From: Henri Sivonen <hsivonen@iki.fi>
Date: Wed, 26 Sep 2007 09:32:20 +0300
To: Steven Faulkner <faulkner.steve@gmail.com>
Cc: "advocate group" <list@html4all.org>, "John Foliot - WATS. ca" <foliot@wats.ca>, "Anne van Kesteren" <annevk@opera.com>, www-archive@w3.org
Message-Id: <E2F84572-888B-42C6-987B-7133A1E06C9B@iki.fi>

Hi,

On Sep 25, 2007, at 18:35, Steven Faulkner wrote:

> >Those who add a bogus alt for validation are a subset of people who
> >include a bogus alt.
>
> and what size is this subset (who knows)

Presumably the population whose behavior is swayed by what is deemed  
valid (i.e. syntactically correct) is significant enough for the  
html4all group to be concerned about what validity says about the alt  
attribute.

> why not develop a validator that looks for and fails the page if it  
> has bogus alt?

Because it only leads to an arms-race-like escalation that causes
more junk to be served to users.

If I make the validator check that each image has an alt text that is
longer than the empty string, those generators that do not have an
alternative text available will be programmed to emit a bogus string
that is at least one character long.

If I make the validator check that each image has an alt text that is
longer than one character, those generators that do not have an
alternative text available will be programmed to emit a bogus string
that is at least two characters long.

And so on. (The point stays the same if you substitute another  
heuristic for the length test.)
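
To make the escalation concrete, here is a minimal sketch in Python.
All names in it (make_validator, generator_alt, simulate) are
hypothetical and not taken from any real validator or authoring tool;
it only assumes a length-based check and a generator that knows the
current threshold.

```python
def make_validator(min_length):
    """Return a check that passes when alt has at least min_length characters."""
    def is_valid(alt):
        return alt is not None and len(alt) >= min_length
    return is_valid


def generator_alt(real_alt, min_length):
    """Hypothetical generator: use the real alt text if there is one,
    otherwise emit the shortest filler that satisfies the known check."""
    if real_alt:
        return real_alt
    return "x" * min_length  # bogus text, exactly as long as required


def simulate(rounds):
    """Raise the validator threshold each round and record what a
    generator without real alt text emits in response."""
    results = []
    for min_length in range(1, rounds + 1):
        is_valid = make_validator(min_length)
        alt = generator_alt(None, min_length)
        results.append((min_length, alt, is_valid(alt)))
    return results


if __name__ == "__main__":
    for min_length, alt, passed in simulate(3):
        print(min_length, repr(alt), passed)
```

Every round passes validation, and the only thing that changes is how
much filler ends up in front of the user.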

This is not only an issue with alt. There's non-alt precedent for this
kind of behavior. When HTML 4 said that paragraphs must not be empty,
people who saw value in emitting qualitatively empty paragraphs
started putting a no-break space in paragraphs that were
qualitatively empty. To address this, Hixie went ahead and defined a
concept of Significant Inline content to make a single no-break space
invalid in HTML5. Will that make people no longer see value in empty
paragraphs? My bet is on "No". They'll just generate something that
fools the new test. I've suggested to Hixie that instead of trying to
outsmart people who want to do "bad" empty paragraphs we stop the
escalation instead.

When HTML 4.01 Strict banned target='', it didn't make people stop
wanting to open links in new windows. Instead, it begat this:
http://www.alistapart.com/articles/popuplinks

Moral of the story: You can't use the concept of validity to stop
people from achieving the results they want. They just figure out a
way to do it in a less detectable way, which means browsers have a
harder time offering counter-measures to the users.

> using heauristics, that shouldn't be so hard should it?

Heuristics work when the uncooperative data sources are indifferent  
to the heuristics (that is, they don't actively try to fool the  
heuristic). This, presumably, would be the case if alt-related  
reasonable heuristics were deployed in AT. The heuristic of reading  
the URI is so bad a heuristic that you have effectively been arguing  
that authors defeat that particular heuristic.

Heuristics don't work when the uncooperative data sources are  
actively hostile to the heuristic (i.e. try to fool it) *and* they  
know what the heuristic is so that they *can* fool it. This is why  
search engines keep their anti-SEO spam heuristics secret and complex  
enough to be resilient to black-box reverse engineering.

Precedent suggests that in the case of validators, people will seek  
to fool them if the concept of validity stands in the way of the  
results they want and they have a requirement (for whatever reason)  
to be valid.

Now I'm going to shortly follow up on John Foliot's email and then
excuse myself and write some validator software instead of talking
about it.

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/

Received on Wednesday, 26 September 2007 06:32:41 UTC