Re: Moving longdesc forward: Recap, updates, consensus from Benjamin Hawkes-Lewis on 2011-05-07 (public-html-a11y@w3.org from May 2011)

From: Benjamin Hawkes-Lewis <bhawkeslewis@googlemail.com>
Date: Sat, 7 May 2011 12:37:01 +0100
To: John Foliot <jfoliot@stanford.edu>
Cc: "Gregory J. Rosmaita" <oedipus@hicom.net>, Laura Carlson <laura.lee.carlson@gmail.com>, Charles McCathieNevile <chaals@opera.com>, HTML Accessibility Task Force <public-html-a11y@w3.org>
Message-ID: <BANLkTinuxXoGD_t0j8si5Jyy4eZZzRx3bw@mail.gmail.com>

On Sat, May 7, 2011 at 2:36 AM, John Foliot <jfoliot@stanford.edu> wrote:
>> Nobody disputes that users need text alternatives, and HTML5 must
>> provide mechanisms for authors to provide them.
>>
>> The hidden metadata objection simply says that encouraging authors to hide
>> such text alternatives makes text alternatives more likely to be poor
>> quality.

> The history of @longdesc goes back to a time when designers (not engineers)
> expressed a need to provide the functionality that non-sighted users
> required, without it impacting on the visual design of their page/content.

Note @longdesc is not just for "non-sighted" users, which means that what's
actually happening is designers are neglecting all user groups when
they make it harder to discover text alternatives.

Anyway, the upshot of this argument is to concede that @longdesc is hidden
metadata, but imply more text alternatives will be provided if we have a
feature for easily hiding text alternatives than if we don't, so it's worth
the costs to accuracy and discoverability. Which is a plausible argument,
but not the one that the "Hidden metadata fallacy" section makes.

> Nobody has addressed this request/requirement, and nobody has suggested a
> better means of providing for this scenario.

Providing an element containing the long description that would be hidden in
the expected rendering might be a better solution, because hidden metadata kept
beside the image is not subject to link rot. (And an iframe could be
used where authors are adamant about reusing content over the network
rather than stitching it together server-side.)

http://lists.w3.org/Archives/Public/public-html-a11y/2010Dec/0218.html

> If engineers and designers want to provide longer textual descriptions in
> page with their complex image today, then I will both salute them and thank
> them. However, we have use cases and first hand testimonial from some
> designers who patently reject having to do this for design aesthetics, and
> we require a means to address this need as well.

I don't think the mere fact that some authors want to provide text
alternatives without "compromising" (in their eyes) their design
means that we must provide a feature. HTML5 is never going to
fulfill everybody's "wants", and plenty of accessibility requirements
require design compromise, e.g. WCAG 1.4.1 Use of Color.

Addressing this particular want makes sense if it results in a good
tradeoff: enough additional text alternatives versus a small enough increase
in general error rate.

> Hacking away with ARIA and hidden <div>s is a retrograde and
> expensive[*] 'solution' to what is actually a fairly elegant if
> specialized option.

The hidden metadata objection is that authors should show text
alternatives, not hide them.

It's plausible there's a set of authors who are both able to
provide long text alternatives and so adamant about not compromising
their design that they will implement hidden metadata in the form of
off-screen text, even though this severely compromises the accessibility
of the text (e.g. magnifier and colorblind users won't see anything).
This is certainly an argument in favor of an official mechanism such
as @longdesc, even if it's hard to show that the set of authors is large
enough that it satisfies the tradeoff.

> WYSIWYG tools have made creating and ensuring longdesc content remains
> current and properly associated to their images a simple - even
> simplistic - task,

I think that's a rosy-tinted view.

> and continuing to suggest that broken hand-coded links and bogus
> values for @longdesc will continue to wildly propagate on the web has
> no foundation in fact.

I think that's optimistic, but I certainly think we can propose things
(such as link checker behavior, more discoverable @longdesc
implementations) that will reduce the error rates associated with such
hidden metadata.
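
To make the link checker idea a little more concrete, here is a minimal
Python sketch that flags suspicious @longdesc values within one document.
The heuristics (empty value, value identical to @src, value that looks like
an image file) and the names LongdescChecker and check_longdesc are my own
assumptions for illustration, not the behavior of any existing checker. A
real checker would presumably also fetch the target to detect link rot,
which this sketch does not attempt.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Assumed heuristic: a long description should not itself be an image file.
IMAGE_EXTENSIONS = (".png", ".gif", ".jpg", ".jpeg", ".svg")


class LongdescChecker(HTMLParser):
    """Collects (longdesc value, problem) pairs for <img> elements."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        if "longdesc" not in attributes:
            return
        # Attributes written without a value are reported as None.
        value = (attributes["longdesc"] or "").strip()
        src = (attributes.get("src") or "").strip()
        if not value:
            self.problems.append((value, "empty"))
        elif value == src:
            self.problems.append((value, "same as src"))
        elif urlparse(value).path.lower().endswith(IMAGE_EXTENSIONS):
            self.problems.append((value, "points to an image"))


def check_longdesc(html):
    """Return the list of suspicious longdesc values found in html."""
    checker = LongdescChecker()
    checker.feed(html)
    checker.close()
    return checker.problems
```

Calling check_longdesc() on a page's markup returns a list of flagged
values. An empty list only means none of these particular heuristics fired;
it says nothing about whether the description itself is any good.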

> [* in an era where slicing even 10 bytes from a web-page equals hundreds of
> thousands of dollars a year for companies such as Google, transporting all
> that off-screen text "just in case" will meet with even more resistance than
> providing a user-option to follow a link to get that extra text]

I don't think I buy this. I think the problem developers at companies
like this face is getting the text alternatives in the first place, not
fighting to be allowed to include them in the download. In highly
performance-sensitive cases, iframe seamless, DOMContentLoaded, and
postMessage mean that you wouldn't need to delay visible functionality
while loading off-screen text.

>> Here's Tantek's objection from the poll:
>>
>> "[@longdesc] is one of the worst forms of invisible metadata or "dark
>> data" which are known to rot and become inaccurate over time (see: meta
>> keywords, RDF in comments, sidefiles, etc.)."

[snip]

> Link rot is a fact of the web, whether or not the links are referenced via
> @longdesc or via @src.

The argument is about differing error rates.

>> "Such arguments are misguided, because it is far better for
>> accessibility for such issues to be considered a prominent part of
>> the page's design and/or user experience from the outset, and for the
>> accessible content to be treated as a first class citizen of the
>> site.
>
> With no disrespect to Lachlan, many, many working accessibility
> specialists - whose job each day is to work with people with disabilities
> and create content for those users - disagree.  I respect Lachlan's right to
> hold an opinion, but I will also echo back a phrase that has commonly been
> given to the accessibility community: that HTML5 should not be based simply
> upon the word of an expert. Yet here Lachlan is sounding very "expertly": he
> offers no proof of his assertions short of his version of "logic", and he
> lacks any real credentials to support the claim that he is an accessibility
> expert. I know Lachlan, I know Lachlan cares, and he wants the web to be
> accessible, but wanting and espousing a personal philosophy are 2 different
> things.

Only arguments and evidence matter. Expertise is manifested through
better arguments and evidence, not appeals to authority.

>> Hiding it behind the longdesc attribute, or any other similar method,
>> effectively treats it as a second or even third class citizen which
>> has been clearly demonstrated to result in suboptimal alternative
>> content that never actually helps those it's intended to."
>
> There has been no clear demonstration of *anything* - we have a 4 year old
> blog posting by Mark Pilgrim (http://blog.whatwg.org/the-longdesc-lottery)
> based upon 'confidential' Google data that no-one can view or challenge.

People can spider the web to collect their own evidence to reproduce or
falsify that study.
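
The counting side of such a spider is not hard to sketch. Assuming the pages
have already been fetched (the crawling itself is left out), something like
the following Python would tally how many <img> elements carry @longdesc at
all. The names ImageTally and tally_pages are hypothetical, and judging
whether each description is actually useful would still need a human or
further heuristics.

```python
from collections import Counter
from html.parser import HTMLParser


class ImageTally(HTMLParser):
    """Counts <img> elements and how many of them have a longdesc attribute."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.counts["img"] += 1
            if any(name == "longdesc" for name, _ in attrs):
                self.counts["img with longdesc"] += 1


def tally_pages(pages):
    """Sum the per-page counts over an iterable of HTML strings."""
    total = Counter()
    for html in pages:
        parser = ImageTally()
        parser.feed(html)
        parser.close()
        total.update(parser.counts)
    return total
```

Dividing the "img with longdesc" count by the "img" count gives a usage
rate that anyone can compare against the figures in that blog post.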

> I will happily concede that in 1999 - 2004 many authors were unaware of how
> to use @longdesc properly, and likely many documents that Ian crawled from
> that time period returned poor results. So what?  I can also point to likely
> just as many documents from that time-frame that had nested tables 14 levels
> deep, because authors back then didn't know any better about that either.
> What does that prove? It proves that author awareness and best practices
> have improved considerably over the past 5 years. I am extremely confident
> that after all of the discussion around @longdesc and its place in HTML5
> that has been generated, that lack of awareness on how to do it right will
> be a thing of the past.

Sorry, but I don't think that all the discussions around @longdesc in
our spec-writing circles have primed the authoring population to use and
write good @longdesc values. Over the long haul, what we put in the
spec, what user agents and authoring tools implement, and what
beginner-level tutorials say will have a web-scale impact though.

> Moving forward, "bad" longdesc will be symptomatic of one of 2 things:
> apathy or lack of education. One we can fix, the other, not so much.

I think it will also continue to be symptomatic of the higher error
rate of hidden metadata.

>> As such, I would expect a "hidden metadata fallacy" section to put
>> forward arguments in favour of any of the following:
>>
>>   * Errors in visible data and hidden metadata are equally likely to be
>>     corrected. (This seems obviously false.)
>
> Obvious how? And what kind of "errors" are you talking about?

Any sort. Broken links. Poor quality data. Incorrect data. Spam.

>>   * In the special case of long text alternatives, authoring visible
>>     data is likely to compromise quality more than hiding it because of
>>     the risk that authors write captions that assume you can see and
>>     make sense of the image.
>
> Captions are not long descriptions.

Sorry for not being clearer. I'm saying you could construct a plausible
model like this:

   1. We encourage authors to write long text alternatives that are
      visible.
   2. They end up thinking about their mainstream sighted audience
      when writing the long text alternatives and forgetting about people
      who cannot see the picture.
   3. We end up with captions masquerading as long text alternatives.

i.e. This is a case where visibility could actually negatively
impact quality in some authoring scenarios.

> Does it take an extra author step to verify that the textual
> description matches the image it is referenced by? Yes. That doesn't
> make it invisible

Yes it does. Again: *all* metadata can be made visible; but data that is
already visible is less error-prone.

>>   * More authors will provide text alternatives if they can hide them,
>>     and this is worth the accuracy cost of helping them to do so.
>
> I believe that Laura's use cases, and the quote from Kyle Weems (CSSquirrel)
> addresses this point. Are you suggesting we need more "voices" making this
> assertion?

I'm suggesting that the "Hidden Metadata Fallacy" section does not make
this argument or any other useful argument.

I doubt anyone's going to be convinced by a single quotation when set
against the current @longdesc lottery. More testimony from people
who are adamantly against providing a visible alternative or visible
link but who actually do provide text alternatives would be good.

I think implementation examples like TellMeMore that highlight how we
can make @longdesc *more* discoverable and hence suffer *less* from
being hidden metadata, are more likely to persuade.

If we could find some comparable case from the world of laborious
human-authored hidden metadata where better discoverability in user
agents ultimately proved a success story, that would be good.
Might be hard:

http://www.well.com/~doctorow/metacrap.htm

>>   * User agents could excude bad data.
>
> Huh??

Sorry, that was meant to say "exclude bad data" referring back to
what the Chairs said: we could demonstrate that user agents
could exclude bad data. (This might be harder than we think.)

>>   * Users will not give up on @longdesc if they repeatedly encounter
>>     bad @longdesc values.
>
> This cannot be proven one way or the other: this requires a crystal ball
> that no one has.

We must write a spec on the basis of predictions. Predictions cannot be
proven in advance. Instead you must balance more or less plausible
predictions. This prediction is premised on a plausible model of human
behavior.

> What we can suggest however is that awareness of longdesc, and the value it
> can offer users, is reaching an increased awareness,

Dunno what makes you think that's happening at any sort of web scale.

> and with that awareness we will likely see both increased authoring
> and consumption.
> (http://webaim.org/projects/screenreadersurvey3/#longdesc)

Maybe. It will be bad if the increased authoring is erroneous.

>>   * User agents will not stop implementing @longdesc if users
>>     repeatedly encounter bad @longdesc values.
>
> User agents don't "implement" @longdesc, authors do.  User Agents
> support @longdesc; at least some do, some don't. I am not sure what
> you are saying here.

Sticking @longdesc in a context menu or whatever is an implementation.
I find your distinction between "implement" and "support"
incomprehensible.

>>   * Better implementations will make it easier to discover @longdesc
>>     visually, reducing the error rate due to it being hidden from the
>>     normal rendering of the page itself.
>
> Better GUI User Agent support for discoverability will likely have this
> effect. This can and is asserted, but cannot be proven - it is a future and
> forward-looking statement.

Arguments and evidence for models of how the world works make
predictions based on those models more or less plausible. We seek to act
on more plausible predictions. Where the model of how the world works is
contested, it makes sense to provide such arguments and evidence in our
CP.

>> But instead, the "Hidden Metadata Fallacy" section lays siege to a
>> nonsensical strawman argument asserting that hidden metadata is not
>> discoverable at all.
>
> What would you write then Ben?

I'd drop the section, acknowledge that @longdesc pointing to hidden or
off-page text is hidden metadata, and try to marshal evidence that
with better discoverability the tradeoff will work out better than
the alternatives.

> If you *do* agree that the Hidden Metadata
> argument is a strawman,

I should really just avoid the phrase "strawman argument" as too
ambiguous, as that's the inverse of what I'm saying.

I'm saying:

   1. The principle of visible data ("all other things being equal,
      visible data has a lower error rate than hidden metadata") is obvious.

   2. The "Hidden Metadata Fallacy" section critiques a claim
      that nobody believes ("hidden metadata is not discoverable").
      This is the strawman.

   3. Therefore we should drop the section.

   4. @longdesc suffers from the drawbacks of hidden metadata, but might
      have counterbalancing benefits. The remainder of the CP does
      try to point to some benefits. Perhaps we can strengthen
      it.

> But sometimes they do - Flickr being one such case:
> http://www.flickr.com/photos/benward/2109344239/
> (This photo was taken on December 12, 2007 in Holborn, London, England, GB,
> using a Canon Digital IXUS 65.)
>
> That last bit of data (date, location, camera), that was exposed in the
> clear on the page I linked to, and is clearly not "hidden".

That's visible data not hidden metadata in the context of that
webpage.

> It was discoverable.

[snip]

> Further, I can extract even more data using a specialized tool

[snip]

> The problem with @longdesc today is not the attribute, it's the browsers'
> failure to make it discoverable to end users.

Hidden and discoverable are not exclusive categories.

The principle of visible data and arguments against @longdesc based on
that principle take it as given that hidden metadata like @longdesc or
meta keywords is discoverable. So banging on about how @longdesc is
discoverable is utterly pointless; it's a tautology.

By contrast, it is important to bang on about how we can make @longdesc
/more/ discoverable, because more discoverable metadata is less
likely to be bogus.

--
Benjamin Hawkes-Lewis
Received on Saturday, 7 May 2011 11:37:30 UTC