
Re: ISSUE-66 Change Proposal: no change

From: Shelley Powers <shelley.just@gmail.com>
Date: Tue, 9 Mar 2010 08:07:11 -0600
Message-ID: <643cc0271003090607x7f0b7dd1vb84618d5910ccd9f@mail.gmail.com>
To: Ian Hickson <ian@hixie.ch>
Cc: public-html@w3.org

On Tue, Mar 9, 2010 at 4:44 AM, Ian Hickson <ian@hixie.ch> wrote:

> There is no problem and the proposed remedy is to change nothing.
> There is no problem.
> One other change proposal says that no technology exists to convert images
> to text. However, this is not true; for example OCR technology has existed
> for decades and is widely available in both commercial off-the-shelf and
> open-source packages.

There is no software that can determine the web page author's intent when
placing an image in a page.

A scenario: a car maker creates an ad page featuring one of its hot new
cars. The car is in a street scene, pulled up in front of a stop sign. The
word "Stop" is legible. OCR's interpretation of the image does not reflect
the ad creator's interest in pointing out how nice the car looks, how sleek
and fast. No, instead the OCR technology would reduce the entire image down
to one word: stop.

Another scenario: the web page author takes a photo of a bear cub at the
zoo, as it tries to stick its head into a ball that has peanut butter
smeared on the inside. The intent of the photo is to show how cute the cub
is, how tenacious its efforts, how difficult the prize.  Alt text could be
along the lines of, "A very young bear cub, determinedly trying its best to
shove its overlarge paws and nose into a plastic ball that has peanut butter
smeared on the inside--bright pink tongue extended as far as it can to
access the tasty treat."

The best image recognition software would give us: bear holding round
object. If there are many areas of high contrast (dappled light, strong
shadows), we probably wouldn't even get "bear"; we'd get some form of animal
holding some form of object. It would never "see" cute. It can't "see"
determined.

There is no image software in the world, and there never will be, that is
fully capable of understanding _why_ the person who added the photo to the
page did so. The most that even the most sophisticated, cutting-edge
applications can do is determine words, whether appropriate to the intent of
the image or
not, or provide a blunt assessment of the image. They can't convey "cute" or
"determined", "fast", or "sleek", because these are subjective values.

An image is more than the sum of its parts.

> That other change proposal also suggests that the spec might make it
> unclear that authors should be the ones that give alternative text, rather
> than automated tools. However, to draw such a conclusion one would have to
> ignore the pages and pages of detailed instructions on how authors must
> write alternative text, and one would have to ignore a big warning placed
> immediately adjacent to the controversial paragraph asserting in no
> uncertain terms that "authors must not rely on such behaviour".

Why muddy the topic, though? With big, garish red letters? Why not keep it
simple: authors, do this. Clean, simple, to the point, without a lot of
extra words, extra text, garish red letters, and vague references to
wonderful technology that doesn't exist and, in effect, can't exist.

Why can't we just keep things simple?

> That other change proposal further suggests that we should not suggest to
> implementors that they help users understand images, because they will do
> so without prompting. However, this would be inconsistent with the style
> of the specification, which is to be explicit about everything and to
> leave nothing to chance, especially not something as important as
> accessibility.

I strongly recommend that you re-do your change proposal and include
references to the other change proposals, because I haven't a clue which
change proposal you're talking about here. I'm only aware of one change
proposal: Matt's original. And that's all that shows in the Issues Status

> Another change proposal suggests that not including more detail would be
> missing out on an opportunity to increase competition in the field.
> However, there's no reason to go overboard; just mentioning one simple and
> unambiguously possible technique like OCR should be enough.

Ditto -- Matt's change proposal does not reflect this text. If you're
referring to the email discussion, those aren't change proposals. That was
just people saying things.

> Change nothing.
> Leaving the text in will encourage implementors to explore the boundaries
> of alternative text repair techniques, increasing the overall
> accessibility of the Web over time.
> Leaving the text without change might fail to highlight possible future
> work, such as performing landmark recognition or facial recognition in
> photographs, reducing the chances that an implementor will investigate
> these groundbreaking image analysis techniques in the context of
> alternative text repair.

I think we can safely say that we've never seen any company fail to use its
newest, whizziest, coolest technology just because there's nothing in a
specification that says, "It's OK, you can innovate now".

> None.
> It is suggested that mentioning that user agents might be able to repair
> non-conforming pages could make authors less likely to write conforming
> pages, though it is not clear why this would apply here and not in the
> many other parts of the spec that mention repair techniques, especially
> the sections that explicitly mandate specific user agent repair
> techniques.

There is a world of difference between repairing an unclosed element and
determining why Jane or Joe put a picture of a bear with a ball on a web
page.

There are some things that can't be repaired. For these, we rely on people,
scary as that may seem to be.

> Ian Hickson               U+1047E                )\._.,--....,'``.    fL

Received on Tuesday, 9 March 2010 14:07:44 UTC
