RE: Why I don't attend the weekly teleconference (Was: Input on the agenda) from Murray Maloney on 2009-07-02 (public-html@w3.org from July 2009)

From: Murray Maloney <murray@muzmo.com>
Date: Wed, 01 Jul 2009 23:54:50 -0500
To: Ian Hickson <ian@hixie.ch>
Cc: Murray Maloney <murray@muzmo.com>,public-html@w3.org
Message-Id: <5.1.1.6.2.20090701163128.0b794c88@mail.muzmo.com>
At 11:20 PM 6/29/2009 +0000, Ian Hickson wrote:
>On Mon, 29 Jun 2009, Murray Maloney wrote:
> > At 09:01 PM 6/28/2009 +0000, Ian Hickson wrote:
> > > On Sun, 28 Jun 2009, Murray Maloney wrote:
> > > >
>
>This isn't about browser implementations necessarily; it's about whatever
>implementations are relevant to the feature. In the case of "axis",
>"longdesc", or "summary", for instance, it might be ATs rather than
>browser vendors. In the case of "h1", it might be browsers, ATs, and
>search engines. In the case of "itemprop", it might be primarily data
>mining tools. Some requirements in the spec are only relevant to validator
>implementors.
>
>However, if the relevant audience of implementors doesn't implement a
>feature, then I don't think it's hyperbole to call it science fiction. For
>instance, HTML4's <object declare> feature is not supported by any of the
>major browser vendors, any of the search engines, and any of the ATs. It
>isn't a feature that Web authors can use and actually have third-party
>software support, because nobody in fact supports it. It is, IMHO, science
>fiction.

Maybe I'm confused, but I understood that JAWS and other AT software
actually did implement summary and longdesc.


>My goal in writing the spec is to not have any features that are ignored
>by the relevant implementors (ATs, search engines, data mining tools,
>browsers, conformance checkers, whoever the particular requirement applies
>to). If the relevant implementors ignore the feature, then yes, it is
>science fiction.

Again, maybe I'm confused, but I understood that JAWS and other AT
software actually did implement summary and longdesc.



> > It is true that browsers can choose to ignore features within HTML. That
> > does not render the features obsolete to other kinds of tools, or user
> > agents. Can we agree on this much?
>
>Absolutely.


So, if AT uses these attributes and browsers choose to ignore them, what is 
the problem?

> > Can you agree that browsers are not the only viewports onto HTML?
>
>Of course. The HTML5 spec lists six conformance classes explicitly, and
>these further break down into many more types of implementations.
>
>
> > If the universe could provide you with evidence that there is sufficient
> > useful data extant to justify the existence of longdesc and summary,
> > how much data would that have to be?
>
>I don't know, it's a judgement call.

That is not a very satisfying answer. I was hoping for a statement of 
principle that the WG could discuss and come to some resolution.


>In the case of summary="" and longdesc="", it's not so much the
>existence of good data that matters, so much as the fraction of the total
>data that is good. (Psychological studies regarding what fraction of a
>user's experiences can be bad before the user stops being willing to risk
>the bad experiences in the hope of a good one would be helpful in guiding
>us here.) Also, if the bad data can be algorithmically filtered, then that
>makes the barrier lower.
>
>The amount of bogus data in alt="" attributes is still high, but it's low
>enough for the attribute to be useful still. So that's probably the kind
>of bar we're looking at. I don't recall offhand exactly what the numbers
>are for alt="" usefulness, but it's still pretty low. (Low double digits
>percentage of the total number of images with alt="" at all? I forget.)
>
>
> > If the Wall Street Journal and its sister news providers began providing
> > all of its feeds with useful AT metadata, would that tip the scale? What
> > if several state/national education systems were to make their curricula
> > to their students available with useful AT metadata? What if state/
> > federal financial reports employed [...]
>
>If even a single one could do this (organically, i.e. not just because
>someone interested in the outcome of this discussion convinced them to do
>it), and did so in a way that showed that summary="" data was better off
>hidden from non-AT users, that would certainly be significant.

I'm confused again. First you indicated that you would need to see 
sufficient correct markup and content to justify the attributes' 
continuation. Next, you place a condition that seems to say that if someone 
were able to convince the WSJ to do it, then you might suspect the result 
because it was not 'organic'. Finally, you go back to what I consider to be 
a red herring about the data being hidden from non-AT users.

I tried to explain, in an earlier message to David Singer, why 'hidden
data' is a red herring. I did receive an acknowledgement that nobody ever
said that summary or longdesc had to be hidden -- that is just what
browsers do, for whatever reason. AT, as you stated, is one of the
application classes. I don't see why it matters that browsers don't deliver
this information to their users, and I don't think that other application
classes should be restricted because browser developers don't want to
implement that part of the spec. Please explain yourself for all of our
benefit.



> > There's more than one way to skin a cat, and if all it takes is to get
> > somebody to turn on a bit of XSLT and populate a few web sites, then
> > maybe you can get your data and everybody will be happy. So, seriously,
> > how much data from a legitimate publisher would warrant a reversal of
> > your position.
>
>Any data at all showing summary="" or longdesc="" being organically used
>in a useful manner specific to ATs and not other users would be
>significant. (So far, the examples of summary="" being used organically in
>a positive way have actually been cases where the text in the summary=""
>attribute would have been useful to non-AT users also.)

Again, nobody except for browser implementors has said that the content of
these attributes should be hidden from non-AT applications. Quite the 
contrary. This is a red herring. Let's catch it and fry it up!

> > > > Moreover, the proponents of both summary and longdesc disagree with
> > > > your assessment.
> > >
> > > Disagreeing by assertion the results of objective studies isn't "fair"
> > > either. I could assert that the financial markets have done nothing
> > > but grow in the past three years, but presumably you would dismiss
> > > such statements as groundless. This is, IMHO, no different.

That's fair dinkum.

Anyone out there with a corpus of work that uses longdesc and summary 
usefully and correctly? Care to produce a study that demonstrates the 
efficacy of these attributes?


> > With these AT attributes, the accessibility community is trying to
> > educate publishers and it is taking a long time.
>
>I think fundamentally that approaches to accessibility that rely on
>education are basically doomed. We need to have accessibility be much more
>automatic than that. We need to make it easier to write accessible pages
>than to not do so, even for people who don't care about accessibility.
>This is why, for instance, we have separation of presentation from
>semantics as such a core feature in HTML5 (and HTML4) -- it's not called
>out as an accessibility feature, but it gets authors into the mindset of
>thinking of what they mean, not what they want it to look like, and that
>helps AT users.

"Approaches to accessibility that rely on education are basically doomed." 
Ian Hickson, 2009

Does the WG agree with this principle?


>When we _do_ still need to rely on education, IMHO we should do so in a
>way that leads to really simple rules.
>
>Instead of:
>    "Describe the structure of your table in a few sentences in the
>    summary="" attribute."
>...have:
>    "All tables should be explained in their <caption>."
>This then helps both AT users _and_ users with cognitive difficulties
>_and_ users who aren't familiar with the subject matter _and_ is done in a
>way that is immediately verifiable by the author.
>
>This seems to me to be a net win.

Not to me.

First of all, you are conflating a table summary with the editorial and
typographic element known as a caption -- for which editorial and
typographic style guides have precedence over the HTML spec. Secondly, you
are asking authors to insert speed bumps into the content of their captions
or related prose. Thirdly, all of this would seem to entail 'education',
which you have already indicated is doomed to fail.


>Similarly, instead of:
>
>    "For important images, add a longdesc="" attribute with a link to a
>    page that describes the image."
>...have:
>    "Make sure important images are described in the prose."
>...or:
>    "For important images, add a link to a page that described the image."
>
>This way authors don't have to learn a new technique (longdesc=""), they
>can just continue using the techniques they use every day, like <a
>href="">. This leads to the information being available to everyone, not
>just ATs, _and_ leads to the author _seeing_ the information and thus
>increases the likelihood that the information will be reviewed.

Hmmm. More likely that authors will not include a long description or a 
summary because style guides would prevent them from doing so.

> > [Frankly, the browser vendors could win a lot of good press by stepping
> > up with the big publishers and provide better accessibility for
> > everybody. A well-written table summary could help a lot of people,
> > especially in financial reports.]
>
>I agree entirely. We should make these summaries available to everyone,
>though, not just AT users.

Where 'we' is the browser implementors? I don't disagree.

Browser folks, wanna step up and solve this part of the problem?

> > > > I could agree that the publishing market has not yet adopted these
> > > > features as fully as the AT market and its supporters would have
> > > > liked.
> > >
> > > The problem isn't so much lack of adoption so much as the
> > > overwhelmingly incorrect use of the features when they _are_ used.
> >
> > Which relative percentages could be overcome tomorrow if the right
> > publishers flip a switch.
>
>Agreed. But will that happen?

I don't know. I am spending my time discussing this in this forum in the 
hope that somebody will flip that switch and bring some helpful data to the 
table.

> > But the technology has not matured due to a social problem, not a
> > technical failure.
>
>If anything, that makes it worse -- social problems are far harder for us
>to fix than technical failures. We can't just ignore the social problems, 
>we need to route around them.

I agree, but trying to force wheelchairs to use the stairs is not helping.

> > > Consider another attribute, like "axis". This is an attribute intended
> > > for accessibility purposes, just like "summary". Is _it_ mature?
> > > Should we keep it? Drop it? Why?
> >
> > Well, that's not fair either. :-) Axis/axes happens to be a favorite of
> > mine and was the subject of a chapter in an SGML book I completed for
> > Yuri Rubinsky in 1997. If browsers today were able to process axis/axes
> > and its use were adopted more widely it would aid the comprehension of
> > tables.
>
>If. :-)

Ya, I know. I have no power to get browsers to do anything. I used to. In 
Panorama, axis/axes was demonstrated. Get a copy of the book that I 
finished with Yuri Rubinsky. I think that it's still available on Amazon.

It's a bit frustrating that today's browsers still don't do some things 
that Panorama did in 1995.

> > I would keep it in because it costs you nothing to include a feature
> > that you do not expect many/any browsers to implement.
>
>Every feature has a cost, e.g.:
>
>  - documentation in the spec
>  - writing of test cases
>  - review of test cases
>  - tutorials
>  - time spent by authors determining if the feature can be used or not
>
>We shouldn't ever make the mistake of assuming a feature costs nothing.

Fair enough. I guess I meant no incremental cost.

> > If/when they do, users of such tables would benefit immensely.
>
>If/when an implementation wants to have this data, then we can add it to
>the spec. In the meantime, if we have the feature but there are no
>implementations, the data in the attribute is just going to be bogus
>(because anything that doesn't get tested is much more likely to have
>unchecked and thus undiscovered errors).

Again, I understood that JAWS and other AT did use these attributes.

>[...]



> > Can you agree that longdesc and summary are not in themselves faulty and
> > that the real problem is a social problem related to lack of useable data.
>
>Sure. The end result is the same, though.
>
>
> > > Unfortunately, it has been demonstrated that this particular approach
> > > doesn't work in wide deployment on the Web, because of the small
> > > fraction of the authoring base who specify these attributes, a large
> > > proportion specify useless values that are hard or impossible to
> > > programmatically distinguish from useful values, and thus these
> > > attributes in fact end up _not_ being easy for an application to
> > > ignore unless they ignore them wholesale (at which point the value of
> > > the attribute is lost, and we would be doing authors a favour by
> > > letting them know that providing the attribute at all is a waste of
> > > time).
> >
> > But that argument will apply to whatever solution is proffered, won't
> > it.
>
>It doesn't seem to apply to the proposed <caption> solution, since that
>would get seen by authors and thus reviewed, and thus not contain bogus
>data in anywhere near as many cases, and thus wouldn't need to be ignored
>in the first place.

Except that authors are not going to pollute their captions with a detailed
description of a photo. That's a speed bump. Readers will ask "Why is the 
author condescending to me by describing exactly what I am seeing with my 
own eyes, and why is the caption bigger than the darned photo?"

> > We can never be sure that a text input attribute will contain the right
> > information unless we so constrain the attribute as to make it unusable as
> > a general text container.
>
>As you say, it's a social problem. We can dramatically increase the odds
>of the data being not bogus by making it visible to authors.

You keep saying that these attributes are not visible to authors. How then 
do they input the attribute values? I have worked with several HTML 
editors, all of which have a way for me to see and edit the content of any
attribute. If some don't, so what? Isn't it up to the editing tool 
implementors to decide how to support HTML features?


>Anecdotally (I haven't got precise numbers to give on this, but it's based
>on the random studies I've done looking at the Web), text in attributes
>that aren't visible to authors at all (longdesc="", summary="") is largely
>bogus, text in attributes and features that are visible to authors if they
>go out of their way (title="", <title>) is bogus some of the time and
>useful some of the time, and text that is always visible (<h1>, <p>) is
>usually useful.
>
>I would hypothesise that there is a direct correlation between the quality
>of data and the extent to which it is visible to the page's maintainer.

Interesting hypothesis. I frequently encounter href= values that are wrong. 
In fact, one of the first things we did at SCO in 1993 while building the 
UNIX/X11/Motif doc set was to write a tool to test href values to make sure 
that we didn't have any broken links. But even though we could then be 
certain that all href values were connected to a URI that was valid within 
our doc set, we still couldn't test that each link was going where the 
author intended, or even that the author's intention was appropriate.
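
As an illustration only (this is not the 1993 tool), a check of that kind
could be sketched in Python as follows. The names HrefCollector and
broken_links are hypothetical, and the sketch assumes the doc set is held
as a mapping from page name to HTML source, with links written as plain
page names.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href value of every <a> element in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def broken_links(pages):
    """Return (page, href) pairs whose target is not a page in the doc set.

    `pages` maps a page name to its HTML source (an assumption made for
    this sketch). Fragment identifiers are stripped before the lookup;
    external URLs and relative paths are not handled.
    """
    broken = []
    for name, source in pages.items():
        collector = HrefCollector()
        collector.feed(source)
        for href in collector.hrefs:
            target = href.split("#", 1)[0]
            # An empty target is a same-page fragment link; skip it.
            if target and target not in pages:
                broken.append((name, href))
    return broken
```

A check like this only confirms that each target exists within the doc
set. It says nothing about whether the link goes where the author meant.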

>[...]


> > > > It is true that those attributes will be misused on some/many/most
> > > > HTML pages, just as other HTML attributes are often misused. But
> > > > that doesn't mean that it won't be useful when it is.
> > >
> > > Actually, that's exactly what it means. When the overwhelming majority
> > > of the data is bogus, you cannot know when it is not, and thus even
> > > good values become useless.
> >
> > You assert that I cannot know. But there are ways that I can know that a
> > given publisher, perhaps the one through whom I receive my Reader's
> > Digest, is providing me with useful data.
>
>Granted, you could know from experience that a site has good data, and you
>might be tempted to check based on the claims of someone you trust. But as
>a general rule, when you go to a random Web page, you don't know, and
>can't know, and more importantly the user agent has no way to know and
>thus can't do anything on behalf of the user (e.g. automatically choosing
>whether or not to use the available data).

Yep, all of that is true. That truth, however, also applies to the entire 
content of a web page. When I go to any random web page, neither I nor the
UA can be sure that every href= value is useful. Even so, users have come to
depend on those pages having traversable links. It wasn't always thus, as 
you know. Moreover, spam sites often try to mislead you into thinking that 
the URI shown on screen is the actual target address. Our browsers don't 
protect us from evil.

Still, you seem to be allowing for the possibility that sufficient useful 
data could help you reconsider. That is helpful.

> > Or perhaps I could use profile="http://www.at-enabled.org" (fictional)
> > to specify that I am promising to provide useful metadata. So it is
> > possible to place a seal of approval on a document.
>
>If we had such a seal, it would be used by lots of people who didn't
>actually do the right thing. So it wouldn't actually tell you the quality
>of the aforementioned attributes.

Sure, but if the AT community, including implementors, content providers 
and consumers, were willing to accept that risk, then could you get behind
such an idea? I am not making a concrete proposal, I am just testing the 
waters.


>For example, the HTML4 DOCTYPEs are used by people who don't follow the
>relevant DTDs an order of magnitude more often than by people who _do_
>follow the relevant DTDs.


That doesn't mean that I, and millions of others, can't use the
HTML4 DOCTYPE profitably. It just means that I might have a hard time of it
if I were processing other people's data.



> > > > That may not seem like a very satisfying engineering solution, and
> > > > it isn't. But so what? If it only helps a few people to read a good
> > > > book or a newspaper or their company newsletter, then haven't we
> > > > made the world a better place.
> > >
> > > It's not that >99% of the relevant _users_ can't use these attributes
> > > and <1% of the relevant _users_ can use these attributes. It's that
> > > 100% of the users will find them useless >99% of the time, and they
> > > have no way to know ahead of time which the <1% of the cases are, and
> > > therefore they will act as if the attribute is useless 100% of the
> > > time.
> >
> > No I won't. I will point at sites that I know I can read. I will be
> > disappointed that I can't read everything that my neighbour can read,
> > but my life will have improved, if only slightly. And that's just me.
>
>I want better than this. I want the blind user to be able to read all of
>the same data I can read.

Me too. With respect, I do not believe that such a goal is within your 
reach. If you have a plan to make the web accessible by default -- one that 
eschews education -- I would love to see it. I'm sure that we all would. In 
the meantime, the wheels of the AT machine turn very slowly because they 
have very little grease.

The consumers of AT need content. They are willing to accept that not all 
sites are accessible to them, just as I am willing to accept that sites 
that use Flash are inaccessible to me -- simply because I won't install 
Flash on this particular computer. I just put those sites on my block list. 
Yes, that means that I am missing a lot of information, but I'm OK with 
that. And I am sighted.


>I think we should be aiming for solutions that have a chance of improving
>the experience for AT users across more sites than the current solutions.

I think that that will be a function of more producers providing useful 
content. In particular, I am thinking about the millions of financial 
tables that are published by [Yahoo! Bloomberg MarketWatch] every day. Or 
pictures stored all over the Web. Wouldn't it improve everybody's life if 
you could see the long description of the original photo while viewing the 
thumbnail photos in a catalog? That would improve accessibility in an
unexpected way.

>[...]



> > > > And at what cost? Some HTML attributes that most browsers will
> > > > ignore and some will support.
> > >
> > > The cost of these attributes is that people who _do_ want to help
> > > authors will spend time writing help text that will be ignored by many
> > > of their users. Instead of improving accessibility in ways that
> > > actually improve accessibility to many users, authors will think they
> > > have improved their site's accessibility while in fact having done
> > > little to truly help users.
> >
> > I don't agree with your conclusion. It is a logical leap that is
> > unfounded.
> >
> > As I have written, when a site publishes the fact that they are
> > employing AT attributes properly and the community discovers that it is
> > true, the users of that site will benefit. The fact that other sites do
> > not will not prevent me from reading a site that does.
>
>I don't know about you, but this doesn't describe how I browse the Web.
>I don't limit myself to a few sites that have a community that I belong to.

I understand, but you are not visually impaired. You are not representative 
of that audience.



>[...]
> > Today, I think that you are neglecting evidence, both technical and
> > social, as pertains to various parts of the HTML 5 specification. I keep
> > trying to figure out how to present the case in a new way so that you
> > can see what is so obvious to me and others, but I haven't figured it
> > out.
>
>I feel the same way in reverse. :-)
>
>
>I think what you describe would be fine. I don't really mind if it's
>within the caption or legend of the image or table that the information is
>provided; my point is just that it shouldn't be in attributes that are
>hidden to non-AT users.

Again, I think this is a red herring.

[...]
Received on Thursday, 2 July 2009 03:56:55 UTC