Data collection and analysis (was RE: 48-Hour Consensus Call: InstateLongdesc CP Update) from John Foliot on 2012-09-19 (public-html-a11y@w3.org from September 2012)

From: John Foliot <john@foliot.ca>
Date: Wed, 19 Sep 2012 12:11:13 -0700
To: "'Maciej Stachowiak'" <mjs@apple.com>, "'Benjamin Hawkes-Lewis'" <bhawkeslewis@googlemail.com>
Cc: "'Joshue O Connor'" <joshue.oconnor@cfit.ie>, "'Leif Halvard Silli'" <xn--mlform-iua@xn--mlform-iua.no>, "'Silvia Pfeiffer'" <silviapfeiffer1@gmail.com>, "'Steve Faulkner'" <faulkner.steve@gmail.com>, "'Sam Ruby'" <rubys@intertwingly.net>, <public-html-a11y@w3.org>
Message-ID: <012101cd969a$8b634080$a229c180$@ca>

Maciej Stachowiak wrote:
> 
> Some browser vendors (including Apple) have the ability to gather data
> on real-world usage as actually observed by users. Generally for
> privacy considerations we cannot log individual URLs. But we could log
> data such as:
> 
> - What proportion of images have a longdesc attribute
> - What proportion of those images have obviously wrong longdesc URLs
> (empty, #, appears to be an image, top-level URL of a domain, url of
> the same page that contains the image, etc)
> 
> Would folks see such data as more credible? It would be significant
> effort and we could not reveal the raw numbers. I suspect many would
> reject such data as not publicly reproducible.
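
The checks listed above could be sketched roughly as follows. This is only
an illustration under stated assumptions, not anything a browser vendor is
known to run: the function names, the labels, and the image-extension list
are all hypothetical, and an in-page fragment such as "#desc" is assumed to
be a legitimate target rather than a "same page" error.

```python
from urllib.parse import urljoin, urlsplit, urldefrag

# Hypothetical list of extensions treated as "appears to be an image".
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp", ".bmp")


def longdesc_problem(longdesc, page_url):
    """Return a label for an obviously wrong longdesc value, else None."""
    value = longdesc.strip()
    if value == "":
        return "empty"
    if value == "#":
        return "bare-fragment"
    # Resolve relative references against the page that holds the image.
    resolved = urljoin(page_url, value)
    parts = urlsplit(resolved)
    if parts.path.lower().endswith(IMAGE_EXTENSIONS):
        return "image"
    # Assumption: only a link back to the page itself, with no fragment,
    # counts as "same page"; "#desc" style targets are left alone.
    if not parts.fragment and resolved == urldefrag(page_url)[0]:
        return "same-page"
    if parts.path in ("", "/") and not parts.query:
        return "top-level"
    return None


def summarize(longdesc_values, page_url):
    """Aggregate proportions only; no individual URL is kept in the result.

    longdesc_values holds one entry per image: the attribute value, or
    None where the image has no longdesc attribute.
    """
    total = len(longdesc_values)
    present = [v for v in longdesc_values if v is not None]
    flagged = [v for v in present if longdesc_problem(v, page_url)]
    return {
        "share_with_longdesc": len(present) / total if total else 0.0,
        "share_flagged": len(flagged) / len(present) if present else 0.0,
    }
```

Only the two proportions leave the function, which matches the constraint
that individual URLs cannot be logged.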

Hi Maciej,

A few thoughts:

I can appreciate the concerns over privacy and security that might prevent
raw numbers from being made publicly available. The lack of independent
review, however, does tend to weaken the returned results. That said, any
research data that adds to the collective knowledge base, especially when it
comes to matters of web accessibility, has value to the web community. It
depends in part on how that data is used.

Another thing that I would like us to focus on is the fact that in certain
vertical markets (education for example) we may indeed find a greater
percentage of accurate and useful @longdesc content - remember, entities
such as Pearson Publishing have already shown us that they are using
@longdesc today as they migrate their educational materials online. As Geoff
Freed also pointed out, there is a US federally funded project underway
right now that is focused both on improving the quality of longer textual
descriptions and on adding to the collective "bucket" of this type of
content on the web. There is no doubt that there is some lousy
historical content out there, but I think we could all safely conclude that
there are concrete and fruitful efforts underway to improve that.

For these reasons, while I personally don't want to totally dismiss the
past, I also want to focus more clearly on the future, both in the short
term and in the longer term.

That is why I am most interested in a managed path forward as opposed to a
lot of hand-wringing about how bad things used to be: once
upon a time, before we could reliably use CSS for page layouts, the web was
"polluted" with nested tables 8 layers deep that were nightmarish for screen
reader users to deal with (and I am sure we could find numerous examples,
sadly some of them still current, in the wild as well). The
solution then was not to obsolete tables, but to provide a better path
forward, and coupled with education and increased browser support, teach the
web community the better way. (I'll note that it took approximately 3 to 5
years to turn that tide too.) If the "problem" of longer textual
descriptions is approached in that fashion, I think we stand a very good
chance of making the historical data just that, history.

JF

Received on Wednesday, 19 September 2012 19:11:57 UTC