Re: More on <CAPTION> element etc from Sander Tekelenburg on 2007-06-21 (public-html@w3.org from June 2007)

From: Sander Tekelenburg <st@isoc.nl>
Date: Thu, 21 Jun 2007 19:42:53 +0200
To: public-html@w3.org
Message-Id: <p0624062bc2a06493e5b0@[192.168.0.102]>

At 14:07 +0100 UTC, on 2007-06-21, James Graham wrote:

> Joshue O Connor wrote:

[...]

>> for example the CAPTION element may be
>> at the end of the table so the user may have already explored the table
>> before they get to the caption.

What tool did you try this with? I would think that HPR, which appears to be
a true (talking) browser, may well in fact present such a caption before the
table. Of course most people seem to rely entirely on screen readers though.

> Seriously, non-visual UAs won't read out the caption before the table body,
> regardless of source order? I would expect <caption> to be the _first_ thing
> they would look at for a description of the table, irrespective of where it
> appears. Is there some reason this isn't done?

My impression is that the explanation is in the name, "screen reader". Indeed
you'd think a talking browser would act as you say, but it seems that it is a
relatively recent thing that screen readers look at the actual HTML
themselves. The technology seems to have started out as something that more
or less literally 'reads the screen' -- take what is sent to the screen and
convert that to speech or braille.

(Starting out with that approach probably made sense at the time, because any
other approach would mean either learning to digest every document type
yourself, or at least relying on the OS to provide a digestible
representation of it. Not to mention the UI for interacting with data. Taking
the final visual output as input probably was (still is?) the only option to
make everything that is available through sight available to other senses.
Better approaches would be more dependent on the OS and each individual
application making their functionality available through a decent API, in a
standard/coherent way. You'd have better results, but not across the entire
OS/all apps that users want to use, because too many developers will not
bother to do the work.)

To provide a more intelligent presentation of content, such software can
receive some interpretation of the HTML from the host OS, which it in turn
gets from the GUA. I wouldn't want to have to be the wizard that's expected
to turn IE's output into a useful presentation of a Web page. (I believe that
since version 6 or 7, Jaws can also take Firefox's output as input, which
probably makes the wizardry a bit easier.)

But to be able to provide a truly useful presentation, such a tool will of
course need to parse the HTML itself. As I understand it, that last approach
has only recently started to happen in screen readers.
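
To make that a bit more concrete, here is a rough sketch of what parsing the
HTML itself could buy such a tool: the caption can be announced first, wherever
it sits in the source. It assumes Python's standard html.parser module and a
single, non-nested table. The names CaptionFirst and speak_order are made up
for illustration; this does not describe how any actual screen reader works.

```python
from html.parser import HTMLParser


class CaptionFirst(HTMLParser):
    """Collect a table's caption text and cell text separately (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.caption = []   # text fragments found inside <caption>
        self.cells = []     # one string per <th>/<td>, in source order
        self._in_caption = False
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "caption":
            self._in_caption = True
        elif tag in ("td", "th"):
            self._in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "caption":
            self._in_caption = False
        elif tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_caption:
            self.caption.append(data)
        elif self._in_cell:
            self.cells[-1] += data


def speak_order(html):
    """Return the strings to announce: caption first (if any), then cells in source order."""
    parser = CaptionFirst()
    parser.feed(html)
    parser.close()
    caption = " ".join("".join(parser.caption).split())
    cells = [" ".join(cell.split()) for cell in parser.cells]
    return ([caption] if caption else []) + cells
```

Fed a table whose caption comes after the rows, speak_order still returns the
caption as the first item. A tool that only sees the rendered output would need
the OS or the browser to expose that relationship instead.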


Disclaimer: it's difficult to get reliable information on this topic. I don't
consider myself a specialist at all. But since we seem to not have
developers/designers of these types of tools aboard, I choose to share what
little and possibly not entirely correct information I have.

Perhaps the chairs could actively try to get the vendors[*] of non-visual web
browsers on the HTML WG? If they'd participate, it would become easier to
find out what problems they run into; why they take the approaches they take.
That should make these discussions a bit more concrete. (I don't understand
why they haven't found their own way here yet. Seems like the opportunity of
a lifetime for them.)


-- 
Sander Tekelenburg
The Web Repair Initiative: <http://webrepair.org/>
Received on Thursday, 21 June 2007 17:46:59 UTC