An elaboration of why plain text is still needed as an alternate format from Jamal Mazrui on 1998-12-08 (w3c-wai-ig@w3.org from October to December 1998)

From: Jamal Mazrui <empower@smart.net>
Date: Tue, 08 Dec 1998 09:16:35 +0400
To: <w3c-wai-ig@w3.org>
Message-Id: <199812081316.IAA15162@gemini.smart.net>

There is not yet a Windows-based HTML browser that provides
efficient access with most screen readers.  This holds even for
the latest versions of Netscape Navigator, Internet Explorer, and
Opera.  After Microsoft releases Service Pack 2 for
IE4, this browser, combined with Microsoft Active Accessibility
1.2, has the potential of providing efficient access.  Until that
happens and users have validated it through experience, however, it
is a promise and not a reality.  IBM and Productivity Works also have 
good accessible browsers in the making, but they are commercial and 
also not yet proven to the disability community.

Even when there is an accessible HTML browser on the dominant
Windows platform, a large proportion of users with and without
disabilities will not be able to take advantage of it because they
will not have the hardware necessary to comfortably run this
software.  A comfortable experience requires a minimum of 32 MB
of RAM and a processor clock speed of 200 MHz.  People with disabilities are
often economically poor and not able to keep up with the latest
hardware requirements.  People from developing countries also face
this situation, whether they have disabilities or not.

It has been suggested that, even if one does not use a contemporary
graphical browser, it is a trivial matter to use one to produce a
good plain text rendering of the HTML page.  Unfortunately, I have
not found this to be the case.  An initial problem is that the line
length produced will often be greater than 80 characters unless one
takes special steps to prevent this from happening.  On a standard
80-column text terminal, whatever extends past the right margin is
off screen, so some information is lost when scrolling down
continuously through the text.  If one has to navigate to the right
margin whenever some words appear to be missing, the reading
process becomes cumbersome, unpleasant, and inefficient.
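
As a rough illustration of the "special steps" for line length only,
here is a minimal sketch in Python.  It is not taken from the original
message; the function name rewrap and the default width of 80 are
assumptions made for the example.  It re-wraps each blank-line-separated
paragraph at word boundaries and leaves a single over-long word (such as
a long URL) intact rather than splitting it.

```python
import textwrap


def rewrap(text, width=80):
    """Re-wrap blank-line-separated paragraphs to at most `width` columns.

    Wrapping happens only at whitespace, so a single word longer than
    `width` (for example a long URL) stays on its own over-long line.
    """
    wrapped = []
    for paragraph in text.split("\n\n"):
        # Collapse the paragraph's existing line breaks and extra spaces.
        flat = " ".join(paragraph.split())
        if not flat:
            continue  # skip empty paragraphs
        wrapped.append(
            textwrap.fill(
                flat,
                width=width,
                break_long_words=False,
                break_on_hyphens=False,
            )
        )
    return "\n\n".join(wrapped)


if __name__ == "__main__":
    sample = ("word " * 40).strip() + "\n\nA short second paragraph."
    print(rewrap(sample))
```

This addresses only the right-margin problem; it does nothing about
extraneous material in the rendered text.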

A second problem in plain text rendering is that HTML pages usually
contain headers, footers, and embedded navigational references that
get included in the plain text rendering but are not material to
its content, and thus a distraction.  I generally have to go through
the initial plain text rendering to clean out such extraneous
material.  I've developed macros that help speed up this clean-up 
process, but it requires significant time and effort to edit each page in 
order to get crisp content.

I've tried every DOS command-line utility I could find that converts 
from HTML to text.  Only one, HTMSTRIP (from 
http://www.geocities.com/SiliconValley/Lakes/2414), does a satisfactory 
job, in my opinion.  This further indicates that rendering good plain text is a 
nontrivial matter which should not be assumed practical for the 
average user.
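
To show how little the simplest kind of conversion covers, here is a
minimal sketch in Python using the standard html.parser module.  It is
an assumption-laden illustration, not HTMSTRIP's method: the names
TextExtractor and html_to_text and the choice of block-level tags are
invented for the example.  It drops markup, script, and style content
and puts block-level elements on their own lines, but it makes no
attempt to remove page headers, footers, or navigational links, which
is the part that takes manual effort.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring script and style content."""

    SKIP = {"script", "style"}
    BLOCK = {"p", "br", "div", "li", "tr", "pre", "blockquote",
             "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append("\n")  # start block elements on a new line

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            if self.skip_depth > 0:
                self.skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """Return the text of `html`, one non-empty line per block of text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    lines = (" ".join(line.split()) for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    page = ("<body><p>[Home] [Next] [Index]</p><h1>Report</h1>"
            "<p>Body &amp; <b>content</b>.</p><script>var x = 1;</script></body>")
    print(html_to_text(page))
```

In the sample page, the navigation line "[Home] [Next] [Index]" survives
in the output alongside the real content, which is exactly the kind of
material that still has to be cleaned out afterward.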

Regards,
Jamal

Received on Tuesday, 8 December 1998 08:16:38 UTC