- From: David Woolley <david@djwhome.demon.co.uk>
- Date: Sun, 29 Aug 1999 19:08:14 +0100 (BST)
- To: cynthia.waddell@ci.sj.ca.us (Waddell Cynthia)
- Cc: lynx-dev@sig.net, w3c-wai-ig@w3.org
Apologies for this size of this; while not directly about Lynx it does have a lot about the world in which Lynx has to survive. The original subject relates to the need to forge the browser identity before some sites will talk to Lynx. > Clinton and is found at http://www.aasa.dshs.wa.gov/access/waddell.htm. I note that both your email (more details in private correspondence) and the web document are bloated, causing accessibility problems for people for whom bandwidth costs money and that the email uses deprecated HTML and the web document uses undefined Unicode characters, e.g.: > </span>As suggested by O’Reilly, an area for future study is “where ^^^^^^ ^^^^^^ An accessible document, in English, should restrict itself to ASCII (  to ~) as far as possible, and failing that to the ISO 8859/1 subset of Unicode, which also includes   to ÿ. Anything in the range  to Ÿ is illegal in HTML, when entered as an entity, although may be used in a particular transfer character set to represent a Unicode character outside of that range, but such use will cause accessibility problems, itself. The correct Unicode characers for what was intended here probably wouldn't be recognized by Netscape. Also the prologue to the HTML looks as though it uses a lot of features that are specific to a proprietory word processor and appears to use HTML comments for what should presumably either be SGML processing instructions, SGML conditional sections or some extension to HMTL. <!--[if VML]><![if !VMLRender]><object id=VMLRender classid="CLSID:10072CEC-8CC1 ^^ The rest may be valid SGML, however I suspect a valid SGML parser would treat the rest as comments. It doesn't have an SGML doctype line and is not HTML 2.0, so it is invalid HTML. It looks like it is really some form of XML aware extension to HTML 4.0, but it doesn't have an XML processing instruction line either. The heavy use of styles suggest that the intended effect is page description, rather than content. It uses a proprietory character set: > <meta http-equiv="Content-Type" content="text/html; charset=windows-1252"> and attempts to set in a way that requires HTML 4.0, and is then only tolerated, not encouraged (I don't want to go online to check whether the server sent this in the proper place, or worse, sent conflicting information). Incidentally, this does not permit the use of ’ as entities are interpreted in the canonical, Unicode character coding, not in the transfer coding. Every paragraph seems to have a class based on the default MS-Word classes and there seems to have been no attempt to define classes to represent the deep structure. Even neutral body text has a class on each paragraph, when the sensible thing would be to have no class, or to use DIV to set a class for the whole run of normal paragraphs. This looks like something very close to a literal translation of the MS-Word internal coding, in the same way that MS RTF is. There is not a single <li> element in the whole document, even though the visual appearance of lists has been simulated by the use of runs of non-breaking spaces. E.g. this monstrosity (<B7> indicate the character with that code, not an HTML tag): > <p class="MsoNormal" > style="margin-left:.25in;text-indent:-.25in;mso-list:l35 > level1 lfo34; tab-stops:list .25in"><![if !supportLists]><span > style="font-size:16.0pt; font-family:Symbol"><B7><span style="font:7.0pt > "Times New Roman""> > </span></span><![endif]><span style="font-size:16.0pt">Removes age-related > barriers to participation in society<o:p/></span></p> Should have been: > <li>Removes age-related barriers to participation in society (possibly with a style for LI in the style sheet, and just possibly with a class on the LI). Inicidentally, the paragraph preceding this example mis-renders on Lynx, because it can't see the styling, but would have rendered correctly if the shorter, semantically accurate, alternative had been used: > · Removes communications and information access barriers that > restrict business and social interactions between people with and > without disabilities as against: > * Removes communications and information access barriers that > restrict business and social interactions between people with and > without disabilities (NB, whilst proper identification of list structure is potentially a very strong accessibility feature, authors actual accessibility aids (assistive technology in your jargon) may have expended most of their effort in working round the sort of HTML in your web page and not taken advantage of being given the structure on a plate.) Also, in the above, it almost certainly also uses fonts to misrepresent a character to obtain a visual effect, namely the shape of the bullet. Any selection of the MS Symbol font in HTML is extremely suspect in accessible HTML - I don't know of any common browser that handles it properly, i.e. only use Symbol if the selected Unicode character exists in symbol, and translate from the Unicode character to the Symbol code point for the same character, rather than using the numeric value of the character to directly index using the proprietory coding vector for the font. (In this case, the actual character specified, a full stop sized centre dot, doesn't look too bad.) Also, I note that Front Page has been used at some stage. Even Front Page 98 can easily generate illegal HTML, e.g. <b>...<p>...</b>, and I think your Front Page 3. is worse. You break one of the accessibility guidelines by not making your anchors describe the link - anchors in the body are just reference numbers, and anchors in the references are the URL. I think the former is due to the authoring tool not understanding the medium and simply using automatic footnote numbering. (Actually Lynx will do this numbering itself, so I end up with two numbers for each reference!). > Hit Counter It's no use for me to know this, and I won't have been counted anyway, as the counting is a byproduct of fetching the image of the counter! This should have alt="". (Generally this sort of hit counter, and hit counting technology in general, increase bandwidth, and are undesirable where bandwidth costs are an issue; they force an end to end access, reducing the effectiveness of local caching.) Taking content, now. > entity's purchasing choices create. Whenever existing technology is > "upgraded" by a new technology feature, it is important to ensure that > the new technology either improves accessibility or is compatible with > existing assistive computer technology.[40][38] For example, > web-authoring software programs that erect barriers in their coding of > web pages fall under this scrutiny. This assumes that there is a real market, but the reality is nearly every business is standardising on the current (a moving target) generation of MS Office, and the problems I point out in your web page indicate that you have fallen into the same trap (the version used indicates a very recent purchase). There are two approaches to producing "accessible" (to other businesses) documents in the business world: lowest common denominator or following de facto standards - nearly everyone goes for the latter. Most businesses don't want to incur any costs in complying with legislation that doesn't benefit their bottom line, so, if they have no choice because there is a monopoly, they are quite happy, and the suppliers can get on with supplying what the buyers think they really want. > Although America Online is one of the most popular services for accessing > the Internet, it is currently not a good choice for consumers who are > blind. Expert screen reader users find it extremely difficult to navigate > because of its nonstandard controls (buttons and icons) and lack of keyboard > commands. What the paper fails to consider is that these features, and many other accessibility problems, exist because of conscious decisions that they should exist, in order to make the products attractive to the primary market, to establish a house style, and for portal sites, to force people to search the page so they are more likely to read the adverts. The primary purpose of many web sites is not to provide information to consumers but to provide readers to advertisers, and data to market researchers. If you don't understand that accessiblity is often in conflict with the perceptions of commercially good web site design, and in the case of advertising sites may break the effect by revealling the paucity of information actually being supplied, you can't sensibly come up with ways that site designers will identify with. Go to your local book shop and you will find that nearly all the books on HTML are telling people all the tricks that result in inaccessible pages! You have to go to the source documents, like the HTML 4.0 specification, which most authors consider too technical, to get a different picture. > part of her work in advising City Council because the documents were > posted in an inaccessible format, ie. portable document format (pdf). You have to consider that the alternative is often a proprietory word processor format, or a very messy attempt by that word-processor to produce plain text, or, for newer documents, the sort of semantically broken HTML in your web page. Actually one of the design goals of PDF is to allow extraction of the text, and this is quite possible if output is designed for PDF creation, but can be rather more difficult when word processors insist on positioning almost every character individually. E.g. if I use PDF created from an nroff source file (which itself is directly readable) and run the ghostscript ps2ascii utility on it, nearly every word survives intact, although there may be a few spaces in the middle of words. I would expect Acrobat to do even better. Doing the same on an MS-Word document will stress Acrobat to the limit. > installed base. Microsoft has been accused of trying to extend both > Java and HTML in proprietary directions.[92][89] Both Microsoft and Netscape have done this; in fact it could be argued that most of your web accessibility problems stem from Netscape's pandering to commercial *wants*, which was probably the basis of the business plan that turned a group of NCSA people into the Netscape company. It seems to me that W3C has been fighting a rearguard action to try and regain as much as possible of the original spirit of academic freedom of information. > Perhaps research is needed on how to best manage open standards where > civil right protections are afforded the community of people with > disabilities in their access to technology. As suggested by O'Reilly, At the core of this is the question of the role of the state in regulating industry; something, I believe, of a hot potato in US politics. If you are serious about writing accessible documents, start with Notepad, not MS-Word, write everything in plain text, then add HTML structures to reflect the structure of the document. Only then start considering appearance. That way you will think about the contents, then its structure, which are the things that need to be preserved if you want good accessibility. If you can't read the HTML as plain text, you have probably got it wrong for an accessible document. The two worst things for accessibility are authoring tools and cut and pasting from sites which "look good" without understanding the structure. If you want to make the web page accessible, you have to decide whether you want it to be accessible in its current visual form, in which case, in spite of it's negative position on PDF, PDF is the appropriate choice, or whether you want the content to be accessible, in which case you should save it as text and edit in the HTML markup by hand. One final point to remember, is that most web site are not about communicating information; they are about selling, and these days the two are almost incompatible. (It's an interesting point that, in spite of the accessibility lobby's objections to PDF, PDF is where you will find the real information on most commercial sites, and the HTML on the sites is used where PDF would have been a better vehicle, because those parts are all form and no content!) [ Note I am only subscribed to the lynx list, not the accessibility list, and, if the latter is closed, this may fail in that list. ]
Received on Sunday, 29 August 1999 18:03:15 UTC