- From: olivier Thereaux <ot@w3.org>
- Date: Tue, 20 Sep 2005 10:54:46 +0900
- To: mam@theory.Stanford.EDU
- Cc: www-validator@w3.org
Dear Mr. Knuth (or, by proxy, dear Ms. McLoughlin),

Thank you for your feedback on the Markup Validator.

As you noted in your message, this tool is managed by "system people who are supposedly committed to helping the world's users from all the various cultures". Indeed, the development and maintenance of this validator are done mostly by a group of volunteer developers, with the help of a user community here on the www-validator mailing list. This community is working very hard to make this tool as good as possible, and is dedicated to making the service useful and helpful for people around the world.

The validator checks documents against the document type they claim to be using. In an overwhelming number of cases, the document type is a well-known standard, and the validator has a library (a cache, so to speak) of all the formal public identifiers (e.g. "-//W3C//DTD HTML 4.01//EN"), which allows for speedy validation without needing to fetch the actual DTD.

That library had, however, become bloated and hard to maintain, which made it a liability. The community behind the validator decided to strike a balance by keeping only standardized document types in this library. We did not, at the time, communicate much about this decision, and as the person responsible for communication around this project, I genuinely apologize for this.

We also made sure the validator would have proper support for other doctype constructs, so that users of proprietary document types could declare:

  <!DOCTYPE html PUBLIC "-//MyOrg//MyDoc//EN" "http://www.example.org/mydoc.dtd">

or

  <!DOCTYPE html SYSTEM "http://www.example.org/mydoc.dtd">

which are the recommended ways of declaring the DOCTYPE when using a non-standard DTD. Using a non-standard DTD with only:

  <!DOCTYPE html PUBLIC "-//MyOrg//MyDoc//EN">

is technically acceptable but dangerous, since there is no guarantee parsing agents will know the public identifier.

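To make the distinction between these forms concrete, here is a minimal sketch in Python. It is not the validator's own code: the function name classify_doctype and the regular expression are assumptions made for illustration, and the pattern only handles double-quoted identifiers with no internal subset.

```python
import re

# Hypothetical, simplified pattern for HTML-style DOCTYPE declarations.
# Assumes double-quoted identifiers and no internal subset.
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+\w+\s+'
    r'(?:PUBLIC\s+"(?P<public>[^"]*)"(?:\s+"(?P<system>[^"]*)")?'
    r'|SYSTEM\s+"(?P<system_only>[^"]*)")\s*>',
    re.IGNORECASE,
)


def classify_doctype(declaration):
    """Return which of the three forms a DOCTYPE declaration uses, or None."""
    match = DOCTYPE_RE.search(declaration)
    if match is None:
        return None
    if match.group("system_only") is not None:
        return "system"          # SYSTEM "url": the DTD can be fetched
    if match.group("system") is not None:
        return "public+system"   # PUBLIC "fpi" "url": the DTD can be fetched
    return "public-only"         # PUBLIC "fpi": relies on the parser's catalog


if __name__ == "__main__":
    for decl in (
        '<!DOCTYPE html PUBLIC "-//MyOrg//MyDoc//EN" "http://www.example.org/mydoc.dtd">',
        '<!DOCTYPE html SYSTEM "http://www.example.org/mydoc.dtd">',
        '<!DOCTYPE html PUBLIC "-//MyOrg//MyDoc//EN">',
    ):
        print(classify_doctype(decl), decl)
```

Only the "public-only" form leaves a parser with nothing to fetch, which is why it depends entirely on the public identifier being known in advance.
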
Your documents fall into this last, public-identifier-only case, as you are using:

  <!DOCTYPE HTML PUBLIC "-//Netscape Comm. Corp.//DTD HTML//EN">

Ideally, you would not be using this document type. It is proprietary and has never been standardized. As a matter of fact, as a long-time user of the validator, even before it was maintained at W3C, you probably read the following:

[[
> However, please be aware that this DTD contains many elements which
> may never become standardized or widely supported.
]]
-- from the documentation of the "kinder, gentler HTML validator"

How can we solve this problem?

Ideally, this would be an opportunity for you to switch your content to an actually standard HTML version. Unlike the never-published (at least, I cannot find any published version of the DTD), never-standardized Netscape HTML, languages such as HTML 4.01 have been through a standardization process. That means that they have been designed with concern for the needs of all, and that they are here to stay.

How hard would that be? In the case of your documents, that's a matter of three steps:

1- change the doctype declaration at the top of each document to e.g.:
  <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

2- get rid of the absmiddle attribute value. This value is not in standard HTML, but HTML 4.01 Transitional has align="middle" for images, which I gather has much the same effect. Better yet would be using CSS[1]: using HTML as a markup-only language and using Cascading Style Sheets for the style of your document is not only cleaner and easier to maintain, it is also lighter and saves bandwidth.
[1] http://www.w3.org/TR/CSS2/

3- add alt attributes for images. There is a good reason why standardized HTML requires alt attributes for images where the proprietary Netscape HTML does not: accessibility. A person visiting your Web site with a screen reader, for instance, would not be able to know what the images are.
Actually, your documents already include such information, albeit not consistently. Going through your content and adding descriptions for images with meaning, and then running the tool tidy [2] with the alt-text option, could take care of all the images that are purely presentational and set an empty alt text for them.
[2] tidy.sourceforge.net/

Except for the setting of alternate text for the images, which would be a favor to the quality of your content anyway, all these operations can be automated in a few lines of code in whichever text-processing language you fancy. This should not take a week.

Alternatively, if you really wish to keep using the non-standard DTD, the validator community, which cares a lot about the quality of this service, could consider re-adding it to the validator's catalog for a future release of the tool. I honestly do not believe that this would be a winning situation for anyone.

You say you have been a long-time user of the validator, and I am hopeful that this means you care about your content being correctly written in a properly defined language. If that is the case, then I trust you will see the interest in switching your Web site to a standard language such as HTML 4.01.

Regards,
olivier
-- 
olivier Thereaux - W3C - http://www.w3.org/People/olivier/
W3C Open Source Software: http://www.w3.org/Status
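
P.S. As a rough illustration of how little code steps 1 and 2 need, here is a minimal sketch in Python. It rests on assumptions I have not checked against your pages: that each page carries the Netscape declaration quoted above in double quotes, and that absmiddle appears as the value of an align attribute, which the sketch maps to the HTML 4.01 Transitional align="middle". The function name modernise is made up for this example, and a plain regular expression will also touch matching text outside of tags.

```python
import re

# The HTML 4.01 Transitional declaration suggested in step 1.
NEW_DOCTYPE = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" '
    '"http://www.w3.org/TR/html4/loose.dtd">'
)

# Assumption: pages use exactly this proprietary public identifier.
OLD_DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+HTML\s+PUBLIC\s+"-//Netscape Comm\. Corp\.//DTD HTML//EN"\s*>',
    re.IGNORECASE,
)

# Assumption: absmiddle is written as align=absmiddle, quoted or unquoted.
ABSMIDDLE_RE = re.compile(r'\balign\s*=\s*(["\']?)absmiddle\1', re.IGNORECASE)


def modernise(html):
    """Swap the doctype (step 1) and replace align=absmiddle (step 2)."""
    html = OLD_DOCTYPE_RE.sub(NEW_DOCTYPE, html, count=1)
    return ABSMIDDLE_RE.sub('align="middle"', html)


if __name__ == "__main__":
    page = (
        '<!DOCTYPE HTML PUBLIC "-//Netscape Comm. Corp.//DTD HTML//EN">\n'
        '<img src="figure.gif" align=absmiddle>\n'
    )
    print(modernise(page))
```

Step 3 is deliberately left out: meaningful alt text has to be written by a person, and tidy can fill in the empty ones afterwards.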
Received on Tuesday, 20 September 2005 01:55:03 UTC