
RE: Machine checkability (Was: Re: HTML Action Item 54 - ...draft text for HTML 5 spec to require producers/authors to include @alt on img elements.)

From: Justin James <j_james@mindspring.com>
Date: Mon, 12 May 2008 01:37:44 -0400
To: "'Philip Taylor'" <pjt47@cam.ac.uk>
Cc: <public-html@w3.org>
Message-ID: <03f901c8b3f2$4e15acc0$ea410640$@com>

> A human can check the non-machine-checkable requirements. An author who 
> wants to write high quality HTML can post their code to a friend or a 
> mailing list or a message board, and someone who knows the spec well can 
> reply saying "you mustn't use <b> for these headings" and "you mustn't 
> use an empty alt attribute on this button image" and point to the parts 
> of the spec that require those things.

It is quite clear that only a small fraction of the people generating HTML care about this. Most people use tools that they have (incorrectly) assumed do things the way they need to be done. Why is it that, 15 years after HTML hit the scene, no one interested in making valid HTML that is also good, clean HTML uses a WYSIWYG system? Because the tools stink, and we all know it. This is an unacceptable state of affairs. If email worked this way, the computer revolution would never have happened.

> We just need enough people to care about producing high-quality HTML so 
> that the benefit to them (and their users) of the thorough non-machine 
> validation is enough to justify the cost of us developing those 
> non-machine-checkable requirements. Given the total number of HTML 
> authors, the tiny fraction that cares about quality is still a large 
> number of people, so this seems worthwhile. The requirements aren't 
> useless just because they will be almost universally ignored.

When you discussed determinism and such (omitted in my reply here), I agreed with many of your points. Here is where your argument breaks down. If you want people to care about making valid HTML, you will need to restrict the ability to generate HTML to an elite few. The HTML spec, as it stands, is far too complex for any single author to get right 100% of the time unless the documents they generate are extraordinarily simple. The people you really need to get on board are the creators of HTML generation tools. Henri is having a bear of a time writing a validator; imagine trying to write an HTML authoring tool that tries to produce valid output. It's because our spec stinks. It has far too much gold plating, far too many tags, and far too many things that CANNOT be expressed in the logical constructs available to modern general-purpose programming languages. Until the spec can be EASILY, fully, and accurately expressed in C/C++, Java, C#, VB.Net, Perl, PHP, Ruby, ECMAScript/JavaScript, Python, and a few other languages, you will not see more than a small fraction of HTML documents being valid, let alone valid in a way that is also semantically correct.
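
To make the machine-checkable versus non-machine-checkable distinction concrete, here is a minimal sketch using only Python's standard html.parser module. The names ImgAltChecker and check_img_alt are hypothetical, invented for this illustration, and nothing here is taken from the draft spec or from any existing validator. It shows the part a program can decide (whether an img element has an alt attribute at all) and the part it can only flag for a person (whether an empty alt is appropriate for that particular image).

```python
from html.parser import HTMLParser


class ImgAltChecker(HTMLParser):
    """Hypothetical helper: records <img> tags with a missing or empty alt."""

    def __init__(self):
        super().__init__()
        # Machine-checkable: the attribute is simply absent.
        self.missing_alt = []
        # Not machine-checkable: alt="" is right for decoration and wrong
        # for a button image, so these can only be flagged for human review.
        self.empty_alt = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        if "alt" not in attr_map:
            self.missing_alt.append(self.getpos())
        elif not attr_map["alt"]:
            self.empty_alt.append(self.getpos())


def check_img_alt(markup):
    """Return (missing_alt, empty_alt) position lists for the given markup."""
    checker = ImgAltChecker()
    checker.feed(markup)
    checker.close()
    return checker.missing_alt, checker.empty_alt


if __name__ == "__main__":
    sample = (
        '<p><img src="a.png">'
        '<img src="b.png" alt="">'
        '<img src="c.png" alt="Logo"></p>'
    )
    missing, needs_review = check_img_alt(sample)
    print("missing alt:", missing)
    print("empty alt, needs a human:", needs_review)
```

The first list is something a tool can reject outright. The second is exactly the kind of case where someone who knows the spec still has to look at the page and decide.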

J.Ja
Received on Monday, 12 May 2008 05:38:30 UTC
