Re: CheckHtmlEsis

Dear "Russell Steven Shawn O'Connor" and Peter Flynn,

The comment that some kinds of validation should be done *only* by the
browser doesn't make sense to me. It seems to me that a web author would
like to know if his document is invalid in this or any other way, so he can
fix it.

Here are a few things which I wish my validation tools would check:

Once I forgot to put the terminating quote on a URI inside an <a> element.
Since ">" seems to be a valid character inside a quoted string, the parser
kept reading past the end of the tag. My validation tools did give me error
messages, but they were misleading, and it took me a while to figure out
the real problem.
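
A rough heuristic for this one, sketched in Python. It only looks for
quoted attribute values that contain "<" or ">", which is usually a sign
that a closing quote went missing. The function name and the regular
expression are my own assumptions, not part of any existing validator, and
it will miss the case where no later quote exists at all.

```python
import re

# A quoted attribute value that contains "<" or ">" is legal, but it is
# far more often the result of a missing closing quote.
SUSPICIOUS_VALUE = re.compile(r'=\s*"([^"]*[<>][^"]*)"')

def find_suspicious_quotes(source):
    """Return (line number, value) for each suspicious quoted value."""
    hits = []
    for match in SUSPICIOUS_VALUE.finditer(source):
        line = source.count("\n", 0, match.start()) + 1
        hits.append((line, match.group(1)))
    return hits
```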

I once had a bunch of URIs similar to <a href="www.ti.com">TI</a>, which
the DTD would accept. My link check software kept telling me that this was
a bad link, but the URI seemed to work fine when I manually typed it into
my web browser ... color me confused. I wish I had gotten some warning that
would suggest "I think you meant to say http://www.ti.com/ ".
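
Something like the following Python sketch would have saved me the
trouble. It flags href values that have no scheme but start with "www.",
and proposes an "http://" form. The class and function names are made up
for illustration, and the "www." test is only a guess at what the author
meant, so it will miss other scheme-less host names.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

class SchemelessHrefFinder(HTMLParser):
    """Collect href values that look like host names but lack a scheme."""

    def __init__(self):
        super().__init__()
        self.suspects = []  # (href as written, suggested replacement)

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        parts = urlsplit(href)
        # No scheme and no "//host" part means the value is a relative
        # URI, even though it looks like a host name to a human.
        if not parts.scheme and not parts.netloc and href.lower().startswith("www."):
            suggestion = "http://" + href + ("" if "/" in href else "/")
            self.suspects.append((href, suggestion))

def find_schemeless_hrefs(html):
    finder = SchemelessHrefFinder()
    finder.feed(html)
    finder.close()
    return finder.suspects
```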

I wish my validators would warn me when "You forgot to put an 'alt'
attribute inside this <img> tag". (The same goes for the height and width
attributes.)
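
A minimal sketch of such a warning in Python, assuming the three
attributes named above are the ones to insist on. The names REQUIRED and
check_img_attributes are hypothetical.

```python
from html.parser import HTMLParser

REQUIRED = ("alt", "height", "width")

class ImgAttributeChecker(HTMLParser):
    """Record every <img> tag that lacks one of the REQUIRED attributes."""

    def __init__(self):
        super().__init__()
        self.warnings = []  # (line number, list of missing attribute names)

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        present = {name for name, _ in attrs}
        missing = [name for name in REQUIRED if name not in present]
        if missing:
            line, _ = self.getpos()
            self.warnings.append((line, missing))

def check_img_attributes(html):
    checker = ImgAttributeChecker()
    checker.feed(html)
    checker.close()
    return checker.warnings
```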

Many people intend to make *every* graphic a link, so they would appreciate
a program that listed which <img> tags were not wrapped in an <a> element.
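
Such a listing could be sketched like this in Python. I am assuming that
only an <a> with an href counts as a link (a bare <a name="..."> does
not), and the class name is invented for the example.

```python
from html.parser import HTMLParser

class UnlinkedImageFinder(HTMLParser):
    """List the src of every <img> that is not inside an <a href=...>."""

    def __init__(self):
        super().__init__()
        self.anchor_stack = []  # one bool per open <a>: does it have an href?
        self.unlinked = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            self.anchor_stack.append("href" in attrs)
        elif tag == "img" and not any(self.anchor_stack):
            self.unlinked.append(attrs.get("src"))

    def handle_endtag(self, tag):
        if tag == "a" and self.anchor_stack:
            self.anchor_stack.pop()

def find_unlinked_images(html):
    finder = UnlinkedImageFinder()
    finder.feed(html)
    finder.close()
    return finder.unlinked
```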

Even though the "&lt" is apparently legal SGML, I intend to always use the
full "&lt;" and would like some warning when I slip up.
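
This one can be approximated on the raw source text. The Python sketch
below reports numeric references and known entity names that are not
followed by ";". Checking names against Python's own entity table is my
assumption, chosen so that a stray "&" in something like "AT&T" is not
reported by this particular check.

```python
import re
from html.entities import name2codepoint

# "&name" or "&#123" or "&#x7B" that is NOT followed by ";".
# The lookahead also rejects letters and digits so that the regex
# cannot match just a prefix of a properly terminated reference.
UNTERMINATED = re.compile(
    r"&(#[0-9]+|#[xX][0-9A-Fa-f]+|[A-Za-z][A-Za-z0-9]*)(?![A-Za-z0-9;])"
)

def find_unterminated_references(source):
    """Return (line number, reference) for each reference lacking ';'."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in UNTERMINATED.finditer(line):
            name = match.group(1)
            if name.startswith("#") or name in name2codepoint:
                hits.append((lineno, match.group(0)))
    return hits
```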

I intend to wrap every URI in the source text with a link to that URI. I
would like a validator to check that every string (outside of a tag) of the
form "http:" or "ftp:" or "mailto:" (what others are there now?) is not
merely inside an <a> element, but that the href attribute is actually set
to the *same* location (rather than some other unrelated location).
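
Here is a hedged Python sketch of that check. It only knows the three
schemes listed above, strips trailing punctuation from what it finds in
the text, and compares the result to the href as a plain string, so two
spellings of the same location would still be reported. All names are
hypothetical.

```python
import re
from html.parser import HTMLParser

URI_IN_TEXT = re.compile(r"\b(?:http|ftp|mailto):[^\s<>\"]+")

class BareUriChecker(HTMLParser):
    """Report URIs in text that are unlinked or linked somewhere else."""

    def __init__(self):
        super().__init__()
        self.href_stack = []  # href (or None) of each open <a>
        self.problems = []    # (uri found in text, description)

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.href_stack.append(dict(attrs).get("href"))

    def handle_endtag(self, tag):
        if tag == "a" and self.href_stack:
            self.href_stack.pop()

    def handle_data(self, data):
        # The innermost enclosing <a> that actually has an href, if any.
        current = next((h for h in reversed(self.href_stack) if h), None)
        for match in URI_IN_TEXT.finditer(data):
            uri = match.group(0).rstrip(".,;:!?)")
            if current is None:
                self.problems.append((uri, "not linked"))
            elif current != uri:
                self.problems.append((uri, "linked to " + current))

def check_bare_uris(html):
    checker = BareUriChecker()
    checker.feed(html)
    checker.close()
    return checker.problems
```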

I don't think my tools are smart enough to check that (a) for every <a
href="#misc">misc</a> there is one and only one <a name="misc">misc</a> in
the document, and (b) that for each <a name="misc">misc</a> there is at
least one <a href="#misc">misc</a>. When I add a new section to a page,
something like (b) would remind me to add that section to the table of
contents I keep at the top of the page.
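
Both (a) and (b) come down to counting names, as in this Python sketch.
The function name and the three report categories are my own choice for
illustration; it looks only at <a name="..."> and <a href="#..."> within a
single document.

```python
from collections import Counter
from html.parser import HTMLParser

class AnchorCollector(HTMLParser):
    """Count <a name="x"> targets and <a href="#x"> references."""

    def __init__(self):
        super().__init__()
        self.targets = Counter()
        self.references = Counter()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        if attrs.get("name"):
            self.targets[attrs["name"]] += 1
        href = attrs.get("href") or ""
        if href.startswith("#") and len(href) > 1:
            self.references[href[1:]] += 1

def check_anchors(html):
    c = AnchorCollector()
    c.feed(html)
    c.close()
    return {
        # (a) a reference needs exactly one target
        "missing": sorted(n for n in c.references if c.targets[n] == 0),
        "duplicated": sorted(n for n, k in c.targets.items() if k > 1),
        # (b) a target should be referenced at least once
        "unreferenced": sorted(n for n in c.targets if c.references[n] == 0),
    }
```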

In my opinion, *every* web page needs to have an email address somewhere on
it, so people viewing it can respond to any questions the author raises.
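
Even this is easy to check mechanically if one accepts a "mailto:" link as
evidence of an address. That is an assumption on my part, since an address
written as plain text would not be found by this Python sketch.

```python
from html.parser import HTMLParser

class MailtoFinder(HTMLParser):
    """Collect the addresses of all mailto: links in a document."""

    def __init__(self):
        super().__init__()
        self.addresses = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        if href.lower().startswith("mailto:"):
            self.addresses.append(href[len("mailto:"):])

def find_contact_addresses(html):
    finder = MailtoFinder()
    finder.feed(html)
    finder.close()
    return finder.addresses
```

An empty result would be the cue to warn the author.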

I'm sure there are many other little things that a machine could easily
check, but that current validators do not check.

-- David Cary

>From: "Russell Steven Shawn O'Connor" <roconnor@wronski.math.uwaterloo.ca>
>Reply-To: roconnor@uwaterloo.ca
>To: www-html@w3.org
>Subject: Re: CheckHtmlEsis
...
>On 22 Apr 1998, Peter Flynn wrote:
...
>> nsgmls always validates all attributes. I don't see how there
>> is anything else to write.
>>
>> Unless you mean _semantic_ validation...but that's outside the scope
>> of HTML, it belongs to the browser.
>
>Section 19.1 of the HTML 4.0 specs explains nicely
...
>My program is such a specialized program.  It doesn't capture the complete
>specification of HTML 4.0, but it captures more than the DTD alone can
>give.
>
>--
>Russell O'Connor                           roconnor@uwaterloo.ca
>    <URL:http://www.undergrad.math.uwaterloo.ca/%7Eroconnor/>

--
+ David Cary "mailto:d.cary@ieee.org" "http://www.rdrop.com/~cary/"
| Future Tech, Unknowns, PCMCIA, digital hologram, <*> O-

Received on Wednesday, 22 April 1998 20:29:46 UTC