Re: PR-HTML40

Jukka Korpela (jkorpela@cc.hut.fi)
Thu, 13 Nov 1997 09:18:12 +0200 (EET)


Date: Thu, 13 Nov 1997 09:18:12 +0200 (EET)
From: Jukka Korpela <jkorpela@cc.hut.fi>
To: www-html@w3.org
In-Reply-To: <Pine.LNX.3.96.971112134129.31791A-100000@ns.viet.net>
Message-ID: <Pine.OSF.3.96.971113083806.14939A-100000@beta.hut.fi>
Subject: Re: PR-HTML40

On Wed, 12 Nov 1997, Benjamin Franz wrote:

> In hand-created HTML, losing track of nesting levels is a real
> problem and the source of much aggravation. Enforcing closing really aids
> the HTML writer. <understatement>It is also clear that browser writers
> have problems handling implicitly closed elements.</understatement>

I am getting more and more convinced that all end tags should be
explicit. (And modifying a DTD at least for one's private use to
reflect that might be very useful in validation.) The above-cited
reasons are good _practical_ reasons. To them, I'd like to add that
so-called popular browsers pretty often display documents _differently_
depending on the presence or absence of an "omissible" end tag like </P>.

From the more theoretical, or conceptual, side, omission of end tags
reinforces people's tendency to think in terms of tags, not elements.
We have seen this happen very often - there was just a long
thread about the essence of the P element in c.i.w.a.h. Even knowledgeable
people easily get confused. How many authors know exactly which elements
implicitly terminate an open P element? To rephrase this, how many authors
know exactly where their paragraphs actually end? By writing the four
little characters </P> one makes this clear, to oneself too, and gets
an error message from a validator if one hasn't realized that the
paragraph was in fact implicitly terminated earlier - perhaps against
the author's intentions.
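
To make the question "where do my paragraphs actually end?" concrete,
here is a rough sketch in Python. It is not a validator and knows nothing
about the DTD. It uses the standard html.parser module and reports each
place where a block-level start tag arrives while a P element is still
open. The class and function names are made up for this illustration, and
BLOCK_LEVEL is an assumed, partial list of block-level elements.

```python
from html.parser import HTMLParser

# Assumption: an illustrative, partial set of block-level elements.
# A real check would take the full list from the DTD in use.
BLOCK_LEVEL = {
    "p", "h1", "h2", "h3", "h4", "h5", "h6", "ul", "ol", "dl", "pre",
    "div", "blockquote", "form", "hr", "table", "address",
}


class ImplicitPEndFinder(HTMLParser):
    """Records where an open P is ended by a block-level start tag."""

    def __init__(self):
        super().__init__()
        self.p_open = False
        self.implicit_ends = []  # list of (line number, terminating tag)

    def handle_starttag(self, tag, attrs):
        # html.parser passes tag names in lower case.
        if self.p_open and tag in BLOCK_LEVEL:
            self.implicit_ends.append((self.getpos()[0], tag))
            self.p_open = False
        if tag == "p":
            self.p_open = True

    def handle_endtag(self, tag):
        if tag == "p":
            self.p_open = False


def find_implicit_p_ends(html):
    """Return (line, tag) pairs for every implicitly terminated P."""
    finder = ImplicitPEndFinder()
    finder.feed(html)
    finder.close()
    return finder.implicit_ends


sample = "<P>Intro text\n<UL>\n<LI>item\n</UL>\nmore text</P>\n<P>Closed.</P>\n"
for line, tag in find_implicit_p_ends(sample):
    print("line %d: <%s> implicitly ends the open P" % (line, tag))
```

In the sample, the author presumably meant "more text" to belong to the
first paragraph, but the P had already ended at the UL start tag. The
later </P> then has nothing to close, which is the kind of mismatch a
validator would flag.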

> The ALT text problem is a *solvable* problem in browser implementation
> where the layout hangup caused by no HEIGHT or WIDTH attributes is not. A
> smart browser (like say, Spyglass Mosaic) can do intelligent things with
> ALT text when images are turned off if the programmer is thinking beyond
> the graphics. 

I'm not sure it's solvable even in principle. Remember that the
fundamental raison d'etre of an ALT text is to provide a _replacement_
for an image (when the image is not displayed, for one reason or another).
Now if the IMG element has HEIGHT and WIDTH values, how could the browser
know whether it is more important to display the ALT text or to allocate
space according to the HEIGHT and WIDTH values? After all, images are
often used for getting a specific layout. I don't favor that, but how
could the browser know whether the ALT text is actually meaningless,
as it often is, so that omitting it causes less damage than deviating
from the desired dimensions?

> But if the HEIGHT and WIDTH are not there, there is nothing to be done
> except either wait for the images before doing layout (Netscape's
> approach) or re-laying out the page repeatedly as the images come down
> (MSIE's approach). Both cause real headaches to the end user by making the
> user potentially wait a VERY long time before seeing *anything* or by
> causing the page to move repeatedly while they are reading it. 

That is a very good description of the reasons why those attributes
were introduced to HTML. On the other hand, this means, in particular,
that if the image dimensions are changed, one should remember to check
the HEIGHT and WIDTH attributes too (otherwise one risks serious
distortion of the image due to possible scaling by a browser).
And actually, one should check the stylesheets too. This creates
unnecessary dependencies, comparable to scattering several copies
of a manifest literal (instead of a named constant) throughout
a program. And as realists know, the phrase "one should remember to
check" actually means "you'll goof it up".
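
The named-constant analogy can be sketched in code. The following is a
minimal illustration only, assuming the pages are produced by a script
rather than typed by hand. IMAGE_SIZES and img_tag are hypothetical names,
and the dimensions are made-up values.

```python
from html import escape

# Hypothetical table: the one place where image dimensions are recorded.
# The values are made up for illustration.
IMAGE_SIZES = {
    "logo.gif": (120, 60),  # (width, height) in pixels
}


def img_tag(src, alt):
    """Render an IMG element whose WIDTH and HEIGHT come from IMAGE_SIZES."""
    width, height = IMAGE_SIZES[src]
    return '<IMG SRC="%s" ALT="%s" WIDTH="%d" HEIGHT="%d">' % (
        escape(src, quote=True),
        escape(alt, quote=True),
        width,
        height,
    )


print(img_tag("logo.gif", "Company logo"))
```

With this arrangement a changed image needs its size corrected in one
place only. Any stylesheet that repeats the dimensions would still have to
be checked by hand.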

So what can we do? I tend to think in terms of good authoring habits here.
Authors should - in any case, even if this problem didn't exist - try
to give the reader something to start with, while waiting for the rest
of the material to be loaded and formatted. This means that one should not
begin with large images or objects or tables but with some adequately
marked-up texts. Such principles cannot be described in a language
specification (especially not in a DTD), and one might have good reasons
to deviate from them even if that causes the kind of user discomfort
you describe. So it seems to me that a language specification should
_allow_ HEIGHT and WIDTH but not require them, and should rather
discourage them; whether they should be deprecated in favor of style
sheets is debatable.

Yucca, http://www.hut.fi/u/jkorpela/