Structure vs. appearance in HTML

Stavros Macrakis (macrakis@osf.org)
Thu, 21 Sep 1995 12:32:44 -0400


Message-Id: <199509211632.MAA06408@postman.osf.org>
To: www-html@w3.org
Subject: Structure vs. appearance in HTML
Date: Thu, 21 Sep 1995 12:32:44 -0400
From: Stavros Macrakis <macrakis@osf.org>

Brandon Plewe <PLEWE@plewe.cit.buffalo.edu> says in
    <950721041622.57@plewe.cit.buffalo.edu>:

    ...people don't care about rational structure as much as they do
    about immediate results....

Most users have no idea what structure there is.  What they know is
what they can see and do.

    I have spent years trying to explain to novices the value of
    structure-encoding as opposed to presentation-encoding, and the
    truth of the matter is: other than a few very advanced
    applications, most people out there just don't care.

I agree that structure is important, but only if you can do something
with it.  So far, I am not aware of any Web tools that actually DO
anything useful/interesting/amusing with HTML structure.  (Well, OK,
some not-very-well-known browsers do do holophrasting.)

For that matter, HTML doesn't really _have_ that much usable
structure.  Here are some examples:

-- Only in 3.0 do we get hierarchical structure via DIV.

-- The math operators are defined in terms of rendering ("...close in
   spirit to the representation used in LaTeX and TeX, and is being
   designed with regard to the ability to render HTML Math to speech
   as well as to graphical and textual displays") and not mathematical
   semantics.  This will make it clumsy to cut and paste formulae into
   your favorite math software.  For example, the differential "dx" is
   apparently indistinguishable from the product of variable d and
   variable x.  (I say apparently because the spec is incomplete.)  On
   the other hand, it is true that there are things you might want to
   display which don't make sense to math software (e.g. ellipses in
   certain cases).

-- In fact, the math spec is heavily appearance-oriented: "HTML
   math doesn't provide direct support for multi-line equations, as
   this can be effectively handled by combining math with the TABLE
   element."  So how is my renderer supposed to resize formulae as a
   function of screen width if they're split into multiple table
   entries?!
 
-- The ADDRESS element, which might seem to have useful semantics, has
   no internal structure.  Wouldn't it be nice if a tool could extract
   a name, e-mail address, and phone number from the address?
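
To make the ADDRESS point concrete, here is a minimal Python sketch of
what such a tool is reduced to today.  Because the element has no
internal structure, all the tool can do is guess with regular
expressions.  The sample markup, the function name extract_contact,
and the two patterns are illustrative assumptions of mine, not
anything defined by the HTML spec.

```python
import re
from html.parser import HTMLParser


class AddressText(HTMLParser):
    """Collect the plain text found inside ADDRESS elements."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # greater than 0 while inside <address>
        self.chunks = []  # text fragments seen inside the element

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lower case.
        if tag == "address":
            self.depth += 1
        elif tag == "br" and self.depth:
            self.chunks.append("\n")  # treat <BR> as a line break

    def handle_endtag(self, tag):
        if tag == "address" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


# Heuristic patterns (assumptions): they will miss or misread many
# real-world addresses.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\+?\d[\d ()-]{6,}\d")


def extract_contact(html):
    """Guess name, e-mail and phone from unstructured ADDRESS text."""
    parser = AddressText()
    parser.feed(html)
    parser.close()
    text = "".join(parser.chunks)
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    email = EMAIL.search(text)
    phone = PHONE.search(text)
    return {
        # Pure guess: assume the first non-empty line is the name.
        "name": lines[0] if lines else None,
        "email": email.group(0) if email else None,
        "phone": phone.group(0) if phone else None,
    }


# Hypothetical sample markup.
sample = (
    "<ADDRESS>Jane Doe<BR>"
    "jane@example.org<BR>"
    "+1 555 010 0199</ADDRESS>"
)
print(extract_contact(sample))
```

The "first line is the name" guess breaks as soon as an author puts
an organization or a street address first, which is exactly the kind
of guesswork that internal structure in ADDRESS would make
unnecessary.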

In the final analysis, it is not clear how much useful structure you
can provide in a simple, general-purpose DTD like HTML.  Something
like the TEI DTD has lots of useful structure, but it is a very big
DTD, and still doesn't cover a lot of important areas.

    If the HTML standard wants to win out in the end, there is only one
    answer: we have to ***show*** everybody the virtues of TrueHTML, not
    just explain them.  There *must* be a top-notch browser that is truly
    committed to the HTML standard....

That's not enough.  There have got to be tools that actually exploit
whatever structure there is in HTML.  _That_ is the virtue of
"TrueHTML".
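
As one small illustration of a tool exploiting structure that HTML
already has, the following Python sketch builds an indented outline
from the H1..H6 headings of a page, which is presumably the sort of
information an outline-collapsing (holophrasting) browser would need.
The class name OutlineBuilder, the function outline, and the sample
page are hypothetical, and the sketch assumes authors use headings to
mark sections rather than to get big type.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class OutlineBuilder(HTMLParser):
    """Record (level, title) for every heading element in a page."""

    def __init__(self):
        super().__init__()
        self.outline = []  # list of (level, title) tuples
        self.level = None  # level of the heading being read, else None
        self.buf = []      # text fragments of the current heading

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self.level = int(tag[1])
            self.buf = []

    def handle_data(self, data):
        if self.level is not None:
            self.buf.append(data)

    def handle_endtag(self, tag):
        if tag in HEADINGS and self.level is not None:
            # Collapse runs of whitespace inside the heading text.
            title = " ".join("".join(self.buf).split())
            self.outline.append((self.level, title))
            self.level = None


def outline(html):
    """Return the heading outline of an HTML page."""
    builder = OutlineBuilder()
    builder.feed(html)
    builder.close()
    return builder.outline


# Hypothetical sample page.
page = """
<H1>Structure vs. appearance</H1>
<P>Some text.
<H2>Math</H2>
<P>More text.
<H2>ADDRESS</H2>
"""

for level, title in outline(page):
    print("  " * (level - 1) + title)
```

If a page uses H3 merely because the author liked the font size, the
outline comes out wrong, which is the presentation-encoding problem
in miniature.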

On the other hand, there are many providers who DO NOT want to provide
structural information.  Consider in particular a tool that could
strip out ads automatically....

	-s