Re: HTML should not be a file format, but an output format

F. E. Potts (fepotts@fepco.com)
Sat, 22 Mar 1997 19:46:43 -0700


Date: Sat, 22 Mar 1997 19:46:43 -0700
From: fepotts@fepco.com (F. E. Potts)
Message-Id: <97Mar22.191436mst.18433@gw2.fepco.com>
To: BruceLeban@aol.com
Subject: Re: HTML should not be a file format, but an output format
Cc: www-html@w3.org, devnull@gnu.ai.mit.edu,

This is kind of a strange thread. :-)

HTML is not a good storage medium, because HTML is a dying markup
language (only it doesn't know it yet :-).

SGML is a good storage medium, because it is a stable standard that has
the capability to be converted into "the markup language of the
moment".  One markup language of the moment is HTML 3.2, but HTML is a
moving target and must be treated as such.
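
A minimal sketch of that idea, in Python rather than SGML: the stored
document uses its own element names, and a mapping table decides which
HTML tags come out.  The element names, the table, and the render()
helper are all invented here for illustration; a real SGML pipeline
would use a DTD and a proper conversion tool.

```python
from html import escape

# Hypothetical mapping from stored element names to HTML 3.2 tags.
HTML32_MAP = {"article": "body", "title": "h1", "para": "p", "emph": "em"}

def render(node, tag_map):
    """Render a (name, children) tree; children are strings or nested nodes."""
    if isinstance(node, str):
        return escape(node)  # plain text: escape <, > and &
    name, children = node
    tag = tag_map[name]
    inner = "".join(render(child, tag_map) for child in children)
    return f"<{tag}>{inner}</{tag}>"

# The stored document knows nothing about HTML.
doc = ("article", [
    ("title", ["Storage vs. output"]),
    ("para", ["HTML is ", ("emph", ["one"]), " possible output."]),
])

print(render(doc, HTML32_MAP))
```

Retargeting to the next "markup language of the moment" would mean
swapping the table, not touching the stored document.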

Proper HTML is generally written by hand because the available tools
are rarely capable of writing valid HTML (that is, HTML that conforms
to a specific DTD).  Writing valid HTML by hand is very simple and, for
those who are used to it, hardly slower than writing plain text.

SGML today is generally written using SGML-aware editors (such as
ArborText's fine Adept*Publisher and SoftQuad's Author/Editor).  These
editors--unlike the HTML editors--write valid markup and are therefore
widely used.

For long-term storage of documents--whether in a file system or a
database--SGML is the proper choice, for it will still be usable ten
years from now (or longer), and is non-proprietary.  SGML is steadily
being improved, but remains fully backward-compatible by design.

Once "Webmasters" have good XML tools available, and start using them
in their daily work, I seriously doubt that they will ever want to
return to HTML (which is painfully limited).  HTML will remain around
for a while because there are a lot of documents written in it that
folks are not going to want to bother converting, but the creation of
new documents in it will likely be restricted to non-professional users
(e.g., John and Jane Homepage).  The professionals will have long
since abandoned it.

BTW: Paul wrote (in reference to http://www.akimbo.com/globetrotter/):
> Here's a hint as to the problem:
> <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">

That isn't the only problem with the page: even when it is validated
against the HTML 3.2 DTD, nsgmls reports:
  nsgmls:<OSFD>0:19:62:E: an attribute value must be a literal unless it
  contains only name characters
  nsgmls:<OSFD>0:20:66:E: an attribute value must be a literal unless it
  contains only name characters
  nsgmls:<OSFD>0:21:65:E: an attribute value must be a literal unless it
  contains only name characters
  nsgmls:<OSFD>0:22:72:E: an attribute value must be a literal unless it
  contains only name characters
  nsgmls:<OSFD>0:23:65:E: an attribute value must be a literal unless it
  contains only name characters
  nsgmls:<OSFD>0:24:89:E: an attribute value must be a literal unless it
  contains only name characters
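
What nsgmls is objecting to can be sketched in a few lines of Python.
This is only an illustration, assuming the name characters declared for
HTML 3.2 (letters, digits, period and hyphen); the function name and
the sample values are made up, not taken from the Globetrotter page.

```python
import re

# Assumption: name characters are letters, digits, "." and "-",
# as in the SGML declaration used for HTML 3.2.
NAME_CHARS = re.compile(r"[A-Za-z0-9.-]+")

def needs_quotes(value):
    """Return True if the attribute value must be written as a quoted literal."""
    return NAME_CHARS.fullmatch(value) is None

# Hypothetical attribute values of the kind that trigger the error.
for value in ["center", "100", "100%", "images/logo.gif", "#FFFFFF"]:
    verdict = "quote it" if needs_quotes(value) else "may be unquoted"
    print(value, "->", verdict)
```

Under that assumption a value such as 100% or images/logo.gif has to be
written as a quoted literal, while center or 100 may be left bare.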

On Sat, 22 Mar 1997 18:59:40 -0700, Bruce Leban wrote:
> I don't want this thread to turn into defending/attacking
> Globetrotter, but I will make one more statement in our defense. The
> HTML produced by Globetrotter passes a more pragmatic test: it
> produces the desired results in every browser we've tested with.

That's a totally bogus argument--we are talking about long-term storage
here, and only documents written to a standard will be able to survive.
And that's the problem with HTML as written by most non-pros: it is
tested against a "browser" rather than against a DTD.

-fep

--
fepotts@fepco.com
http://www.fepco.com/