Re: HTML should not be a file format, but an output format

Paul Prescod (papresco@calum.csclub.uwaterloo.ca)
Sun, 23 Mar 1997 04:49:17 -0500


Message-ID: <3334FC9D.40E@csclub.uwaterloo.ca>
Date: Sun, 23 Mar 1997 04:49:17 -0500
From: Paul Prescod <papresco@calum.csclub.uwaterloo.ca>
To: BruceLeban@akimbo.com, www-html@w3.org
Subject: Re: HTML should not be a file format, but an output format

BruceLeban@akimbo.com wrote:
> Sorry it seemed that way. I was trying to argue against using HTML as an
> editing format. I also don't think users should edit XML or SGML. I think
> you should use tools that edit those for you.

That's fine. When can I expect an Akimbo product that allows me to do
so?
 
> HTML 3.2 was not a standard at the time that Globetrotter 1.1 was
> released. 2.0 was the closest standard available and there is no way to
> specify in the doctype that it's 2.0 + various extensions. We decided it
> was better to put an incorrect doctype than none at all. Maybe that was a
> mistake.

It was a mistake. Putting in no doctype would have also been a mistake.
The correct thing to do is to create your own doctype, and release the
DTD that goes with it. This is the approach your competitor SoftQuad has
taken. The Doctype is something like

<!DOCTYPE HTML PUBLIC "-//SoftQuad//DTD HTML 2.0 + Extensions//EN"
"http://www.softquad.com/dtds/html-extended.dtd">

(something like that). And the DTD is on their website, in the right
place, so people can look it up, see what it does and does not support,
and validate against it.
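
A minimal sketch of why the declaration matters to tools: a program that
receives a page can read the public and system identifiers and learn which
DTD to fetch and validate against. The function name read_doctype and the
sample page are hypothetical, and the declaration is the approximate one
quoted above, not necessarily SoftQuad's exact text.

```python
import re

# Matches <!DOCTYPE root PUBLIC "public-id" "system-id">, allowing the two
# quoted identifiers to sit on separate lines.
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+(?P<root>\S+)\s+PUBLIC\s+"(?P<public>[^"]*)"\s+"(?P<system>[^"]*)"\s*>',
    re.IGNORECASE,
)


def read_doctype(document):
    """Return (root, public_id, system_id), or None if no such declaration exists."""
    match = DOCTYPE_RE.search(document)
    if match is None:
        return None
    return match.group("root"), match.group("public"), match.group("system")


# Hypothetical page that uses the approximate declaration from this message.
page = '''<!DOCTYPE HTML PUBLIC "-//SoftQuad//DTD HTML 2.0 + Extensions//EN"
"http://www.softquad.com/dtds/html-extended.dtd">
<HTML><HEAD><TITLE>Example</TITLE></HEAD><BODY><P>Hello</P></BODY></HTML>'''

root, public_id, system_id = read_doctype(page)
print(root)       # name of the document element
print(public_id)  # who owns the DTD and what it describes
print(system_id)  # where the DTD can be fetched
```

A page with no declaration makes read_doctype return None, which is exactly
the case where a validator has nothing to go on.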
 
> The missing quotes around %s are an oversight,
> but not what I would call a major problem. Much more of a problem IMHO is
> all those web pages out there that can't be read. Just recently I saw a
> page that was 110% of the width of the window. No matter how wide you
> made the window, it made the text run off the right edge so you couldn't
> read it without scrolling.

I guess it depends on your point of view. If you are reading HTML with
an SGML-based screen reader because you are visually impaired, you
probably prefer correct HTML with all of the quotes in the right place
that happens to look rotten (but you can't see that!) to code that
causes your screen reader to dump 35 error messages. I'm not claiming I
am in this position, but there are people who are.

> I'm not convinced at this point that using XML as a storage format
> provides enough of a benefit to justify overhauling Globetrotter
> completely. 

If GlobeTrotter needs to be "overhauled completely" to support a new
storage file format, something is very wrong. Word supports *dozens* --
including SGML if you are willing to pay extra for it. All you should
have to do is strap on a new parser and writer. This is much easier, in
principle, than supporting a different target format as you described in
your first message. To support a new target format you must research
that new format and figure out mappings from your internal constructs to
the target format. For XML, you just spew out <CONSTRUCT> ...
</CONSTRUCT> instead of U(*&^^HJHU^*^J& (binary code).
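
To make the "parser and writer" point concrete, here is a rough sketch using
Python's standard xml.etree.ElementTree module. The internal model (a list
of construct-name and text pairs), the DOCUMENT root element, and the
function names are all assumptions made for illustration; nothing here
describes GlobeTrotter's real internals.

```python
import xml.etree.ElementTree as ET


def write_document(constructs):
    """Writer: emit one element per internal construct, named after it."""
    root = ET.Element("DOCUMENT")  # hypothetical root element name
    for name, text in constructs:
        child = ET.SubElement(root, name)
        child.text = text
    return ET.tostring(root, encoding="unicode")


def read_document(xml_text):
    """Parser: rebuild the list of (construct_name, text) pairs."""
    root = ET.fromstring(xml_text)
    return [(child.tag, child.text or "") for child in root]


# Hypothetical internal model: a list of (construct_name, text) pairs.
doc = [("HEADING", "Trip notes"), ("PARAGRAPH", "Fish & chips < 5 pounds")]
stored = write_document(doc)
print(stored)

# Round trip: what is read back equals what was written.
assert read_document(stored) == doc
```

The writer never has to invent a mapping: each internal construct becomes an
element of the same name, and the library takes care of escaping characters
such as & and <.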

> All the binary data in the file would still be unreadable.

Like what? Leave graphics in their original format and they will be
readable by the original graphics program.

> Furthermore, Globetrotter supports things like overlapping styles
> [...(...]...) that are not valid XML. 

The value of this feature is dubious, but it is easy as pie in XML,
using empty marker elements:
<STYLE-START NAME="Style1"/>Text<STYLE-START NAME="Style2"/> Text <STYLE-END
NAME="Style1"/><STYLE-END NAME="Style2"/>

This is ugly, but no more so than the generated HTML which must look really
terrible (unless you violate the HTML specification altogether).
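
As a sketch of how a reader could recover overlapping styles from start and
end markers, the following assumes the markers are empty elements carrying a
NAME attribute and sit directly inside one parent element. The P wrapper,
the NAME attribute, and the function name style_ranges are assumptions of
this example, not anything GlobeTrotter or the XML draft prescribes.

```python
import xml.etree.ElementTree as ET


def style_ranges(xml_text):
    """Return (plain_text, {style_name: (start, end)}) from marker elements."""
    root = ET.fromstring(xml_text)
    text = root.text or ""
    open_at = {}  # style name -> character offset where it started
    ranges = {}   # style name -> (start, end) offsets into the plain text
    for marker in root:
        name = marker.get("NAME")
        if marker.tag == "STYLE-START":
            open_at[name] = len(text)
        elif marker.tag == "STYLE-END":
            ranges[name] = (open_at.pop(name), len(text))
        # Text that follows a marker belongs to the running plain text.
        text += marker.tail or ""
    return text, ranges


sample = ('<P>plain <STYLE-START NAME="Style1"/>one <STYLE-START NAME="Style2"/>both '
          '<STYLE-END NAME="Style1"/>two<STYLE-END NAME="Style2"/> plain</P>')
text, ranges = style_ranges(sample)
for name, (start, end) in sorted(ranges.items()):
    print(name, repr(text[start:end]))
```

Style1 ends while Style2 is still open, so the two ranges overlap without
either one being nested inside the other, and the file is still well-formed.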

It is unfortunate that Akimbo does not consider keeping its users'
documents in a standardized format a priority. That is typical of this
industry, however. One day users' needs will come first and the needs of
programmers second (I hope!). One day we may even be able to reliably
load a GlobeTrotter document into Word for Windows and vice versa
(imagine it!).

 Paul Prescod