Re: rethinking the HTML DTD.

Tim Berners-Lee (timbl)
Wed, 15 Jul 92 00:03:56 MET DST

Date: Wed, 15 Jul 92 00:03:56 MET DST
From: timbl (Tim Berners-Lee)
Message-Id: <9207142203.AA02008@ >
Subject: Re: rethinking the HTML DTD.


You say HTML is not SGML.  It is true that the HTML generated by the NeXT editor
is not good (for example, it lacks quotes around attributes which need them).
However, the current parser will parse real SGML.

I feel it IS important to keep the higher-level markup.
You ask, "Who really
uses all the "format independent" features of WWW? I haven't seen
anything that the RTF stylesheet features can't handle."

Well, the line-mode browser uses these features to generate a particular
style which is different from the Xwindows style.  The LaTeX generation
scripts which we use to make the "www book" use the high-level markup.
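
As a rough illustration of the point (a sketch only, not the actual
line-mode browser or LaTeX generation code), the same high-level markup
can drive two quite different output styles. The document representation
and the function names render_plain and render_latex are assumptions
made up for this example.

```python
# Hypothetical sketch: one structural document, two independent renderers.
# A document is a flat list of (element, text) pairs, e.g. a heading
# followed by a paragraph. Nothing here says how either should look.
DOC = [("h1", "The Web"), ("p", "Documents are linked together.")]


def render_plain(doc):
    """Terminal-like style: headings in upper case, followed by a blank line."""
    lines = []
    for tag, text in doc:
        if tag == "h1":
            lines.append(text.upper())
            lines.append("")
        else:
            lines.append(text)
    return "\n".join(lines)


def render_latex(doc):
    """Print-like style: headings become LaTeX section commands."""
    out = []
    for tag, text in doc:
        if tag == "h1":
            out.append("\\section{" + text + "}")
        else:
            out.append(text)
    return "\n".join(out)
```

Neither renderer needs formatting information stored in the document
itself; each one decides what a heading looks like in its own medium.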

It is true that HTML does not have a deep structure, so that we can
be compatible with software which cannot handle nested elements.
There is nothing wrong with having a simple SGML DTD as a basic case.
SGML does not HAVE to be complicated.  You can use SGML to map any
(non-overlapping) structure you like.
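
To make "non-overlapping" concrete, here is a minimal sketch of a check
that elements either nest or follow one another but never overlap. It
assumes every element has an explicit start tag and end tag and that
attribute values contain no ">" character; real SGML also allows tag
omission and empty elements, which this ignores. The function name
is_properly_nested is hypothetical.

```python
import re

# Matches a start tag like <h1> or an end tag like </h1>.
# Simplifying assumption: no ">" inside attribute values.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>")


def is_properly_nested(text):
    """Return True if every end tag closes the most recently opened element."""
    stack = []
    for match in TAG.finditer(text):
        closing, name = match.group(1), match.group(2).lower()
        if not closing:
            stack.append(name)
        elif not stack or stack.pop() != name:
            # End tag with nothing open, or closing the wrong element.
            return False
    # Anything still open at the end was never closed.
    return not stack
```

For example, "<h1>Title</h1><p>text</p>" passes, while "<b><i>x</b></i>"
fails because the two elements overlap.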

In the future, the web will include more complex DTDs, and dynamically
loaded DTDs, and people will want to use the same parser for them.

You suggest that we should use RTF because it is better supported.
Maybe we could use RTF in parallel in an experiment.  Some problems
I have are that

RTF uses a fudge of specially named styles to represent headings
(for example, in Word) from which the WP deduces a structure
(for outline mode, etc).

RTF has styles, but as far as I could see Microsoft RTF documents
have the actual formatting information always tucked in there even
if there is a style name attached.

RTF has various extensions for handling, for example, embedded documents
and links, but are these standardized, or are different manufacturers
going to use different tagsets in RTF just like SGML?

Perhaps I am out of date in my knowledge of RTF (I certainly am).
However, I see the WP manufacturers trying to escape from a position
where they are historically bound to an RTF view, when they would like
to be able to handle SGML.

If you're talking about displaying things, to make HTML into RTF
is trivial.  You can make HTML into MIF too. You have to add
style information of course.  When you go back you have to do this
fudge of requiring the same style names to be used.
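
The round trip can be sketched as follows. This is an illustration under
stated assumptions, not a real RTF reader or writer: documents are lists
of (name, text) pairs, and the style names "heading 1", "heading 2" and
"Normal" are simply an agreed convention chosen for the example.

```python
# Hypothetical convention mapping structural elements to style names.
STYLE_FOR_TAG = {"h1": "heading 1", "h2": "heading 2", "p": "Normal"}

# The reverse mapping only works for exactly these style names.
TAG_FOR_STYLE = {style: tag for tag, style in STYLE_FOR_TAG.items()}


def to_styled(doc):
    """Structure -> styles: the easy direction, just attach a style name."""
    return [(STYLE_FOR_TAG[tag], text) for tag, text in doc]


def to_structured(styled):
    """Styles -> structure: relies on the agreed style names being used.

    Any unknown style name falls back to a plain paragraph, so the
    structure it was meant to convey is lost.
    """
    return [(TAG_FOR_STYLE.get(style, "p"), text) for style, text in styled]
```

A heading written with the agreed name survives the round trip; the same
heading given a style called, say, "My Big Title" comes back as an
ordinary paragraph.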

So I feel RTF would be a backward step. It is true that the current
W3 software is at a point level with RTF rather than general SGML.
But why tie ourselves to that point?