Re: HTML 3.2 PR

Carl Morris (
Fri, 15 Nov 1996 21:13:21 -0600

Message-Id: <>
From: "Carl Morris" <>
To: "Harold A. Driscoll" <>,
Subject: Re: HTML 3.2 PR
Date: Fri, 15 Nov 1996 21:13:21 -0600

| ><! -- this is a comment in SGML -- >
| >
| >Why?  An HTML parser has no business trying to figure out SGML, an
| >SGML parser being used on an HTML document on the other hand may have
| >use for the declaration.  It needs to be described this way.
| Quite to the contrary. Since HTML is (by definition, see RFC 1866 in
| Abstract) an application of SGML, SGML parsers are quite appropriate
| parsers.

No.  An SGML parser is an SGML parser; an HTML parser is one like the
one I am writing, and it can't process SGML.  HTML is an application of
SGML, meaning that HTML works on the SGML operating system, if you don't
mind the metaphor, but may be ported, and even runs on its own.
Requiring SGML will only complicate matters for both Authors and for
Developers.

Can I say one thing?  You are not going to educate the entire world
about SGML.  Stop trying.

| Actually, calling <! a comment open symbol is what is misleading, since it
| is incorrect.

I wrote an additional letter stating it would be wise to address
developers and authors differently in this matter.  A developer who
is not going to process SGML had better stop after the <! and wait for
a >, because trying to find the -- will only complicate matters.
| Please, to keep this discussion on track, it would be very helpful to cite
| references for facts, and to clearly identify opinions which differ from the
| facts as such.

To clearly define "SGML Parser" all I can say is "A Parser which can
meaningfully process SGML declarations and tags" where an HTML parser
simply "lacks any SGML declaration knowledge or use".

|    To include comments in an HTML document, use a comment declaration. A
|    comment declaration consists of `<!' followed by zero or more
|    comments followed by `>'. Each comment starts with `--' and includes
|    all text up to and including the next occurrence of `--'. In a
|    comment declaration, white space is allowed after each comment, but
|    not before the first comment.  The entire comment declaration is
|    ignored.
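
The rule quoted above can be sketched as a small scanner.  This is an
illustration in Python under stated assumptions, not code from RFC 1866
or from any parser discussed in this thread: the name
skip_comment_declaration is hypothetical, and Python's str.isspace is
used as a stand-in for the white space the rule allows.

```python
def skip_comment_declaration(text: str, start: int) -> int:
    """Return the index just past a comment declaration beginning at `start`.

    Follows the quoted rule: "<!", then zero or more comments, each running
    from "--" through the next "--" and optionally followed by white space,
    then ">".  Raises ValueError for anything else, including other
    declarations such as "<!DOCTYPE ...>".
    """
    if not text.startswith("<!", start):
        raise ValueError("no '<!' at this position")
    pos = start + 2
    while text.startswith("--", pos):
        close = text.find("--", pos + 2)
        if close == -1:
            raise ValueError("unterminated comment")
        pos = close + 2
        # White space is allowed after each comment, not before the first.
        while pos < len(text) and text[pos].isspace():
            pos += 1
    if not text.startswith(">", pos):
        raise ValueError("expected '>' to close the comment declaration")
    return pos + 1
```

Under this rule a > between the two -- delimiters is part of the
comment, while white space directly after <! or any declaration that is
not made of comments is rejected by the sketch.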

Yeah, let's look at this.  An HTML comment begins with a <! and --, but
it never says what happens to the stuff in between.  Now an HTML parser
could be written to flag anything in between as an ERROR.  Is that what
you want?  No, you want an HTML parser to simply ignore the possible SGML
declarations it knows nothing about.  You can't tell people about this
without telling them all about SGML.

|       NOTE - Some historical HTML implementations incorrectly consider
|       any `>' character to be the termination of a comment.

And tell me why they can't?  A reserved character SHOULD NEVER be used
in such a system.  If < denotes the start and > denotes the end, then
a > inside such constructs cannot be used.  I don't know what in SGML
has allowed such constructs, but I do know protocols, and this is a
large error in a protocol, one that makes the protocol buggy and
usually kills the protocol in a short matter of time.  It's what
escaping was designed for, and all successful protocols use it.
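
To make the escaping argument concrete, here is a minimal sketch using
Python's standard html module.  It shows the general technique for
ordinary text content; it is an illustration only, and nothing in this
message or in the quoted RFC text says comments are handled this way.
The wrapper names escape_text and unescape_text are hypothetical.

```python
import html


def escape_text(data: str) -> str:
    # html.escape turns "&", "<" and ">" into "&amp;", "&lt;" and "&gt;";
    # quote=False leaves quotation marks alone.
    return html.escape(data, quote=False)


def unescape_text(markup: str) -> str:
    # html.unescape turns character references back into characters.
    return html.unescape(markup)
```

Once escaped, the data no longer contains a bare < or >, so a scanner
that treats every > as the end of a construct cannot be fooled by it.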

| [2] Consider the historical case of distinguishing between the number 0
| and the letter O. For many years, engineers have used a slash through the
| number 0. This convention was implemented in a number of hardware
| including various IBM printers during the mainframe era. Apparently
| of this convention, a minister turned computer programming book
| noted the confusion and introduced the convention of using a slash through
| the letter O. As you can imagine, while each usage was rather
| things got very confusing when programmers (using slash with letter O) got
| their printouts (using slash with number 0).

And so to each their own.  That's what happens when a company with the
name IBM adopts something.  A lot of people refuse to use "GUI"s
because of Microsoft's claim on Windows...  Does it surprise you that
an author would purposely do something like that?  It doesn't surprise me.