Re: HTML 3.2 PR

| ><! -- this is a comment in SGML -- >
| >
| >Why?  An HTML parser has no business trying to figure out SGML, an
| >SGML parser being used on an HTML document on the other hand may have
| >use for the declaration.  It needs to be described this way.
| 
| Quite to the contrary. Since HTML is (by definition, see RFC 1866 in the
| Abstract) an application of SGML, SGML parsers are quite appropriate HTML
| parsers.

No.  An SGML parser is an SGML parser; an HTML parser is one like the
one I am writing, and it can't process SGML.  HTML is an application of
SGML, meaning that HTML runs on the SGML operating system, if you don't
mind the metaphor, but it may be ported and can even run on its own.
Requiring SGML will only complicate matters for both authors and
developers.

Can I say one thing?  You are not going to educate the entire world
about SGML.  Stop trying.

| Actually, calling <! a comment open symbol is what is misleading, since it
| is incorrect.

I wrote an additional letter stating that it would be wise to address
developers and authors differently in this matter.  A developer who is
not going to process SGML had better stop after the <! and wait for a >,
because trying to find the -- will only complicate matters.
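
As a minimal sketch of that approach (in Python, with hypothetical
function names that are not taken from any real parser), a non-SGML
HTML parser could skip everything from <! up to the next > like this:

```python
def skip_markup_declaration(text, start):
    """Return the index just past the first '>' after the '<!' at `start`.

    Returns len(text) if no '>' is found (unterminated declaration).
    """
    if not text.startswith("<!", start):
        raise ValueError("no '<!' at the given position")
    end = text.find(">", start + 2)
    return len(text) if end == -1 else end + 1


def strip_declarations(text):
    """Drop every '<!' ... '>' span without looking inside it."""
    out = []
    i = 0
    while i < len(text):
        if text.startswith("<!", i):
            i = skip_markup_declaration(text, i)
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


print(strip_declarations("a<!-- x -->b"))      # prints "ab"
print(strip_declarations("a<!-- x > y -->b"))  # prints "a y -->b"
```

The second call shows the cost of this shortcut: a > inside the
declaration ends the skip early, and the rest leaks into the output.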

| Please, to keep this discussion on track, it would be very helpful to cite
| references for facts, and to clearly identify opinions which differ from the
| facts as such.

To clearly define "SGML parser", all I can say is "a parser which can
meaningfully process SGML declarations and tags", whereas an HTML
parser simply "lacks any SGML declaration knowledge or use".

|    To include comments in an HTML document, use a comment declaration. A
|    comment declaration consists of `<!' followed by zero or more
|    comments followed by `>'. Each comment starts with `--' and includes
|    all text up to and including the next occurrence of `--'. In a
|    comment declaration, white space is allowed after each comment, but
|    not before the first comment.  The entire comment declaration is
|    ignored.


Yeah, let's look at this.  An HTML comment begins with a <! and --, but
it never says what happens to the stuff in between.  Now an HTML parser
could be written to flag anything in between as an ERROR.  Is that what
you want?  No, you want an HTML parser to simply ignore the possible
SGML declarations it knows nothing about.  You can't tell people about
this without telling them all about SGML.
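
For comparison, here is a rough sketch of the comment declaration rule
quoted above, read literally.  It is written in Python, the function
name and return shape are my own assumptions, and it handles only
comment declarations, not any other SGML declaration:

```python
WHITESPACE = " \t\r\n"


def parse_comment_declaration(text, start=0):
    """Parse '<!' + zero or more '--...--' comments + '>' at `start`.

    Returns (comments, end), where `comments` is the list of comment
    bodies and `end` is the index just past the closing '>'.
    Returns None if the text is not a well-formed comment declaration.
    """
    if not text.startswith("<!", start):
        return None
    i = start + 2
    comments = []
    # No white space is allowed before the first comment, so '--'
    # must follow '<!' directly.
    while text.startswith("--", i):
        close = text.find("--", i + 2)
        if close == -1:
            return None  # comment never closed
        comments.append(text[i + 2:close])
        i = close + 2
        # White space is allowed after each comment.
        while i < len(text) and text[i] in WHITESPACE:
            i += 1
    if text.startswith(">", i):
        return comments, i + 1
    return None


print(parse_comment_declaration("<!-- a > b -->"))  # ([' a > b '], 14)
print(parse_comment_declaration("<! -- x -->"))     # None
```

Under this reading a > inside a comment is harmless, but anything after
<! that is not a comment (a DOCTYPE, or leading white space) is simply
rejected, which leaves open what the parser should do with it.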

|       NOTE - Some historical HTML implementations incorrectly consider
|       any `>' character to be the termination of a comment.

And tell me why they can't?  A reserved character SHOULD NEVER be used
in such a system.  If < denotes the start and > denotes the end, then a
> cannot be used inside such constructs.  I don't know what in SGML
has allowed such constructs, but I do know protocols, and this is a
large error in a protocol, one that makes the protocol buggy and
usually kills it in a short time.  It's what escaping was designed
for, and all successful protocols use it.

| [2] Consider the historical case of distinguishing between the number zero
| and the letter O. For many years, engineers have used a slash through the
| number 0. This convention was implemented in a number of hardware devices,
| including various IBM printers during the mainframe era. Apparently ignorant
| of this convention, an minister turned computer programming book author
| noted the confusion and introduced the convention of using a slash through
| the letter O. As you can imagine, while each usage was rather arbitrary,
| things got very confusing when programmers (using slash with letter O) got
| their printouts (using slash with number 0).

And so to each their own.  That's what happens when a company with the
name IBM adopts something.  A lot of people refuse to use GUIs because
of Microsoft's claim on Windows...  Does it surprise you that an author
would purposely do something like that?  It doesn't surprise me.

Received on Friday, 15 November 1996 22:14:54 UTC