Really ghastly use of markup terminology

Murray Altheim (
Fri, 24 Jan 1997 12:35:20 -0400

Message-Id: <v02140b00af0e74f23104@[]>
Date: Fri, 24 Jan 1997 12:35:20 -0400
From: (Murray Altheim)
Subject: Really ghastly use of markup terminology

Open letter to Netscape:

For a company whose living is derived from the Web and HTML, the Extensions
to HTML 2.0, Extensions to HTML 3.0 and other documents listed at

show a surprising lack of understanding of basic markup terminology. This
is not only confusing to new authors, it actively miseducates them: if they
learn the terminology from these pages, they may have difficulty with
other ideas built on these foundations.

Most egregious:


Your pages commonly use 'tag' for both 'attribute' and 'element', and
certainly don't give any indication of the differentiation between element
and tag. WIDTH is an attribute on the IMG element, not a 'tag' on the IMG
'tag'. Let's discuss these terms by way of example:

     <A HREF="index.html">Return to Index</A>

What have we here?

     This is an 'A' (anchor) element. Yes, the whole thing. Actually, the
     tags delimit the element's content.

     It is composed of multiple parts:

     The start tag:       <A HREF="index.html">
     Element content:     Return to Index
     The end tag:         </A>

     The start tag may contain an attribute specification list. This is a
     whitespace-delimited list of attribute specifications.
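
     As a rough illustration of the breakdown above, here is a minimal
     Python sketch that pulls one simple element apart into its start tag,
     content, end tag and attribute specification list. The names
     ELEMENT_RE and split_element are made up for this example; this is
     not an SGML parser, and it assumes a single non-nested element with
     no '>' or whitespace inside quoted values.

```python
import re

# Hypothetical pattern for illustration only: one simple, non-nested element.
ELEMENT_RE = re.compile(
    r"<(?P<name>[A-Za-z][A-Za-z0-9.-]*)(?P<attrs>[^>]*)>"  # start tag
    r"(?P<content>.*?)"                                     # element content
    r"</(?P=name)\s*>",                                     # end tag
    re.DOTALL,
)


def split_element(markup):
    """Split one simple element into the parts named in the letter."""
    text = markup.strip()
    m = ELEMENT_RE.fullmatch(text)
    if m is None:
        raise ValueError("not a single simple element")
    return {
        "element_name": m.group("name"),
        "start_tag": text[: m.start("content")],
        "content": m.group("content"),
        "end_tag": text[m.end("content"):],
        # Naive whitespace split; breaks on quoted values containing spaces.
        "attribute_specs": m.group("attrs").split(),
    }


parts = split_element('<A HREF="index.html">Return to Index</A>')
for label, value in parts.items():
    print(f"{label:16} {value}")
```

     The whole string is the element; the tags are only two of its parts.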

     An attribute specification in HTML may take two forms, due to
     the fact that SHORTTAG YES is specified in HTML's SGML declaration.

     The first form is the most common:

         HREF="index.html"

     where 'HREF' is considered the attribute name, followed by an equal sign,
     then an attribute value specification: either an 'attribute value literal'
     or an 'attribute value'. If it's quoted, it's the former, unquoted it's
     the latter. Why the difference? Because the only time you're allowed to
     leave off the quotes is when the content consists of only SGML NAME
     characters: it must start with [a-zA-Z0-9] and consist of only
     [a-zA-Z0-9] characters, plus hyphens or periods. This is the reason your
     use of <FONT SIZE=+1> is incorrect; it needs to be quoted, due to the
     presence of the plus sign.
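
     The quoting rule as stated above can be sketched in a few lines of
     Python. This encodes only the rule as this letter describes it (start
     with a letter or digit, then letters, digits, hyphens or periods); the
     function names are invented for the example, and values containing a
     double quote are not handled.

```python
import re

# The character rule as described in the letter, not a full SGML check.
NAME_CHARS_ONLY = re.compile(r"[A-Za-z0-9][A-Za-z0-9.-]*")


def may_be_unquoted(value):
    """True if the value consists only of the characters the rule allows."""
    return NAME_CHARS_ONLY.fullmatch(value) is not None


def attribute_spec(name, value):
    """Write NAME=value, quoting the value whenever the rule requires it."""
    if may_be_unquoted(value):
        return f"{name}={value}"
    # Simplification: assumes the value contains no double quote.
    return f'{name}="{value}"'


for name, value in [("SIZE", "+1"), ("SIZE", "3"), ("HREF", "index.html")]:
    print(attribute_spec(name, value))
```

     Under that rule SIZE=3 may stay bare, but +1 comes out as SIZE="+1".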

     The attribute value literal is formally the content typed between the
     quotes. What the parser/process derives from that is called the
     'attribute value'. It's an important distinction, because the attribute
     value is the result of entity or character entity replacements within
     the literal. The attribute value is what is used by the browser.
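
     To make the literal/value distinction concrete, here is a small
     Python sketch. It leans on the standard library's html.unescape, which
     follows HTML5 reference-replacement rules rather than those of an SGML
     parser, so treat it as an approximation; attribute_value is a name
     invented for the example.

```python
from html import unescape


def attribute_value(literal):
    """Derive the attribute value from the literal typed between the quotes."""
    # Entity and character references in the literal are replaced here.
    return unescape(literal)


literal = "Tom &amp; Jerry"
print("literal:", literal)
print("value:  ", attribute_value(literal))
```

     The literal is what the author typed; the value is what the browser
     ends up using.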

     The second form is used in HTML in a sort of bass-ackward manner.
     Minimization rules allow the attribute name and the VI (value
     indicator, the equal sign) to be left out, so the attribute value
     specification is 'self-describing:'

         <IMG SRC="foo.gif" BORDER>

     which actually required a hack in the HTML DTD to allow it. Its
     non-minimized form is:

         <IMG SRC="foo.gif" BORDER=BORDER>
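
     A sketch of how the minimized form could be expanded, in Python. The
     ENUMERATED table is a hypothetical stand-in for what a parser would
     read from the DTD (it is not taken from any real DTD), and
     expand_minimized is a name made up for this example. The point is
     that the bare token is a value, and the attribute name is found by
     looking up which attribute allows that value.

```python
# Hypothetical DTD excerpt: per element, attributes with enumerated values.
ENUMERATED = {
    "IMG": {"BORDER": ["BORDER"], "ISMAP": ["ISMAP"]},
}


def expand_minimized(element, token):
    """Expand a bare value such as BORDER to its full NAME=value form."""
    if "=" in token:
        return token  # already a full attribute specification
    value = token.upper()
    for attr_name, allowed in ENUMERATED.get(element, {}).items():
        if value in allowed:
            return f"{attr_name}={value}"
    raise ValueError(f"{token!r} is not an enumerated value on {element}")


specs = ['SRC="foo.gif"', "BORDER"]
print("<IMG " + " ".join(expand_minimized("IMG", s) for s in specs) + ">")
```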

     Elements that have neither content nor an end tag are considered
     'empty elements', e.g., IMG.

Making a few changes to these pages would be of great service to the Web
community. Also, let's not pretend that Netscape was active in the IETF
HTML working group, or keep up the false notion that Netscape's extensions
didn't break existing browsers. Frames? Come on, get real. They still break

BTW, it'd be lovely if you folks one day released a Netscape DTD.

A copy of this page is also online at


    Murray Altheim, Program Manager
    Spyglass, Inc., Cambridge, Massachusetts
    email: <>
    http:  <>
           "Give a monkey the tools and he'll eventually build a typewriter."