Message-Id: <email@example.com>
Date: Fri, 24 Jan 1997 12:35:20 -0400
To: firstname.lastname@example.org
From: email@example.com (Murray Altheim)
Subject: Really ghastly use of markup terminology
Cc: firstname.lastname@example.org

Open letter to Netscape:

For a company whose living is derived from the Web and HTML, the
Extensions to HTML 2.0, Extensions to HTML 3.0 and other documents
listed at

    http://home.netscape.com/assist/net_sites/

show a surprising lack of understanding of basic markup terminology.
This is not only confusing to new authors, it is a potential
miseducation that damages their understanding of the technology: if
they form an understanding about the terminology from these pages, they
may have difficulty understanding other ideas based on these
foundations.

Most egregious:

    Element
    Attribute
    Tag

Your pages commonly use 'tag' for 'attribute' and 'element' for 'tag',
and certainly don't give any indication of the differentiation between
element and tag. WIDTH is an attribute on the IMG element, not a 'tag'
on the IMG 'tag'.

Let's discuss these terms by way of example:

    <A HREF="index.html">Return to Index</A>

What have we here? This is an 'A' (anchor) element. Yes, the whole
thing. Actually, the tags delimit the element's content. It is composed
of multiple parts:

    The start tag:    <A HREF="index.html">
    Element content:  Return to Index
    The end tag:      </A>

The start tag may contain an attribute specification list. This is a
whitespace-delimited list of attribute specifications. An attribute
specification in HTML may take two forms, due to the fact that
SHORTTAG YES is specified in HTML's SGML declaration.

The first form is the most common:

    HREF="index"

where 'HREF' is considered the attribute name, followed by an equals
sign, then an attribute value specification: either an 'attribute value
literal' or an 'attribute value'. If it's quoted, it's the former;
unquoted, it's the latter. Why the difference?
Because the only time you're allowed to leave off the quotes is when
the content consists of only SGML NAME characters: it must start with
[a-zA-Z0-9] and consist of only [a-zA-Z0-9] characters, plus hyphens or
periods. This is the reason your use of

    <FONT SIZE=+1>

is incorrect; it needs to be quoted, due to the presence of the plus
sign.

The attribute value literal is formally the content typed between the
quotes. What the parser/process derives from that is called the
'attribute value'. It's an important distinction, because the attribute
value is the result of entity or character entity replacements within
the literal. The attribute value is what is used by the browser.

The second form is used in HTML in a sort of bass-ackward manner.
Minimization rules allow the attribute name and VI (the value
indicator, i.e. the equals sign) to be left out, so the attribute value
specification is 'self-describing':

    <IMG SRC="foo.gif" BORDER>

which actually required a hack in the HTML DTD to allow it. Its
non-minimized form is:

    <IMG SRC="foo.gif" BORDER=BORDER>

Elements that have neither content nor an end tag are considered
'empty elements', e.g., IMG.

Making a few changes to these pages would be of great service to the
Web community. Also, let's not pretend that Netscape was active in the
IETF HTML working group, or perpetuate the false notion that Netscape's
extensions didn't break existing browsers. Frames? Come on, get real.
They still break browsers.

BTW, it'd be lovely if you folks one day released a Netscape DTD.

A copy of this page is also online at

    http://www.cm.spyglass.com/doc/mlterms.html

Murray

```````````````````````````````````````````````````````````````````````````````
Murray Altheim, Program Manager
Spyglass, Inc., Cambridge, Massachusetts
email: <mailto:email@example.com>
http: <http://www.cm.spyglass.com/murray/murray.html>
"Give a monkey the tools and he'll eventually build a typewriter."
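
The quoting rule described in the letter (an attribute value may go unquoted only if it starts with [a-zA-Z0-9] and otherwise contains only those characters, hyphens, or periods) can be sketched in a few lines of Python. This is only a rough illustration of the rule as worded above, not a full SGML parser; the function names `may_omit_quotes` and `format_attribute` are hypothetical and do not come from any HTML tool or library.

```python
import re

# Rule as stated in the letter: first character alphanumeric, remaining
# characters alphanumeric, hyphen, or period. (Assumption: this mirrors
# the letter's wording, not the complete SGML grammar.)
_UNQUOTED_OK = re.compile(r"[A-Za-z0-9][A-Za-z0-9.-]*")


def may_omit_quotes(value: str) -> bool:
    """Return True if `value` may be written without quotes."""
    return _UNQUOTED_OK.fullmatch(value) is not None


def format_attribute(name: str, value: str) -> str:
    """Render one attribute specification, quoting only when required."""
    if may_omit_quotes(value):
        return f"{name}={value}"
    # Replace embedded double quotes so the literal stays well-formed.
    escaped = value.replace('"', "&quot;")
    return f'{name}="{escaped}"'


if __name__ == "__main__":
    for attr, val in [("SIZE", "+1"), ("SRC", "foo.gif"), ("BORDER", "BORDER")]:
        print(format_attribute(attr, val))
```

Under this sketch, `SIZE` with the value `+1` comes out quoted because of the plus sign, while `foo.gif` and `BORDER` pass the check and may stand unquoted.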