HTML 4.0 - nitpicking "A brief SGML tutorial"

Dianne Gorman (
Wed, 16 Jul 1997 05:26:52 +0000

A few comments on the SGML tutorial:

Firstly, could I second the earlier pleas for the specification to 
contain a fuller discussion of both HTML comment syntax and 
collapsing white space.

  :  2.2 Elements [1]

  :  The SGML definition of HTML specifies that some HTML elements 
  :  are not required to have end tags. The definition of each 
  :  element in the reference manual indicates whether it requires an 
  :  end tag.
Could there also be a brief mention of the few elements that do 
not require start tags?  This question comes up reasonably 
frequently on newsgroups etc.
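
As a rough illustration of where that information lives, here is a 
minimal Python sketch that reads the two tag-minimization flags in an 
element declaration (the first is for the start tag, the second for the 
end tag; "O" means omissible, "-" means required). The declarations are 
illustrative placeholders with simplified content models, not copied 
from the specification, and start_tag_optional() is a name I made up.

```python
import re

# Illustrative declarations in the style of the HTML DTD; the content
# models are simplified placeholders, not taken from the specification.
DECLARATIONS = """
<!ELEMENT HEAD O O (TITLE)>
<!ELEMENT BODY O O (P)*>
<!ELEMENT P    - O (#PCDATA)*>
<!ELEMENT A    - - (#PCDATA)*>
"""

# Element name, start-tag flag, end-tag flag.
DECLARATION = re.compile(r"<!ELEMENT\s+(\S+)\s+([-O])\s+([-O])\s")

def start_tag_optional(dtd_text):
    """Names of the elements whose start tag may be omitted."""
    return [name
            for name, start, _end in DECLARATION.findall(dtd_text)
            if start == "O"]

print(start_tag_optional(DECLARATIONS))  # ['HEAD', 'BODY']
```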

  :  3.3 Entity Definitions [2]
  :  You will encounter two DTD entities frequently in the HTML DTD: 
  :  %inline and %block. They are used when the content model 
  :  includes inline and block level elements respectively.
I think this should be %blocklevel rather than %block, since
<!ENTITY % block "(%blocklevel | %inline)*">
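
To show why the distinction matters, here is a minimal Python sketch of 
parameter entity expansion. Only the definition of %block is taken from 
the DTD; the definitions of %blocklevel and %inline are hypothetical, 
much-shortened stand-ins, and expand() is my own helper rather than 
part of any SGML tool.

```python
import re

# Hypothetical, heavily abbreviated entity table. Only "block" is
# copied from the DTD; the other two entries are placeholders.
ENTITIES = {
    "blocklevel": "P | UL | PRE",
    "inline": "#PCDATA | EM | A",
    "block": "(%blocklevel | %inline)*",
}

# A parameter entity reference: "%name" with an optional trailing ";".
REFERENCE = re.compile(r"%([A-Za-z][A-Za-z0-9.-]*);?")

def expand(text):
    """Replace parameter entity references until none are left."""
    while True:
        expanded = REFERENCE.sub(lambda m: ENTITIES[m.group(1)], text)
        if expanded == text:
            return expanded
        text = expanded

print(expand("%blocklevel"))  # P | UL | PRE
print(expand("%block"))       # (P | UL | PRE | #PCDATA | EM | A)*
```

With these stand-in definitions, %block expands to a mix of block level 
and inline elements, so it is %blocklevel that names the block level 
elements alone.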

  :  3.4 Element Definitions - Content model definitions [3]
  :  A | B
  :  Both A and B are permitted in any order.
I thought this meant either A _or_ B.
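
A small Python sketch of what I mean, assuming single-letter element 
names so that a sequence of children can be checked with a regular 
expression (the helper matches() is my own invention). The bare group 
(A | B) accepts one element; it is only with the "*" occurrence 
indicator that A and B can both appear, in any order.

```python
import re

# Element names are single letters, so a sequence of child elements can
# be written as a plain string and checked with a regular expression.
EITHER = re.compile(r"(?:A|B)")      # content model (A | B)
ANY_MIX = re.compile(r"(?:A|B)*")    # content model (A | B)*

def matches(pattern, children):
    """True if the whole sequence of children fits the content model."""
    return pattern.fullmatch("".join(children)) is not None

print(matches(EITHER, ["A"]))             # True
print(matches(EITHER, ["A", "B"]))        # False
print(matches(ANY_MIX, ["B", "A", "B"]))  # True
```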

  :  A & B
  :  A and B must both occur once, but may do so in any order.
Could the word "once" be omitted (leaving the frequency issue to the 
definitions of the occurrence indicators)? It seems to me that, as it 
stands, it could be read as meaning they must occur once only, 
confusing the role of the connectors and the frequency indicators.
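
To make the separation concrete, here is a minimal Python sketch, again 
assuming single-letter element names; and_group() and matches() are 
hypothetical helpers of mine. The "&" connector only says that every 
member of the group appears, in any order. How often an element may 
occur comes from the occurrence indicator attached to that member.

```python
import re
from itertools import permutations

def and_group(*members):
    """Regex for an SGML '&' group: every member, in any order.

    Each member is a single-letter element name, optionally followed
    by an occurrence indicator (?, + or *).
    """
    orders = ("".join(order) for order in permutations(members))
    return re.compile("(?:" + "|".join(orders) + ")")

def matches(pattern, children):
    """True if the whole sequence of children fits the content model."""
    return pattern.fullmatch("".join(children)) is not None

plain = and_group("A", "B")      # (A & B)
repeated = and_group("A+", "B")  # (A+ & B)

print(matches(plain, ["B", "A"]))          # True
print(matches(plain, ["A", "A", "B"]))     # False
print(matches(repeated, ["A", "A", "B"]))  # True
```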

  :  In this example, the -(A) signifies that the element A cannot be 
  :  included in another A element (i.e., anchors may not be nested).
  :  <!ELEMENT A - - (%text)* -(A)>
  :  Note that the A element is part of the DTD entity %inline, but 
  :  is excluded explicitly because of -(A).
The content model in the element declaration should contain %inline, 
not %text.
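
For what it is worth, the effect of the exclusion can be sketched in a 
few lines of Python. The tree representation (a tuple of element name 
and list of children, with plain strings for text) and the function 
name are assumptions of mine, purely for illustration. The point is 
that -(A) applies at any depth inside the A element, not just to its 
direct children.

```python
def violates_exclusion(node, excluded="A", inside=False):
    """True if an `excluded` element occurs anywhere inside another one.

    A node is either a text string or a (name, children) tuple.
    """
    if isinstance(node, str):
        return False
    name, children = node
    if name == excluded and inside:
        return True
    inside = inside or name == excluded
    return any(violates_exclusion(child, excluded, inside)
               for child in children)

nested = ("A", ["see ", ("EM", [("A", ["here"])])])
siblings = ("P", [("A", ["one"]), " and ", ("A", ["two"])])

print(violates_exclusion(nested))    # True
print(violates_exclusion(siblings))  # False
```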

  :  3.5 Attribute Definitions - DTD entities in attribute 
  :      definitions [4]
  :  In this example, we see that the attribute definition list for 
  :  the LINK element begins with the %attrs entity.
  :  <!ATTLIST LINK
  :  %attrs; -- id, class, style, lang, dir, title --
  :  href %URL #IMPLIED -- URL for linked resource --
  :  ...more of the definition...
  :  >
  :  The %attrs entity expands to:
  :  <!ATTLIST P
  :  id ID #IMPLIED -- document-wide unique id --
  :  class CDATA #IMPLIED -- comma list of class values --
  :  style CDATA #IMPLIED -- associated style info --
  :  title CDATA #IMPLIED -- advisory title/amplification --
  :  lang NAME #IMPLIED -- [RFC1766] language value --
  :  dir (ltr|rtl) #IMPLIED -- direction for weak/neutral text --
  :  align (left|center|right|justify) #IMPLIED
  :  >
  :  The %attrs entity has been defined for convenience since these 
  :  seven attributes are defined for most HTML elements.
The example starts with LINK but changes to P, which is a bit 
confusing.  %attrs now eventually expands to a few more than 7 
attributes, and no longer includes align. Perhaps BR and %coreattrs 
would provide a simpler example?

  :  Similarly, the DTD defines the %URL entity as expanding 
  :  into the string CDATA.
  :  -- The term URL means a CDATA attribute
  :  whose value is a Uniform Resource Locator,
  :  See [RFC1808] and [RFC1738]
  : -->
The comment in the current DTD now actually reads:
    -- a Uniform Resource Locator,
       see [RFC1808] and [RFC1738]

  :  As this example illustrates, the entity %URL provides readers of 
  :  the DTD with more information as to the type of data expected 
  :  for an attribute. Similar entities have been defined for %color,
  :  %Content-Type, %Length, %Pixels, etc. 
It is now called %ContentType (no hyphen).

OK.  I know I've been particularly pedantic.  If anyone wants to 
nitpick in return, you might like to point out to me the errors in my 
hierarchy of content models for HTML 4.0 at


Dianne Gorman (