HTML 3.2: HEAD content model

Joe English (joe@trystero.art.com)
Thu, 16 May 1996 15:07:20 PDT


Message-Id: <9605162207.AA18925@trystero.art.com>
To: www-html@w3.org
Cc: connolly@w3.org



Regarding the HTML 3.2 DTD, revision 1.1, stamped

    $Id: HTML3.2.dtd,v 1.1 1996/05/06 22:11:23 connolly Exp $


The content model for the HEAD element is more restrictive
than it was in HTML 2.0.  It is unclear whether this was
intended, but as a result some documents that validate
against the "HTML 2.0 Strict" DTD will not validate against
the new one.


Summary of the problem:

HTML 2.0 explicitly allows META and LINK elements to appear anywhere
inside the HEAD element, but the Wilbur (HTML 3.2) content model:

    <!ENTITY % head.content "TITLE & ISINDEX? & BASE? & STYLE? &
                            SCRIPT* & META* & LINK*">

requires all the META elements to appear together, all the LINK
elements to appear together, and likewise for SCRIPT, because each
member of an AND group can be matched only once.  For example, it
allows the sequences

	TITLE, META, META, LINK, LINK

and

	LINK, LINK, TITLE, META, META

but does *not* allow:

	LINK, META, TITLE, LINK, META
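
The ordering constraint can be checked mechanically.  The following
is a minimal Python sketch, not part of the DTD: it treats each member
of the AND group as matching at most one contiguous run of elements.
The names `wilbur_head_ok` and `MAX_RUN` are made up for illustration.

```python
from itertools import groupby

# Members of the Wilbur HEAD AND group and the longest run each may match
# (None = unbounded, for the starred members).  Illustrative table only.
MAX_RUN = {"TITLE": 1, "ISINDEX": 1, "BASE": 1, "STYLE": 1,
           "SCRIPT": None, "META": None, "LINK": None}

def wilbur_head_ok(children):
    """Return True if the child sequence fits
    TITLE & ISINDEX? & BASE? & STYLE? & SCRIPT* & META* & LINK*."""
    seen = set()
    for name, run in groupby(children):
        length = len(list(run))
        if name not in MAX_RUN or name in seen:
            return False          # unknown element, or a second separate run
        if MAX_RUN[name] is not None and length > MAX_RUN[name]:
            return False          # e.g. two TITLEs in a row
        seen.add(name)
    return "TITLE" in seen        # TITLE is the only required member

print(wilbur_head_ok(["TITLE", "META", "META", "LINK", "LINK"]))  # True
print(wilbur_head_ok(["LINK", "LINK", "TITLE", "META", "META"]))  # True
print(wilbur_head_ok(["LINK", "META", "TITLE", "LINK", "META"]))  # False
```

The third sequence fails because LINK and META each occur in two
separate runs.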


Suggested fix:

There's really no elegant way to specify the desired syntax
in SGML.  HTML 2.0 used an inclusion exception on the HEAD element
(to allow META and LINK "anywhere inside"), and a corresponding
exclusion exception on TITLE (to say "except here").  This
approach would work here too, but a matching exclusion exception
must then be added to every new non-empty HEAD element,
such as SCRIPT and STYLE.
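
As a rough illustration of what the inclusion exception buys, here is
a small Python sketch.  It assumes the HTML 2.0 base model for HEAD is
TITLE & ISINDEX? & BASE? and looks only at the direct children of HEAD,
so it does not model the exclusion exception on TITLE.  The function
and variable names are hypothetical.

```python
INCLUDED = {"META", "LINK"}          # the inclusion exception on HEAD
OPTIONAL_ONCE = {"ISINDEX", "BASE"}  # assumed base model: TITLE & ISINDEX? & BASE?

def html2_head_ok(children):
    """Check a flat list of HEAD children, HTML 2.0 style (sketch)."""
    # Included elements may appear anywhere, so set them aside first.
    core = [c for c in children if c not in INCLUDED]
    if core.count("TITLE") != 1:
        return False
    rest = [c for c in core if c != "TITLE"]
    # What remains must be known optional elements, each at most once.
    return all(c in OPTIONAL_ONCE for c in rest) and len(rest) == len(set(rest))

print(html2_head_ok(["LINK", "META", "TITLE", "LINK", "META"]))  # True
```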

Another solution is:

    <!ENTITY % head.any "(SCRIPT | META | LINK)*"
	-- repeatable HEAD elements -->

    <!ENTITY % head.content
	"(%head.any;,
	   ((TITLE,	%head.any;)  &
	    (ISINDEX,	%head.any;)? &
	    (BASE,	%head.any;)? &
	    (STYLE,	%head.any;)? ))"
    >

This is also rather awkward, but it does keep all the
awkwardness localized to one declaration, and avoids the
use of inclusion exceptions for elements which are not,
semantically, inclusions.
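
Read as a constraint on the children of HEAD, the declaration above
amounts to: exactly one TITLE; at most one each of ISINDEX, BASE and
STYLE; and any number of SCRIPT, META and LINK anywhere.  A minimal
Python sketch of that reading follows (the names are illustrative, and
the sketch says nothing about whether an SGML parser would accept the
model as unambiguous):

```python
from collections import Counter

REPEATABLE = {"SCRIPT", "META", "LINK"}      # members of %head.any;
AT_MOST_ONCE = {"ISINDEX", "BASE", "STYLE"}  # optional, non-repeatable

def proposed_head_ok(children):
    """Check a flat list of HEAD children against the proposed model."""
    counts = Counter(children)
    if counts["TITLE"] != 1:
        return False                  # TITLE is required exactly once
    for name, n in counts.items():
        if name == "TITLE" or name in REPEATABLE:
            continue
        if name not in AT_MOST_ONCE or n > 1:
            return False              # unknown element or a duplicate
    return True

print(proposed_head_ok(["LINK", "META", "TITLE", "LINK", "META"]))  # True
```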

A third solution -- which would not be suitable for a descriptive DTD
like HTML 3.2 but might be useful for a prescriptive one -- would be
to observe that since the relative order of the elements in the HEAD
does not matter, it is just as well to _choose an arbitrary order
and enforce that_.  For example:

    <!ELEMENT HEAD - -
	    (TITLE, BASE?, ISINDEX?, META*, LINK*, STYLE*, SCRIPT*) >

This particular order was chosen to make sure that the BASE
element precedes any LINK or ISINDEX elements which it might affect,
and that the elements important to search engines -- TITLE, META, and
LINK -- appear before any STYLE or SCRIPT elements (which might cause
HTML Level 2-aware agents to prematurely terminate the HEAD element).
That is, while this content model is not backwards-compatible with
_HTML 2.0 documents_, it would ensure that new documents are
backwards-compatible with _HTML Level 2 agents_.
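
Because this model uses only the sequence connector, it maps directly
onto a regular expression.  A minimal Python sketch, with hypothetical
names, that matches the space-joined child element names:

```python
import re

# Same order as the fixed-order content model:
# (TITLE, BASE?, ISINDEX?, META*, LINK*, STYLE*, SCRIPT*)
FIXED_ORDER = re.compile(
    r"TITLE( BASE)?( ISINDEX)?( META)*( LINK)*( STYLE)*( SCRIPT)*")

def fixed_order_ok(children):
    """Return True if the children appear in the prescribed order."""
    return FIXED_ORDER.fullmatch(" ".join(children)) is not None

print(fixed_order_ok(["TITLE", "META", "META", "LINK", "LINK"]))  # True
print(fixed_order_ok(["LINK", "LINK", "TITLE", "META", "META"]))  # False
```

The second sequence is valid under the Wilbur model but is rejected
here, which is the sense in which this model is prescriptive.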

Less restrictive variations such as:

    <!ELEMENT HEAD - -
	    (TITLE, BASE?, ISINDEX?, (META | LINK | STYLE | SCRIPT)*) >

are also possible, of course.  Again, these would not be appropriate
for a descriptive DTD like HTML 3.2, but may be suitable for another
DTD designed to prescribe recommended practice.
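
For comparison, a minimal Python sketch of the less restrictive
variation, again as a regular expression over the space-joined child
names (the names `RELAXED` and `relaxed_head_ok` are illustrative):

```python
import re

# (TITLE, BASE?, ISINDEX?, (META | LINK | STYLE | SCRIPT)*)
RELAXED = re.compile(
    r"TITLE( BASE)?( ISINDEX)?( (META|LINK|STYLE|SCRIPT))*")

def relaxed_head_ok(children):
    """Return True if TITLE, BASE and ISINDEX lead, in that order."""
    return RELAXED.fullmatch(" ".join(children)) is not None

print(relaxed_head_ok(["TITLE", "SCRIPT", "META", "LINK", "META"]))  # True
print(relaxed_head_ok(["LINK", "TITLE"]))                            # False
```

Only the position of TITLE, BASE and ISINDEX is fixed; the remaining
elements may be interleaved freely.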


*EOF*

--Joe English

  joe@art.com