Re: Why DOCTYPE Declarations for XHTML?

On Sat, 15 Jan 2000, Murray Altheim wrote:
> Daniel Hiester wrote:
> > 

> > > The interoperability is supposed to stem from the verifiable
> > > validity of the markup, i.e. that the content conforms to a certain
> > > type.  The *mere* presence of a doctype declaration - as some
> > > inscrutable string - has no bearing on the needed guarantee.  IMHO.
> 
> That's how in XML 1.0 one attaches a DTD to a document instance. 

In W3C-speak, it's how one specifies a (machine processable) schema for
a document instance.  Cf. 4.103 in ISO 8879:

: 4.103 (document) type declaration: A markup declaration that formally
: specifies a portion of a document type definition.
: NOTE - A document type declaration does not specify all of a document
: type definition because part of the definition, such as the semantics
: of elements and attributes, cannot be expressed in SGML.  [...]

Summary: "specifies a definition", not "declares the type".  Yet the
myth that a *document type* is somehow being "declared" simply refuses
to die.  Daniel has put his finger on the real underlying issue, and
once again I'll cite Eliot Kimber's explanation of why the doctype
declaration - or worse, formulaic scrutiny of a DTD FPI - is NOT the way
to invoke a document type as the semantic intent.  The syntactic
validity of the instance, which is where the doctype declaration fits in,
is orthogonal to this very real need.

  http://www.dejanews.com/getdoc.xp?AN=325927738

In fact, the XHTML Modularization document seems subject to the same
erroneous belief.  In 3.1 of the Conformance section, there is:

:  2. The document type must have a unique identifier as defined in 
:     Naming Rules.
:  3. The document type must include, at a minimum, the Structure,
:     Hypertext, Basic Text, and List modules defined in this
:     specification.

In both, 'document type' needs to be replaced by 'DTD' or 'declaration
subset', because 'document type' in the Terms section is clearly semantic
in its intended meaning/implication.  Especially 2 - the unique identifier
is the "name" of a declaration subset, not a document type.

Pardon me, Murray, but it seems you subscribe to the belief too, when you 
write:

> The reason why XHTML requires a DOCTYPE is that we're not defining a
> 'tag set' (to use I believe Dave Raggett's terminology), we're
> defining a 'markup language'.

and 

> But the presence of DOCTYPE declares what document type the author (if
> they are even aware of this, given most HTML editors) aspires their
> markup to conform to.

Is it also safe to conclude that the consensus of the WG (if the question
were ever raised) would be that Eliot is wrong? 

> By declaring conformance to XHTML as being able to validate a document
> according to an XHTML DTD we're attempting to guarantee a level of
> interoperability. 

The requirement is a statement about the *effective content* of a doctype
declaration ("specifies a definition" ==> definition actually specified)
in a putative XHTML document.  Something on the order of:

   the (contents of) "-//W3C//DTD XHTML 1.1//EN", 
   the whole (contents of) "-//W3C//DTD XHTML 1.1//EN", 
   and nothing but the (contents of) "-//W3C//DTD XHTML 1.1//EN".

One simple way to do that (like the older HTML specs) is to prohibit
internal subsets and require the specific FPI.
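
As a rough illustration of that rule, here is a minimal Python sketch.  It
assumes the FPI quoted above is the required one; the names check_doctype
and DOCTYPE_RE and the system identifier "xhtml11.dtd" in the usage note
are hypothetical, and a regular expression is only a stand-in for a real
SGML/XML parser.  It inspects the declaration only and says nothing about
whether the instance is actually valid.

```python
import re

# Assumption: the FPI quoted in this message is the one the spec would require.
REQUIRED_FPI = "-//W3C//DTD XHTML 1.1//EN"

# Hypothetical pattern: a DOCTYPE declaration with a PUBLIC identifier,
# an optional system identifier, and an optional internal subset.
DOCTYPE_RE = re.compile(
    r"""<!DOCTYPE\s+(?P<name>[^\s\[>]+)
        \s+PUBLIC\s+(?P<q1>["'])(?P<fpi>.*?)(?P=q1)
        (?:\s+(?P<q2>["'])(?P<system>.*?)(?P=q2))?
        \s*(?P<subset>\[.*?\])?\s*>""",
    re.VERBOSE | re.DOTALL,
)


def check_doctype(text):
    """Return (ok, reason) for the first DOCTYPE declaration found in text."""
    m = DOCTYPE_RE.search(text)
    if m is None:
        return False, "no DOCTYPE with a PUBLIC identifier"
    if m.group("subset") is not None:
        # "nothing but the contents": local declarations are prohibited.
        return False, "internal subset present"
    if m.group("fpi") != REQUIRED_FPI:
        # Exact, case-sensitive comparison; no substring sniffing.
        return False, "unexpected FPI"
    return True, "ok"
```

For example, check_doctype('<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"xhtml11.dtd">') passes, while the same declaration with an internal subset,
or with the FPI in lower case, is rejected.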

> This isn't exactly new technology; people have been successfully
> using DOCTYPE declarations for this purpose for probably twenty years.

It has been no small mercy that they didn't have to contend with
open-ended systems like the Web, where a FrontPage can extrude <FONT> and
<TABLE> and label the stuff "-//IETF//DTD HTML 2.0//EN", or a Netscape
Composer can decide lower case "-//w3c//dtd html 4.0 transitional//en"
must be kind of cool, or a Mozilla Project can seriously consider varying its
processing mode by checking for "HTML 4.0" in the "doctype tag".


Arjun

Received on Sunday, 16 January 2000 02:55:45 UTC