A modest proposal (DTD for DTDs revisited)

I'd like to propose what I take to be an improved approach to the syntax of
such things (taking into account some of the decisions already taken). I am
also putting in the things that I think we should have, that I don't expect
some people to like. I'm also going to just give examples, rather than a
grammar. If the examples aren't clear enough to make sense, then that's
useful data itself.

The lack of formal semantics also reinforces the notion that this is a
preliminary proposal, not a hard and fast prescription. I hope that it may
be a wooden (or at least wicker)-man, but even if only a strawman, it may
serve a purpose. The syntax is not defined because it will be whatever
syntax we finally choose for document instances. I have used the proposed
<sdf/> empty tag syntax, as a shining example of syntactic disambiguation.

<document-type tag-root="document"/>

<define-tag name="document" contains="chapter+"/>
<define-tag name="chapter" contains="title, p+"/>
<define-tag name="p" contains="#DATA em strong li footnote"/>
                               <!-- above == (#PCDATA | em | strong ...)* -->

<define-tag name="em" contains="#MODEL low-soup"/>
<define-tag name="strong" contains="#MODEL low-soup"/>
<define-model name="low-soup" contains="#DATA em strong"/>

<define-attribute name="xref" type="#IDREF" value-spec="required"/>
<define-attribute name="index-term" type="#DATA" value-spec="implied"/>
<define-attribute name="pickable" type="yes|no" value-spec="#DEFAULT yes"/>

<!-- yes, you can have a defualt value with a # in it -->
<define-attribute name="weirdo" type="#DATA" value-spec="#DEFAULT #DEFAULT"/>

<!-- other value-specs are possible -->
<define-attribute name="Hy-foo" type="#DATA" value-spec="#FIXED manly-form"/>

<!-- let's make attlists separate from elements, separate from attributes -->
<!-- A shortcut to define an attlist, and apply it to an element in one
     declaration, possibly without even naming it, would prbably be a good
     idea. Perhaps adding:
         apply-to="element1 element2"
     to define-attlist would do. -->

<define-attlist name="highlighted" atts="xref index-term pickable"/>
<apply-attlist name="highlighted" to="em title"/>

<define-attlist name="hyper-highlight" atts="weirdo Hy-foo pickable"
<dtd-doc><p>This is the way comments should really be done.</p
><p>We can even add a way to define a dtd fragment for the dtd-doc
>contents. To do this we need to find a way to define the DTD for the
>contents without preventing the DTD itself from serving as an

<apply-attlist name="hyper-highlight" to="strong"/>

<!-- Note: except for relaxing the unique keyword-value restriction, all
     of this can be compiled to SGML by a pretty simple processor. -->

<!-- some things we might do in whatever we choose as the doc subset -->

<define-tag name="codeword" occurs-in="p"/>
<!-- a mix-in tag. Valid at any point in processing of top level of a <p>.
     Not an inclusion exception -->
<define-tag name="deemphasize" occurs-in="p em strong"/>
<!-- some tags mix more than others -->

<define-tag name="footnote" contains="#DATA em strong li footnote"
            legal-anywhere-within="p" illegal-anywhere-within="footnote"/>
                <!-- maybe something like contains="#MODEL p" would be nice
                     for inheriting content models from existing elements -->
                <!-- notice that we can't comment within a tag. -->
                <!-- we keep inclusion and exclusion, but don't make them
                     too easy to abuse -->

Other syntactic options might also make sense:

    We should probably use an ID attribute instead of the name attribute in
the above defining forms (to aid in cross-references, and the like). The

   Even with the fixed name, it doesn't look too bad:
<define-tag id="chapter" contains="title, p+">

   On the other hand, since attributes, content-models, tags, and attlists
all havewe should perhaps allow multiple ID attributes per element, and
vary the name of the attribute for the item being defined, to make DTD
processing and cross-referencing with XML tools easier.

   I abandoned SGML technical terms for common parlance in the definition
syntax. This could create some confusion for SGML geeks, but removes a
significant barrier to initial understanding (and hence enthusiasm) for the
HTML user in the street. The distinction between tag and element is useful,
but can be pretty well approximated by "tag" and "tag and its contents." I
think I went a bit too far in simplifying things, but if it would increase
the user-base that would be a net gain.

    This contains just about everything I would want in a DTD language, in
a syntax that I would prefer. I didn't include entity declaration syntax,
but you can imagine. I would (of course) include a "new-character"
declaration that give an entity name, and descriptive text, and would map
to an SGML SDATA entity.

   Apologies for any infelicities or typos in the above, but I am in need
of rest now.

   -- David

David Durand              dgd@cs.bu.edu  \  david@dynamicDiagrams.com
Boston University Computer Science        \  Sr. Analyst
http://www.cs.bu.edu/students/grads/dgd/   \  Dynamic Diagrams
--------------------------------------------\  http://dynamicDiagrams.com/
MAPA: mapping for the WWW                    \__________________________