Re: B.7 Conditional inclusion in DTDs? from Len Bullard on 1996-10-15 (w3c-sgml-wg@w3.org from October 1996)

From: Len Bullard <cbullard@HiWAAY.net>
Date: Tue, 15 Oct 1996 10:58:01 -0500
To: "Eve L. Maler" <elm@arbortext.com>
CC: Charles@sgmlsource.com, Michael Sperberg-McQueen <U35395@UICVM.CC.UIC.EDU>, W3C SGML Working Group <w3c-sgml-wg@w3.org>
Message-ID: <3263B489.199A@HiWAAY.net>

Eve L. Maler wrote:
> As one of the perpetrators of outrageously parameterized DTDs,
> I have to disagree.  Most DTDs don't need and shouldn't use
> parameterization on the level of a DocBook or TEI, but any DTD
> with a very diverse user base typically needs customization in
> order to be useful.  (Many more people cannibalize DocBook than
> use it out of the box.)

We have to agree to disagree and then agree.  As someone sitting here at
the moment with a requirement to write a stylesheet for DocBook: as
beautiful a piece of SGML as it is, it is awfully hard to tell the user
of, for example, Author/Editor why some of these combinations are legal.
But, yes, it can be subset, and that is the typical solution.  Two years
ago we brought up the use of subsets and provable subsets on CTS for
this reason.

> Since these audiences don't share any particular DTD management
> system, the ability to customize is needed in 8879 form.  And
> maintenance truly is a nightmare if you're maintaining both your
> changed portions and the original portions, which you need to do
> whenever the base DTD gets an upgrade (as DTDs with huge audiences
> tend to do regularly).  Plus, a DTD customization layer provides a
> machine-readable (and relatively human-readable) specification of
> precisely how your markup model differs from the base one.

I would accept the DTD maintainer's nightmare in exchange for the
author's production speed and quality.  DTDs with huge audiences can
usually break up the audience into smaller domains.

> DTD analysis tools are popular and available enough to help people
> wend their way through forests of parameter entities, to the point
> where DTD readability is much less of a concern (though I still feel
> that readability should be an important *secondary* goal of a raw DTD).

I agree, but readability isn't my problem after the first hour.  My
problem is using a relatively cheap seat and employee to produce a
quality result by constraining and directing their effort via the DTD.

> Finally, in any one "reading" of a parameterized DTD, the markup
> model is stable; in my experience, authors pay no extra price for
> DTDs that are parameterized.  In fact, the ability to rip out the
> elements that are irrelevant to one's own situation has meant that
> DocBook (for one) has become quite a bit *more* usable.

Parameterization is not at issue, although it can contribute to the
problem if one parameterizes before understanding the root-to-leaf
branches or, more simply, the *engine* productions where actual content
is inserted; those must be written so that absurd productions, such as
footnotes in titles, don't occur.  What tends to happen in large DTDs
whose domains are unconstrained is simply too many potential choices.
Then the cost of the employee goes up as the requirement for deep
content knowledge increases.  This is tough to justify in conversion
work and, often, in technical writing where the actual content is
generated upstream but there is no configuration management or database.
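
To make the footnotes-in-titles point concrete, here is a minimal
sketch in 8879 syntax.  Every element and entity name in it is
hypothetical and chosen for illustration; none is taken from DocBook or
TEI.  It assumes a shared inline mix that admits footnotes, with an
exclusion on the title production to rule out the absurd case:

```sgml
<!-- Hypothetical names throughout; an illustration only. -->

<!-- Hook for a customization layer; empty by default. -->
<!ENTITY % local.inline.mix "">

<!-- Shared inline mix used by the content-bearing productions. -->
<!ENTITY % inline.mix "#PCDATA | emphasis | footnote %local.inline.mix;">

<!ELEMENT para     - - (%inline.mix;)*>
<!ELEMENT emphasis - - (%inline.mix;)*>
<!ELEMENT footnote - - (para+)>

<!-- The exclusion keeps footnotes out of titles at any depth, even
     though the shared mix allows them elsewhere. -->
<!ELEMENT title    - - (%inline.mix;)* -(footnote)>
```

A subset for a narrower audience could declare its own, smaller
%inline.mix; ahead of these declarations; since the first declaration
of an entity is the one that counts, the later default is ignored and
the author sees fewer choices.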

> The management of SGML documents and associated tools (including
> DTDs) is still in its infancy.  I believe in using the 8879 techniques
> at my disposal until a portable, functional way of doing the same
> thing arises.  Eventually, the XML user base should be let in on
> these methods as well (whatever they are at the time).

As long as we agree that to focus the production effort, subsets or 
cannibalization might be required, I think we agree.  We probably 
should move this to CTS.

Len Bullard

Received on Tuesday, 15 October 1996 11:58:19 UTC