Re: DTD Modularity [Was: Glue RFC]

Daniel W. Connolly (connolly@beach.w3.org)
Wed, 13 Sep 1995 12:12:28 -0400


Message-Id: <199509131612.MAA02257@beach.w3.org>
To: murray.altheim@nttc.edu
Cc: Multiple recipients of list <www-html@w3.org>
Subject: Re: DTD Modularity [Was: Glue RFC] 
In-Reply-To: Your message of "Wed, 13 Sep 1995 10:39:38 EDT."
             <v02110102ac7c982aa839@[192.188.119.193]> 
Date: Wed, 13 Sep 1995 12:12:28 -0400
From: "Daniel W. Connolly" <connolly@beach.w3.org>

In message <v02110102ac7c982aa839@[192.188.119.193]>, Murray Altheim writes:
>>In message <11900.9509121029@afs.mcc.ac.uk>, lilley writes:
>>>Dan Connolly <connolly@beach.w3.org> said:
>>
>>I'd love to see a nice, clean, formally specified mechanism for all future
>>HTML evolutional changes. But I haven't. It's really that simple.
>
>Dan (possibly Joe English??),
>
>Is there some type of feature in SGML that allows for document modularity?
[...]
>If such a software mechanism doesn't exist to support the formal
>specification process, maybe this is the tool we need to be creating.

This is not in the charter of html-wg, so I'm replying on www-html.

SGML has some DTD modularity/re-use mechanisms: parameter entities and
marked sections. They have nearly the same expressive power as the C
preprocessor: a parameter entity works like an argument-less #define
(or like #include, when it names an external file), and a marked
section works like #ifdef. The difference is that #define can take
arguments, and parameter entities cannot.
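
To make the "no arguments" point concrete, here is a toy model of
parameter-entity expansion as plain text substitution. It is only a
sketch in Python, not how a real SGML parser works; the entity names
and replacement texts are hypothetical, and the nesting limit is an
arbitrary guard.

```python
import re

# Toy model: a reference %name; is replaced by the entity's replacement
# text. There is no way to pass an argument, unlike a C function-like macro.
ENTITY_REF = re.compile(r"%([A-Za-z][A-Za-z0-9.-]*);")

def expand(text, entities, depth=0):
    """Recursively substitute %name; references using the entities dict."""
    if depth > 20:  # arbitrary guard against self-referential entities
        raise ValueError("entity nesting too deep")

    def substitute(match):
        name = match.group(1)
        return expand(entities[name], entities, depth + 1)

    return ENTITY_REF.sub(substitute, text)

# Hypothetical entity declarations, loosely in the style of an HTML DTD.
entities = {
    "heading": "H1|H2|H3",
    "block": "P|UL|%heading;",
}

print(expand("(%block;)*", entities))
```

Every reference to %block; expands to the same text everywhere, which
is why variations on a content model end up as separate entities
rather than one entity with a parameter.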

So building a nice, modular DTD is akin to building nice, modular C
programs: it's messy as all hell. (Witness: "I'll put Tk_ before all
global names in my project. I hope there are no collisions." And
that's an example of an *extremely well engineered* C library.) SGML is
worse than C in that (1) there are no local names (well, except maybe
attribute value enumerations) and (2) there's no separation of
"declarations" from "definitions".

I've done much cogitating on the idea of specifying HTML as a straight
context-free grammar in s-expressions:

	(html -> head body)
	(head -> head-start? head-elt* head-end?)
	(head-start? -> )
	(head-start? -> <HEAD>)
	(head-elt* -> )
	(head-elt* -> head-elt* head-elt)
	(head-elt -> title-elt)
	(head-elt -> meta-elt)
	...

and using Scheme/Lisp to (1) condense the grammar down to a manageable
set of declarations by making up some notation, and (2) process the
grammar and build FIRST/FOLLOW tables and such.
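
As a rough idea of what step (2) involves, here is a sketch that
computes the nullable nonterminals and FIRST sets for the productions
listed above. It uses Python in place of Scheme/Lisp, handles FIRST
only (FOLLOW would be a similar fixpoint), and treats any symbol
without a production (body, title-elt, meta-elt) as a terminal. The
two head-end? productions are an assumption made by analogy with
head-start?; they are not in the listing.

```python
# Productions as (lhs, rhs) pairs; an empty rhs is the empty production.
GRAMMAR = [
    ("html", ["head", "body"]),
    ("head", ["head-start?", "head-elt*", "head-end?"]),
    ("head-start?", []),
    ("head-start?", ["<HEAD>"]),
    ("head-elt*", []),
    ("head-elt*", ["head-elt*", "head-elt"]),
    ("head-elt", ["title-elt"]),
    ("head-elt", ["meta-elt"]),
    # Assumed by analogy with head-start?; not in the original listing.
    ("head-end?", []),
    ("head-end?", ["</HEAD>"]),
]

def first_sets(grammar):
    """Return (first, nullable) computed by iterating to a fixpoint."""
    nonterminals = {lhs for lhs, _ in grammar}
    first = {nt: set() for nt in nonterminals}
    nullable = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            before = (len(first[lhs]), lhs in nullable)
            all_nullable = True
            for sym in rhs:
                if sym in nonterminals:
                    first[lhs] |= first[sym]
                    if sym not in nullable:
                        all_nullable = False
                        break
                else:
                    # A terminal starts the rest of this right-hand side.
                    first[lhs].add(sym)
                    all_nullable = False
                    break
            if all_nullable:
                nullable.add(lhs)
            if before != (len(first[lhs]), lhs in nullable):
                changed = True
    return first, nullable

first, nullable = first_sets(GRAMMAR)
for name in sorted(first):
    print(name, sorted(first[name]), "nullable" if name in nullable else "")
```

With these productions, head comes out nullable (all three of its
parts can be empty), so FIRST(html) includes body as well as
everything that can start a head.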

Who knows: we might even end up with an STk-based direct-manipulation
grammar editor. I wouldn't be surprised if such a beast already exists,
and we could just use it.

Dan