Re: KISS (was: Parameter entity references in WF docs) from Martin Pike on 1997-06-03 (w3c-sgml-wg@w3.org from June 1997)

From: Martin Pike <mp@stilo.demon.co.uk>
Date: Tue, 3 Jun 1997 12:12:30 +0000
To: w3c-sgml-wg@w3.org
Message-ID: <865336399.1024029.0@stilo.demon.co.uk>
Having implemented PEs in an SGML parser, I can appreciate Tim's 
feelings about the amount of code required to implement them. 
However, I feel that XML without them would be a much poorer language, 
unable to cope with some of the problems that it will be called upon to solve.

Eve Maler and Terry Allen have written about the ability to build extensible 
and configurable DTDs by using them. Len Bullard stated that he sees the 
future of XML in documents written using small, neat DTDs, with no need for 
PEs. Martin Bryan has argued that there are applications that do 
need large, complex DTDs and that PEs are a godsend in writing and 
maintaining these.

I would like to add that even in relatively simple DTDs I have found PEs 
useful, in much the same way that even in a small computer program 
I use subroutines or methods to split the problem into manageable, 
trackable, reusable chunks.
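
As a rough illustration of the subroutine analogy, the following Python sketch 
expands parameter entity references by plain text substitution. The DTD fragment, 
the entity name "inline" and the element names are hypothetical, and the helper 
expand_parameter_entities is an assumption made for this example. It is not a 
DTD parser: it ignores the rules about where PE references may appear and does 
not guard against undefined or self-referencing entities.

```python
import re

# Hypothetical DTD fragment: one parameter entity shared by two declarations.
DTD = "\n".join([
    '<!ENTITY % inline "#PCDATA | emph | code">',
    '<!ELEMENT para (%inline;)*>',
    '<!ELEMENT title (%inline;)*>',
])

# Matches a simple internal PE declaration and a PE reference, respectively.
ENTITY_DECL = re.compile(r'<!ENTITY\s+%\s+([\w.-]+)\s+"([^"]*)"\s*>\n?')
PE_REF = re.compile(r'%([\w.-]+);')

def expand_parameter_entities(dtd: str) -> str:
    """Naively inline every %name; reference and drop the declarations."""
    entities = dict(ENTITY_DECL.findall(dtd))
    body = ENTITY_DECL.sub('', dtd)
    # Repeat until no references remain, so one entity may refer to another.
    while PE_REF.search(body):
        body = PE_REF.sub(lambda m: entities[m.group(1)], body)
    return body

expanded = expand_parameter_entities(DTD)
print(expanded)
```

The content model is written once in the entity and reused in both element 
declarations, so the expanded text repeats "#PCDATA | emph | code" in each.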

We have all used the argument for markup being important in the re-use of 
data. I feel the same way about PEs allowing the re-use of DTD 
fragments.

I know that the argument will come back that there are few people 
writing DTDs. That is true, but they are usually skilled and costly bodies. If XML 
is going to take off as we hope it will, then it is likely that many 
more DTDs will be required. I have talked to companies that have wanted to 
adopt SGML but have been unwilling because of the cost and difficulty. XML will bring the 
cost of implementation down and, I hope, the difficulty too. These companies, 
I am sure, are going to want their data structured in a defined manner, i.e. via DTDs. 
If common elements of a company's data structure can be shared between different 
DTDs, then the cost of building and maintaining the corporate structure must be less.

As for adopting PEs at a later stage if necessary: they are already out there, being used in 
XML applications now.

A project that benefits greatly from PEs is MathML. This project has 
been mentioned a few times already in this forum.  DTD fragments are being 
created to represent mathematics in markup on the Web. The fragments are 
XML conformant. The removal of PEs will complicate this effort. 
Already the inability to use name groups to declare multiple elements with 
the same content model makes the DTD much larger and more awkward to 
read and maintain than its SGML equivalent. The inability to declare even 
model groups as PEs will make it even more so. 

The reason is that the structural representation of mathematics is highly 
recursive, so each element has a content model that contains all the others. 
Covering all mathematical functions requires declaring hundreds, if not 
thousands, of elements, all with the same content model. If each element also 
had to be added to that content model by hand, instead of through the 
well-structured, comprehensible set of PEs in use at the moment (each 
representing a topic of mathematics), then the DTD would become 
unmanageable and unreadable. This project is not one in which SGML can 
easily be used instead. It is for the Web, under the auspices of W3C.
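
To make the maintenance argument concrete, here is a small Python sketch that 
generates a recursive DTD both ways and counts how many existing declarations 
must be rewritten when one element is added. The topic names, the element 
names and the combined entity "expr" are hypothetical and are not taken from 
the MathML DTD; the helper functions are assumptions made for this example.

```python
# Hypothetical topic groups; the element names are illustrative, not MathML's.
TOPICS = {
    "arith": ["plus", "minus", "times"],
    "trig": ["sin", "cos", "tan"],
}

def dtd_with_pes(topics):
    """One PE per topic, one combined PE, and a short declaration per element."""
    lines = []
    for topic, names in topics.items():
        group = " | ".join(names)
        lines.append(f'<!ENTITY % {topic} "{group}">')
    refs = " | ".join(f"%{topic};" for topic in topics)
    lines.append(f'<!ENTITY % expr "{refs}">')
    for names in topics.values():
        for name in names:
            lines.append(f"<!ELEMENT {name} (%expr;)*>")
    return lines

def dtd_without_pes(topics):
    """Every element spells out the full content model itself."""
    everything = " | ".join(n for names in topics.values() for n in names)
    return [f"<!ELEMENT {n} ({everything})*>"
            for names in topics.values() for n in names]

def lines_touched(old, new):
    """Count existing declarations that had to be rewritten."""
    return sum(1 for line in old if line not in new)

# Add one new element to the "arith" topic and compare the two styles.
extended = {**TOPICS, "arith": TOPICS["arith"] + ["divide"]}
with_pes = lines_touched(dtd_with_pes(TOPICS), dtd_with_pes(extended))
without_pes = lines_touched(dtd_without_pes(TOPICS), dtd_without_pes(extended))
print(with_pes, without_pes)
```

With PEs, only the one topic entity changes; without them, every existing 
element declaration has to be edited, and that number grows with the size of 
the DTD.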

Of course, pre-processors could be used, but as Eve Maler points out, this 
'leaves DTDs as intermediate files that it's perilous to edit directly.'

This diatribe has been concerned with the use of PEs within a DTD 
and its subset. Am I missing something, or is this the only place where they 
need to be used, given the demise of all marked section types other than 
CDATA in the markup? If this is the case, non-validating processors 
will not have to decipher them anyway, will they? And isn't that the crux of 
this argument: that it should be -possible- to build lightweight stand-alone 
processors? As Norbert says, those people who need to use and verify more 
complex DTDs will need more sophisticated tools.



        Martin Pike,     Stilo Technology      
        -------------------------------------
        Email: mp@stilo.com or mp@stilo.demon.co.uk
        Phone/Fax: +44 (0) 1222 483530
        WWW:  http://www.stilo.com/
        -------------------------------------
Received on Tuesday, 3 June 1997 08:16:14 UTC