- From: Martin Pike <mp@stilo.demon.co.uk>
- Date: Tue, 3 Jun 1997 12:12:30 +0000
- To: w3c-sgml-wg@w3.org
Having implemented PEs in an SGML parser I can appreciate Tim's feeling about the amount of code that is required to implement them. However, I feel that XML without them would be a much poorer language, unable to cope with some of the problems that it will be called upon to solve.

Eve Maler and Terry Allen have written about the ability to build extensible and configurable DTDs by using them. Len Bullard stated that they see the future of XML in documents written using small neat DTDs, with no need for PEs. Martin Bryan has argued that there are applications that do need large, complex DTDs and that PEs are a godsend in writing and maintaining these. I would like to add that even in relatively simple DTDs I have found PEs useful, much in the same way that even in a small computer program I use subroutines or methods to split the problem into manageable, trackable, reusable chunks. We have all used the argument that markup is important in the re-use of data. I feel the same way about PEs allowing the re-use of DTD fragments.

I know that the argument will come back that there are few people writing DTDs. There are, but they are usually skilled and costly bodies. If XML is going to take off as we hope it will, then it is likely that many more DTDs will be required. I have talked to companies that have wanted to adopt SGML but have been unwilling because of the cost and difficulty. XML will bring the cost of implementation down and, hopefully, the difficulty. These companies, I am sure, are going to want their data structured in a defined manner, i.e. via DTDs. If common elements of a company's data structure can be shared between different DTDs then the cost of building the corporate structure and maintaining it must be less.

As to adopting PEs at a later stage if necessary, they are out there being used in XML applications now.

A project that benefits greatly from PEs is MathML. This project has been mentioned a few times already in this forum.
DTD fragments are being created to represent mathematics in markup on the Web. The fragments are XML conformant. The removal of PEs will complicate this effort. Already the inability to use name groups to declare multiple elements with the same content model makes the DTD much larger and more awkward to read and maintain than its SGML equivalent. The inability to declare even model groups as PEs will make it even more so.

The reason is that the representation of mathematics in a structural manner is highly recursive, so each element has a content model that contains all the others. To cover all mathematical functions necessitates the declaration of hundreds, if not thousands, of elements - all with the same content model. If each element also had to be added to the content model by hand, instead of through the well-structured, comprehensible set of PEs that is used at the moment, each representing a topic of mathematics, then the DTD would become unmanageable and unreadable.

This project is not one in which SGML can easily be used instead. It is for the Web, under the auspices of W3C. Of course pre-processors could be used, but as Eve Maler points out this 'leaves DTDs as intermediate files that it's perilous to edit directly.'

This diatribe has been concerned with the use of PEs within a DTD and its subset. Am I missing something, or is this the only place where they need to be used, given the demise of all marked section types other than CDATA in the markup? If this is the case, processors that are non-validating will not have to decipher them anyway, will they? And isn't that the crux of this argument: that it should be -possible- to build lightweight stand-alone processors? As Norbert says, those people who need to use and verify more complex DTDs will need more sophisticated tools.
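
To make the MathML point above concrete, here is a minimal sketch of the pattern I mean. The entity and element names are invented for illustration and are not taken from the actual MathML fragments; it also assumes an external DTD in which PE references are allowed inside declarations.

```
<!-- Hypothetical names throughout; not the real MathML DTD. -->

<!-- One PE per topic of mathematics. -->
<!ENTITY % arith.ops "plus | times">
<!ENTITY % trig.ops  "sin | cos">
<!ENTITY % leaves    "ident | number">

<!-- The shared, recursive content model is assembled from the topic PEs. -->
<!ENTITY % math.content "(%arith.ops; | %trig.ops; | %leaves;)*">

<!-- Without name groups each element needs its own declaration,
     but every declaration reuses the same model. -->
<!ELEMENT plus  %math.content;>
<!ELEMENT times %math.content;>
<!ELEMENT sin   %math.content;>
<!ELEMENT cos   %math.content;>

<!ELEMENT ident  (#PCDATA)>
<!ELEMENT number (#PCDATA)>
```

Adding a new function to a topic then means editing one PE and adding one element declaration. Without PEs the new name would have to be added to the content model of every existing element.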
Martin Pike, Stilo Technology
-------------------------------------
Email: mp@stilo.com or mp@stilo.demon.co.uk
Phone/Fax: +44 (0) 1222 483530
WWW: http://www.stilo.com/
-------------------------------------
Received on Tuesday, 3 June 1997 08:16:14 UTC