- From: Michael Sperberg-McQueen <U35395@UICVM.CC.UIC.EDU>
- Date: Fri, 13 Sep 96 14:12:46 CDT
- To: W3C SGML Working Group <w3c-sgml-wg@w3.org>
If I understand them correctly, it seems that Paul Grosso, Len Bullard, and Robin Cover have all raised the same concern, by pointing to situations in which it's necessary to have a DTD, whether for authoring, for contractual purposes, or for other applications.

Let's distinguish four cases:

(a) DTDs Required for Parsing

Declarations are always required, and in practice all applications must process them on each run, since it's impossible to parse the document correctly without knowing which elements are EMPTY, which are CDATA, etc. (It's also possible to cache the essential information in some other form, but that's just a standard store-for-compute tradeoff.) This, roughly, is the situation with 8879:1986.

(b) DTDs Required for Validation

Declarations are always required, but the language is so constructed that the document can at least be parsed correctly* without reading the DTD. Declarations need not always be read. I.e. it's possible to do some kinds of useful work even without reading all the declarations. This, roughly, is what SGML would be like if the ETAGC proposal is adopted (at least for Minimal SGML documents without references to external entities).

* (within some limits -- element boundaries and content will be correctly identified; some non-significant white space may be preserved unnecessarily)

(c) DTDs Optional

Declarations are always allowed, but not always required; the system makes certain default assumptions if no declarations are provided. This is the approach taken by PSGML. (Or C, if a programming-language analogy is useful. In programming, I don't find this helpful at all, but Tim and others have suggested plausibly that it may be useful in XML.)

(d) DTDs Forbidden

Declarations are never allowed; the system makes certain assumptions about things, and your usage had better agree. I don't know of any serious markup languages that do this; the only analogy I can think of is Basic, in the form that requires names of string variables to end in $ and so on.

----

Cover, Grosso, and Bullard seem to be arguing against (d), but I'm not sure whom they are arguing against. My own reading of the goals document is that (b) or (c) should apply -- or at least, that (a) is not what's wanted here. I don't think anyone is actually in favor of (d), and if the current phrasing of the goals statement gives readers that impression, then it needs to change. Can someone who finds the current phrasing confusing suggest a less confusing alternative?

It does, on the other hand, seem to me that we haven't got a clear consensus on the choice between (b) and (c). Myself, I lean toward (c), because it makes startup for light-weight applications or experimentation a bit easier. Applications which need declarations will always be able to use them -- and XML processors (like good C compilers) can provide an option to warn, or treat as errors, any use of undeclared element types.

Note that choosing (c) is not the same as saying validating editors are impossible, as Paul Grosso seems to suggest -- any more than C compilers can't do syntax checking, just because variables can be declared implicitly. A validating editor just doesn't belong to the class of applications for which declarations are inessential. When presented with an XML document which has no DTD, it might (a) warn about missing declarations, (b) silently assume <!ELEMENT foo - - ANY>, or (c) something else entirely.

-C. M. Sperberg-McQueen
 ACH / ACL / ALLC Text Encoding Initiative
 University of Illinois at Chicago
 tei@uic.edu
Received on Friday, 13 September 1996 15:51:36 UTC