- From: W. Eliot Kimber <kimber@passage.com>
- Date: Mon, 28 Oct 1996 12:44:31 -0900
- To: w3c-sgml-wg@w3.org
At 10:31 AM 10/28/96 EST, lee@sq.com wrote:
>> I think you're missing Charles' point: that one goal is for XML documents
>> to *also* be SGML documents. To do that, they *must* have at least
>> "<!DOCTYPE typename SYSTEM>" at their start.
>
>They must also have a DTD to be valid SGML documents.
>
>> Note that any *existing* SGML processor can consider this to be valid by
>> defining its algorithm for resolving the omitted SYSTEM identifier to be
>> "parse the document
>
>Please name any existing SGML processor that works this way today.
>If you have to change the code, you're not talking about an
>existing parser.

I didn't say existing processors *do* work this way, I said they *can* work this way. In other words, nothing in 8879 prescribes *how* you get the data that makes up an entity, or how you determine what the actual system identifier is (and therefore what data that system identifier addresses) when you omit the system identifier. Therefore, *it is technically possible* to modify the entity manager of any SGML processor to behave as I've described.

Remember too that one of the goals of XML is to define a syntax that doesn't require *explicit* declaration of element types. Therefore, by definition, you never have to resolve the omitted system identifier in order to parse the document (although you may need it for validation).

In the context of document delivery, explicit document type declarations primarily serve the *information receiver* by letting them check whether the data they get meets whatever requirements they have. A receiver with this requirement can provide an explicit DTD (or architecture) and, if any documents don't validate against it, kick them back. Or they can choose to accept only documents with explicit DTDs. Most of the time, information receivers don't care (e.g., casual Web browsers). But when the information is really data involved in some controlled business process, you do care.

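To make the parse-versus-validate split concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any existing SGML processor: it uses the standard library's `xml.etree.ElementTree` parser as a stand-in for a declaration-free parser, and the `RECEIVER_RULES` table, the element names, and the `receive` function are all hypothetical. The rules only say which child element types each element type may contain; they ignore order and occurrence, so they are far weaker than a real DTD content model.

```python
import xml.etree.ElementTree as ET

# Hypothetical receiver-side rules: element type -> allowed child element types.
# These stand in for "an explicit DTD (or architecture)" supplied by the receiver.
RECEIVER_RULES = {
    "report": {"title", "section"},
    "section": {"title", "para"},
    "title": set(),
    "para": set(),
}

def parse_without_declarations(text):
    """Build an element tree from the markup alone; no declarations are consulted."""
    return ET.fromstring(text)

def violations(root, rules):
    """List the ways the parsed tree fails the receiver's rules (empty if none)."""
    problems = []
    for parent in root.iter():
        if parent.tag not in rules:
            problems.append(f"undeclared element type: {parent.tag}")
            continue
        for child in parent:
            # Undeclared children are reported when they are visited as parents.
            if child.tag in rules and child.tag not in rules[parent.tag]:
                problems.append(f"{child.tag} not allowed in {parent.tag}")
    return problems

def receive(text, rules=None):
    """Parse always; validate only if the receiver supplied rules."""
    root = parse_without_declarations(text)
    if rules is None:
        return "accepted"  # the casual-browser case: the receiver doesn't care
    return "rejected" if violations(root, rules) else "accepted"
```

Calling `receive(doc)` with no rules corresponds to the receiver who doesn't care, while `receive(doc, RECEIVER_RULES)` corresponds to the controlled business process that kicks non-conforming documents back. Parsing succeeds identically in both cases; only the validation step depends on declarations.
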
Note that one thing we haven't yet discussed (as far as I know) is the concept of being able to say "I don't care what the element type declarations for the document are, but I *do* care what architecture it claims to conform to." By putting the schema rules at the architecture level, you can have a system that allows individual documents to be idiosyncratic while still keeping a measure of schema identification and validation.

For example, if on your internet you want to manage "reports" but you don't want to define some all-encompassing DTD for report documents that will meet all the requirements of report writers (with the high maintenance cost that implies), you can define a general "report architecture" from which any report document must be derived. Validation of documents *without explicit document types* can be done *by the receiver* through a combination of defaulting and a simple explicit mapping, done either on elements or through some prolog (such as you can do in SGML with LINK process declarations, not that I'd suggest actually using LINK for XML).

One problem with DTDs as defined in 8879 is *that they don't tell you anything* about a document except how to validate it. Architectures, however, because they represent a stand-alone set of defined semantics and schema rules, *do* tell you something, because the name of an architecture points back to a *fixed* set of definitions (unlike the name of a document type, which only tells you the element type of the document element).

Or, said more simply: the idea in SGML that document types tell you something more useful about documents than how to parse and validate them syntactically is a Big Lie. Once you remove from the syntax those things for which you *must* have explicit element type declarations, you don't need DTDs for parsing, only for validation.

Cheers,

E.
--
W. Eliot Kimber (kimber@passage.com)
Senior SGML Consultant and HyTime Specialist
Passage Systems, Inc., (512)339-1400
10596 N. Tantau Ave., Cupertino, CA 95014-3535 (408) 366-0300, (408) 366-0320 (fax)
2608 Pinewood Terrace, Austin, TX 78757 (512) 339-1400 (fone/fax)
http://www.passage.com (work) http://www.drmacro.com (home)
"If I never had existed, would you still remember me?..."
--Austin Lounge Lizards, "1984 Blues"
Received on Monday, 28 October 1996 13:45:24 UTC