- From: Paul Prescod <papresco@calum.csclub.uwaterloo.ca>
- Date: Sat, 21 Sep 1996 16:13:47 -0400
- To: w3c-sgml-wg@w3.org
Before we get too far into a discussion of entities, I'd like to take a second to reflect on XML's dual parentage (risking Len's ire =) ).

The SGML Way:

In the SGML world, entity referencing and document parsing are intertwined, though separate. A document cannot be considered "valid" until its entities have been resolved (especially its external text entities!). Entities are declared (and theoretically resolved) in the DTD or DTD subset by the entity manager and used by the parser to create the ESIS. In all but a few SGML systems, all document content reuse is done through this mechanism.

The Web Way:

In the Web world (especially in the HTML world), entities (or "objects") are, conversely, resolved by the application AFTER parsing. So, the parser parses the document and returns the ESIS to the application. The application starts to process it and fetches (perhaps through an entity manager) any objects it needs to complete that task. In reality, "transclusion" or "inclusion" or "fragment inclusion" is just a special case of linking. The standardized mechanism for including HTML content is the same as for including Java or Active-X content: <OBJECT>. In other words, in HTML, document transclusion is just a special case of linking. The parser doesn't know or care about it.

The Heresy:

Do we really need parse-time entities in XML? What do they "buy"? In a networked environment, the decision to resolve entities or not should be left entirely up to the application (not the parser), because only the application knows which entities it "needs", and the cost of resolving entities it does not need is quite high. Furthermore, since the number of entity-resolution failures will be quite high (relatively speaking) when going over the Internet, the application should be able to choose which entities are strictly needed and which may be absent.

In other words, I am proposing that we should not let entities affect the parsing of their containing documents.
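
The split described above can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and does not come from the message or from any specification: the `<include>` element, its `href` and `required` attributes, the `fetch` stand-in for an entity manager, and the `AVAILABLE` table that plays the role of the network. The parser step builds the tree without looking at any referenced object; the application step then decides what to fetch and which failures it can tolerate.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: <include> and its "href"/"required" attributes are
# invented for this sketch; they are not part of any XML or HTML specification.
DOC = """<chapter>
  <title>Entities</title>
  <include href="intro.xml" required="yes"/>
  <include href="footnote.xml" required="no"/>
</chapter>"""

# Stand-in for the network: only one of the two referenced objects is reachable.
AVAILABLE = {"intro.xml": "<para>Introduction text</para>"}


def fetch(href):
    """Pretend entity manager: return the object's text or raise LookupError."""
    try:
        return AVAILABLE[href]
    except KeyError:
        raise LookupError(f"cannot resolve {href}")


def parse(text):
    """Parser step: build the tree without touching any referenced object."""
    return ET.fromstring(text)


def resolve(tree):
    """Application step: fetch references after parsing, tolerating optional failures."""
    resolved, skipped = {}, []
    for ref in tree.iter("include"):
        href = ref.get("href")
        try:
            resolved[href] = fetch(href)
        except LookupError:
            if ref.get("required") == "yes":
                raise  # the application treats this object as strictly needed
            skipped.append(href)  # optional object: carry on without it
    return resolved, skipped


tree = parse(DOC)  # succeeds whether or not the referenced objects exist
resolved, skipped = resolve(tree)
print(len(list(tree)), sorted(resolved), skipped)
```

In this sketch the parent's tree has the same shape whether or not `footnote.xml` can be fetched; only the application-level result differs.
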
Every XML document would be validated without regard to the content or existence of its sub-documents. Further, the content of sub-documents should not affect the parse tree of the parent document in any way. They would all be "opaque" to the containing document at the parser level. At the application level, of course, it might be invalid to transclude a footnote as if it were a chapter, but that's the same as if you hyperlinked to a chapter as if it were a footnote. But HyTime (and XTime =) ) has/will have facilities for specifying those constraints at the "link manager" level.

If we can agree that the parser doesn't need to care about fragments/entities/objects, then we can wait until the spring to talk about them.

Paul Prescod
Received on Saturday, 21 September 1996 16:18:42 UTC