- From: John Boyer <boyerj@ca.ibm.com>
- Date: Tue, 2 Oct 2007 08:22:35 -0700
- To: xml-editor@w3.org
- Message-ID: <OFE07CB2CB.98465F0A-ON88257368.00507660-88257368.005477F2@ca.ibm.com>
Dear Editors,

The syntax for extParsedEnt should allow a prelude to "content" that permits the declaration of entities using a subset of the DTD notation.

An XML document A.xml may declare an entity 'b' and associate it with the SYSTEM literal B.xml. A.xml may then use the entity reference &b; to include the content of B.xml. Similarly, B.xml may use the entity reference &c; to include the content of C.xml. However, B.xml is unable to *declare* the ENTITY 'c' and associate it with C.xml because the syntax rule for extParsedEnt is just "TextDecl? content". As a result, anything that one may want to include by entity reference in B.xml or its descendants must be declared in A.xml, which hobbles the componentization of the XML.

In practical terms, this means that if one wants to write a 'book' that declares entities for chapters and includes those chapters by entity reference, the chapters are unable to declare and include their sections, and the sections are unable to declare and include their subsections.

Perhaps the simplest syntactic change that would introduce no significant problems would be to introduce a modified TextDecl for use in the extParsedEnt production, like this:

    TextDeclParsedEnt ::= '<?xml' VersionInfo? (EncodingDecl|S) (EntityDecl | DeclSep)* S? '?>'
    extParsedEnt ::= TextDeclParsedEnt? content

One upside to this approach is that there would be no confusion between the whitespace separating entity declarations and the character content of the entity. Another upside is that it does not disturb the definition of TextDecl, which is also used by extSubset. The only downside is that one must create the TextDeclParsedEnt in order to declare entities. However, note that EncodingDecl is made optional in TextDeclParsedEnt, so that only the leading <?xml and whitespace must be written before making entity declarations.

In hindsight, the following additional observations might be made.
First, instead of the leading '<?xml', a leading and required declaration of <?xmlentity might have been useful because of the differences between an external entity and a well-formed document. Tools are having trouble deciding which well-formedness constraints (WFCs) to impose on a file containing XML.

Or, put another way, although the current design is useful because it allows arbitrary content, not just a well-formed XML document, to be included by entity reference, the downside is that it is not possible to include a well-formed XML document into another document using a simple entity declaration and reference. This means it is not easy to create an aggregation of well-formed XML created by others. In a Web 2.0 world, that hurts.

Thanks for listening!

John M. Boyer, Ph.D.
STSM: Lotus Forms Architect and Researcher
Chair, W3C Forms Working Group
Workplace, Portal and Collaboration Software
IBM Victoria Software Lab
E-Mail: boyerj@ca.ibm.com
Blog: http://www.ibm.com/developerworks/blogs/page/JohnBoyer
Received on Tuesday, 2 October 2007 15:22:56 UTC
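
The A.xml / B.xml / C.xml scenario described in the message can be tried out with a short sketch. It uses Python's standard xml.parsers.expat module; the in-memory FILES table, the element names (book, chapter, section), and the helper element_names are illustrative assumptions, not part of the original message. The first table keeps both declarations in A.xml, as the current extParsedEnt rule requires. The second moves the declaration of 'c' into B.xml via a DOCTYPE, which is only one way a chapter author might attempt it, and is expected to be rejected.

```python
import xml.parsers.expat

# Hypothetical in-memory stand-ins for the files named in the message.
# Both 'b' and 'c' are declared in A.xml, as the current grammar requires.
FILES = {
    "A.xml": b'<!DOCTYPE book [\n'
             b'  <!ENTITY b SYSTEM "B.xml">\n'
             b'  <!ENTITY c SYSTEM "C.xml">\n'
             b']>\n'
             b'<book>&b;</book>',
    "B.xml": b'<chapter>&c;</chapter>',
    "C.xml": b'<section>text</section>',
}

# Variant in which B.xml tries to declare 'c' itself (assumed attempt).
FILES_LOCAL_DECL = {
    "A.xml": b'<!DOCTYPE book [<!ENTITY b SYSTEM "B.xml">]>\n'
             b'<book>&b;</book>',
    "B.xml": b'<!DOCTYPE chapter [<!ENTITY c SYSTEM "C.xml">]>\n'
             b'<chapter>&c;</chapter>',
    "C.xml": b'<section>text</section>',
}


def element_names(files, root="A.xml"):
    """Parse `root`, following external entity references through the
    `files` table, and return element names in document order."""
    names = []

    def attach(parser):
        def on_external_entity(context, base, system_id, public_id):
            # Parse the referenced entity with a child parser that shares
            # the declarations seen so far.
            child = parser.ExternalEntityParserCreate(context)
            attach(child)
            child.Parse(files[system_id], True)
            return 1

        parser.StartElementHandler = lambda name, attrs: names.append(name)
        parser.ExternalEntityRefHandler = on_external_entity

    top = xml.parsers.expat.ParserCreate()
    attach(top)
    top.Parse(files[root], True)
    return names


if __name__ == "__main__":
    print(element_names(FILES))
    try:
        element_names(FILES_LOCAL_DECL)
    except xml.parsers.expat.ExpatError as err:
        print("B.xml with its own declaration was rejected:", err)
```

The sketch only demonstrates the limitation; it does not implement the proposed TextDeclParsedEnt production, which no existing parser is assumed to support.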