- From: Tim Bray <tbray@textuality.com>
- Date: Wed, 06 Nov 1996 12:30:23 -0800
- To: W3C-SGML-WG@w3.org
In a recent series of mail votes and meetings, the ERB has resolved several XML design issues. Under pressure of time, we moved very rapidly and votes may not have been fully and exactly recorded where the sense of the ERB on some issue became quickly obvious. It is possible that ERB members may wish to correct their reported votes. As always, accompanying rationales, where present, have not been reviewed by the ERB and may be subject to correction.

**********

[No item number]

Decided unanimously to change PIC for XML to be '?>'. This will allow a lot of things to fit into PI's that currently can't (most notably some proposed server-side scripting languages).

**********

A.8, B.7 XML will have INCLUDE/IGNORE marked sections in DTD's

Passed, Bray and Paoli dissenting.

**********

A.20' XML will change the COM delimiter from '--' to some other string, to minimize user errors. (Candidates: !!, /*, //, **, ??, ;;, ~~, !?, ?!, (), [], others ...)

Defeated, Sperberg-McQueen voting in favor

**********

A.22 XML will have no CONREF attributes (11.3.3, 7.3, 7.9.4.4).

Passed (no CONREF), Kimber and Maler dissenting

**********

B.9' Should XML require system and public identifiers to be FORMAL (13.5)?

This had actually become a discussion of whether to allow the <url> formulation in front of external identifiers, which must be URL's in XML.

Decided, DeRose, Kimber, Maler, and Sperberg-McQueen dissenting, not to allow the <url> prefix.

**********

C.10 Should XML allow nondeterministic content models (11.2.4.3)?

Voted (Bray, Paoli, and Sharpe dissenting) to retain SGML's restriction in this area.

Rationale: Existing SGML tools, for example the SP family, have this rule wired deeply into their logic, and those who wish to use these tools on XML documents won't be able to if they have non-deterministic content models.

**********

C.14 Should XML allow more than one enumerated type (name-group declared value) to contain the same possible value (11.3.3)?
Voted unanimously to remove SGML's restriction in this area.

Rationale: This is incompatible with 8879, but there is every expectation that WG8 will fix this problem soon; furthermore, making this change is not expected to cause serious inconvenience to existing SGML products, whereas the rule is a very serious inconvenience to users of XML and authors of XML software.

**********

D.2 Should XML provide shorthand ways of summarizing the salient points of a document's DTD?

Discussion: This turned out to be one of the hardest problems the ERB dealt with, and the key issue became that of EMPTY elements. Remember that in a previous decision we had agreed to recommend the <e/> syntax, but accept the 8879 syntax. Here are some of the sticky parts:

 - for a well-formed document to be network-usable, it is important that EMPTY element declarations be available in the document entity so that a browser can be guaranteed of not having to fetch a DTD over the network before starting to parse.
 - requiring all the <!element foo EMPTY> declarations to appear in the internal subset could lead to a situation where documents were valid but not well-formed.
 - Also, it would require those using big existing DTD's to hash them around to make sure these declarations appeared in the internal subset.
 - As the WG pointed out, supporting two forms of EMPTY is bad design.
 - The proposal for a PI that summarized the empty elements worked around a few of these problems, but introduced a new syntax and was a possible source of inconsistencies.
 - Falling back to a position allowing only the <e/> syntax solves all the problems cleanly, but makes it *impossible* for a valid HTML document to be XML - several on the ERB felt this was political suicide.

Bearing all this in mind, the ERB voted, Maler dissenting, that:

 - XML support only the <e/> syntax for EMPTY elements - this means that all the language in the spec about "undistinguished" EMPTY elements can come out.
 - In order to make it possible that a valid HTML document can be a valid XML document, the XML spec will state that XML processors, when they are processing HTML documents, should recognize, in a built-in way, that the elements declared as EMPTY for HTML 3.2 (BR, HR, IMG, etc.) are empty even without syntactic indication. The manner in which an XML processor is to decide whether a document is HTML is not constrained by the spec.

Rationale: For technical reasons, requiring and allowing only <e/> is a big winner. However, many of us, who anticipate an uphill struggle selling XML to web-heads, felt that the marketing advantages of making it possible for HTML documents to be valid, and of being able to say "XML processors can read HTML", were impossible to give up. In opposition, Eve Maler in particular felt it was unconscionable to kowtow to the requirements of one particular DTD. The ERB acknowledged that allowing <BR>, etc., does *not* enable XML to grandfather, on a large scale, the existing inventory of HTML; it simply lets us state that (at least some normalized) HTML documents can be read by XML processors.

**********

D.3 Should XML specify short-hand element declaration keywords (e.g. %ANY-ELEMENT;) for element content in which any element in the DTD is legal (same as ANY, but element content, not mixed content)?

Defeated, Sperberg-McQueen voting in favor.

Cheers, Tim Bray
tbray@textuality.com http://www.textuality.com/ +1-604-488-1167
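
Illustration of the D.2 decision: a minimal Python sketch, under stated assumptions, of how a processor might decide whether a start tag denotes an empty element. The names `HTML32_EMPTY`, `is_empty_tag` and `treat_as_html` are hypothetical and do not come from any spec text. The set holds only the three elements named in the decision (the "etc." is left unfilled), the tag pattern is deliberately simplified, and the caller supplies the HTML flag because the spec does not constrain how a processor decides that a document is HTML.

```python
import re

# Partial, illustrative set: only the HTML 3.2 EMPTY elements named in the
# decision. A real processor would need the full list from the HTML 3.2 DTD.
HTML32_EMPTY = frozenset({"BR", "HR", "IMG"})

# Simplified start-tag pattern (assumption): a name, optional attribute text,
# and an optional "/" immediately before the closing ">".
START_TAG = re.compile(r"<([A-Za-z][A-Za-z0-9.-]*)[^<>]*?(/?)>")


def is_empty_tag(tag: str, treat_as_html: bool) -> bool:
    """Return True if the start tag denotes an empty element.

    The <e/> form is always empty. When the caller has decided the document
    is HTML, the built-in HTML 3.2 names are empty even without the slash.
    """
    match = START_TAG.fullmatch(tag)
    if match is None:
        raise ValueError(f"not a start tag: {tag!r}")
    name, slash = match.group(1), match.group(2)
    if slash == "/":
        return True
    return treat_as_html and name.upper() in HTML32_EMPTY
```

With this sketch, `is_empty_tag("<BR>", treat_as_html=True)` is true, while the same tag with `treat_as_html=False` is not, which mirrors the split between the <e/>-only rule and the built-in HTML exception.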
Received on Wednesday, 6 November 1996 15:30:52 UTC