- From: Tim Bray <tbray@textuality.com>
- Date: Wed, 07 May 1997 11:32:50 -0700
- To: w3c-sgml-wg@w3.org
The ERB met on May 7th. All members were present in person or by proxy. The chief subject under discussion was error handling; I have been asked to report on the discussion and results.

The arguments on both sides have been exhaustively covered, and I won't repeat them. There were, however, a few new issues that came up in the course of the meeting.

1. WF-ness may not be as easy to check as I have been claiming - getting the grammar right for a complex ATTLIST inside an INCLUDed marked section is nontrivial.

2. We have a strong political reality to deal with here in that for the first time, the big browser manufacturers have noticed XML and have together made a strong request: that error-handling be completely deterministic, and that browsers not compete on the basis of excellence in handling mangled documents. It was observed that if they wanted to do this, they could just do it; but then pointed out that this is exactly why standards exist - to codify the desired practices shared between competitors. In any case, if we want XML to succeed on the Web, it will be difficult to throw the first serious request from M & N back in their face.

3. In fact, everyone on the ERB substantially agrees with M&N's goal, in that we do not, ever, want an XML user-agent to encounter a WF error and proceed as though everything were OK. Our disagreements centre on how to use the spec machinery to achieve this.

4. We're not worried that XML editors will silently recover from errors, because they exist precisely to create and manipulate correct content and to fix incorrect content. XML processors that are "read-only" are the things that have the problem, because users have no incentive to prefer error-free documents.

5. We considered an alternative proposal, which makes two major changes to the XML spec by defining the concept of an XML-conformant application, and the concept of a human user.
   This proposal would require an XML-conformant application, when confronted with a WF error, to refuse to proceed until a human user had been notified of the error and explicitly authorized error recovery. After some discussion, this proposal failed to win majority support - concerns included
   - the radical changes to the spec
   - the fact that much parsing code is operating in multithreaded mode at a very low level, and it may not be tractable to have to check for the presence of a human
   - this seems to compromise a design goal of XML, that processors be lightweight and easy to send across the Net, because they will all start to carry around user-interaction and error-recovery code for competitive reasons
   - it is not clear that the modal-approval model is achievable across the range of user interfaces where XML will likely be deployed

   However, this proposal did get serious consideration, and quite likely would have attracted significant numbers of votes from the Tolerants in the crowd.

6. If it turns out that there are common classes of WF errors that are bedeviling users, we should be willing to fix the language to address the problem.

7. There are some detailed operational concerns about the draconian model. First, it allows processors to feed parsed info to the app up to the point of error; but is this required, i.e. can a processor refuse to cough up a single byte because the doc is non-WF? Second, it is important that the processor be able to feed the app raw un-parsed text to aid in error repair - given that the processor knows where he is in the entity tree, it's much easier for the processor to do this than the app - and this should probably include portions of the doc *before* the error.

8. It was pointed out that if we adopt the draconian policy, and then at some later point decide that error recovery should be allowed in some or all circumstances, we can relax it. The reverse is not perceived to be true.
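
The operational questions in item 7 can be made concrete with a small sketch. This is only an illustration under stated assumptions, not something the ERB discussed: it uses Python's standard-library `xml.parsers.expat` module as a stand-in for "the processor", and a hypothetical helper, `parse_draconian`, as the boundary to "the app". It passes along parsed events up to the first WF error, then stops and hands back the error position together with the raw text, including the portion before the error.

```python
import xml.parsers.expat


def parse_draconian(text):
    """Parse `text`; return (events, error, raw_text).

    Hypothetical helper for illustration only. `events` holds whatever was
    parsed before the first well-formedness error, `error` is None or a
    (line, column, message) tuple, and `raw_text` is the unparsed document,
    offered only when an error occurred.
    """
    events = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: events.append(("start", name))
    parser.EndElementHandler = lambda name: events.append(("end", name))
    parser.CharacterDataHandler = lambda data: events.append(("text", data))
    try:
        parser.Parse(text, True)  # True marks this as the final chunk of input
    except xml.parsers.expat.ExpatError as err:
        # Report the error and stop passing parsed data; the raw text is
        # returned so the application can attempt its own repair.
        return events, (err.lineno, err.offset, str(err)), text
    return events, None, None


# Usage: the second <p> is closed by a mismatched </q>.
events, error, raw = parse_draconian("<doc><p>ok</p><p>broken</q></doc>")
print(events)  # events delivered before the error
print(error)   # position and message of the WF violation
```

Whether a processor is obliged to deliver the events before the error, or may withhold everything, is exactly the open question raised in item 7; this sketch simply picks the first option.
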
So after all this, the vote:

The question is [note special terms 'must', and 'may']:

1. The XML-lang spec should be modified (probably in the conformance section) to state that for well-formed documents, an XML processor must make available to the application, at a minimum, the character data extracted from among the markup in the document, and a description of the logical document structure expressed by the markup.

2. The XML-lang spec should be modified to state:

   When an XML processor encounters a violation of a well-formedness constraint, it must report this error to the application. It may continue processing the data to search for further errors, and report such errors to the application. In order to support correction of errors, it may make the unprocessed text from the document, with intermingled character data and markup, available to the application. Once such a violation is detected, however, the processor must not continue the process, described in [ref. to language in point 1], of passing character data extracted from markup, and description of the logical document structure expressed by the markup, to the application.

Yes: Bosak, Bray, DeRose, Magliery, Maler, Paoli, Wood*
No: Clark, Hollander, Kimber, Sperberg-McQueen

(* Lauren Wood was substituting for Peter Sharpe, with the approval of the Chair and ERB)

On a related point, the ERB agreed to put some application notes in the spec covering the points raised in items 4 and 7 above.

Cheers, Tim Bray
tbray@textuality.com http://www.textuality.com/ +1-604-708-9592
Received on Wednesday, 7 May 1997 14:34:34 UTC