- From: Michael Sperberg-McQueen <U35395@UICVM.UIC.EDU>
- Date: Sat, 26 Apr 97 18:15:25 CDT
- To: W3C SGML Working Group <w3c-sgml-wg@w3.org>
Summary: we cannot in practice require that XML processors ignore or discard data following the first detected error; as a result we should not try to do so, even if doing so were a good idea (which it is not). Tim has suggested, and a number of people have supported the idea, that after detecting a violation of a well-formedness constraint an XML processor be required to stop sending information to the downstream application. A number of people have already argued against this idea, using arguments I agree with and won't repeat. Here I just want to point out that (with a single exception) neither Tim nor anyone else has made any argument that, even if taken at face value, would lead to the conclusion that this is a good idea. The sole exception is Tim's rhetorical question "can any application hope to do anything useful with ill-formed data?" to which the only realistic answer is 'Yes, of course, many applications do hope to do useful things with ill-formed data, and some of them are right.' James gave this answer, and no one has attempted any refutation, so I won't say any more about it. Otherwise, the arguments of the Draconian camp are all centered around the unquestioned observations that - there are applications where ill-formed data is useless or worse than useless, and where ill-formedness must be detected - by their unwillingness to issue error messages, and their determination to provide attractive displays even of badly ill-formed documents, HTML browser makers have made their own lives very difficult Neither of these observations supports a blanket ban on error recovery by XML processors. Tim and others have, in the meantime, conceded that some applications can usefully attempt error recovery, and hope to salvage the Draconian Rule by suggesting that such applications should use programs which aren't 'XML procesors' in the strict sense. This amounts to saying "implementors can pick and choose which parts of XML to implement, and can keep themselves blameless even when flouting basic requirements of the spec, if only they call themselves XML Handlers or some other name instead of 'XML processors'". I cannot think of a worse approach to the problem of ensuring uniform error reporting by XML software. Whether it's possible to prevent vendors from attempting to compete on the basis of the quality of their error recovery, I don't know. I doubt it. I also don't see why it's necessary to prevent it. It's not the error recovery in HTML browsers that has led HTML to its current state, but the *silence* of that error recovery. We complain that most authors validate by looking at the document onscreen -- what else do we want? I do that myself, in SGML. Yes, I do check the return code from the parser, but I also check to see that everything looks all right -- if it doesn't, the validity of the document is deceptively hiding errors in the tagging or in the style sheet. The only thing wrong with checking by visual inspection is that in most HTML browsers it's not a sufficient check. An author who does want to find errors can't do so with the software at hand, because the browser won't report them. So I agree with whoever it was who said that the real problem is the absence of an error-reporting mode in HTML browsers. If this is true, then what we need to do is to ensure that XML processors *always* allow the user to request error reports, even if the software recovers from the errors in question. That way, the user who says "program X displays my data all right, why don't you?" can be told "look, even program X says your document is ill-formed: look at it with error-checking turned on!" As it happens, the xml-lang spec already requires this. I don't think it can realistically or usefully require more, except perhaps that it should also explicitly require that the processor notify any down-stream app, as well as the 'end' user if any. I don't think it should require less. If we want the culture of XML usage to differ from that of HTML, we need to ensure that implementors pay attention to the requirement that they report reportable errors unless the user says not to. We can do that by complaining unmercifully about any implementation that doesn't provide error reporting, and by pointing out -- correctly -- that it's not a conforming implementation of XML. -C. M. Sperberg-McQueen
Received on Monday, 28 April 1997 05:05:07 UTC