- From: Michael Sperberg-McQueen <U35395@UICVM.UIC.EDU>
- Date: Wed, 07 May 97 18:23:45 CDT
- To: W3C SGML Working Group <w3c-sgml-wg@w3.org>
On Wed, 7 May 1997 01:48:06 -0400 Paul Prescod said:
>You forgot one tolerant point and I would like you to address it please.
>It falls naturally out of the draconian:
>
>Tim Bray wrote:
>> I think I am speaking fairly for the draconians when I say that from
>> our point of view, it works because
>> - well-formedness is so easy that it isn't a significant burden on
>> anyone,
>
>Well-formedness is such a small step on the way to a useful document
>that it isn't of particular *value* to anyone: so why all the fuss? How
>many applications that will be able to read a well-formed XML document
>and do something useful with it?

Well, it seems to me that the fuss is not entirely irrational. (I was in the minority, but I think the draconians were fairly clear-headed on this point.)

It is very important, for the long-term health of XML, that with regard to error tolerance and error detection we adopt something more like the culture of SGML (there is a spec, and if your document has errors, you better fix them pronto because otherwise your software may break and you will in any case be laughed to scorn and possibly ridden out of the next SGML 'XX conference on a rail) than like the culture of HTML as it has developed (where error recovery is in some cases just another name for buggy software not noticing the errors).

In the case of SGML, the banner of Validity has helped everyone a lot. (It has a down side, too, but on balance I'd say the introduction of the notion of formal validity is a major advance in document processing; one of the ways SGML is a step forward vis-a-vis GML and Scribe and so on.)

Note that most everything Paul says about how little WF buys for us is also true of validity. I can have a perfectly valid SGML document that is ugly as sin, abuses its tags in a way that would make any decent document blush, mixes incompatible values of attributes which ought to be compatible with each other, and points to locations that don't exist, in documents whose file names are mistyped, on nodes that are no longer part of the network and maybe never were. A valid document, that is, is not the same as a correct one.

And yet, we don't regard validity checking as pointless. And so SGML documents *tend* to be cleaner than non-SGML documents, even though many types of dirt are not detected by validation.

Without some minimal standard that works for XML the way validity works for SGML ('works' in this sense of encouraging a certain kind of culture among users and programmers), we are indeed in serious danger of launching another race to the bottom, and replicating the current situation in HTML, where people can say with a straight face that it doesn't matter what's in the HTML spec or DTD, all that counts is whether the major browsers handle a given construct.

Requiring XML processors to go on strike when they encounter ill-formed input is an important symbolic gesture that says "Come to XML, all ye who labor and are tired of dirty data." It draws a line in the sand and delimits a class of data so hopelessly messed up that there is nothing useful to be done with it but issue error messages. Where we draw the line matters, to be sure. But *that* we draw such a line may matter, in the long run, even more. Because it establishes that formal correctness counts, too, not just pretty pictures and how it looks on a 21-inch monitor.

>We can be totally draconian when it comes to well-formedness and the
>Web will be just as messy, nasty a place tomorrow

Here I disagree. If there are 10 flies in the soup today, then insisting on well-formedness may not give us fly-free soup tomorrow. But I think going from ten to five -- or even seven -- is well worth everyone's while. At the very least, it calls everyone's attention to the notions of fly-in-soup, and the notion of screens that keep certain kinds of fly *out* of the soup. And on the whole, I think we can safely say that that is progress.

-C. M. Sperberg-McQueen
Received on Wednesday, 7 May 1997 19:53:49 UTC