- From: Michael Sperberg-McQueen <U35395@UICVM.CC.UIC.EDU>
- Date: Wed, 02 Oct 96 14:29:14 CDT
- To: W3C SGML Working Group <w3c-sgml-wg@w3.org>
On Wed, 2 Oct 1996 14:37:19 -0400 Eliot Kimber said:
>At 01:44 PM 10/2/96 -0400, David G. Durand wrote:
>
>>My argument against quoting is that SGML compatibility should not
>>be _more_ important than user utility (and familiarity is a
>>significant component of utility for most busy people who aren't
>>toolsmiths).
>
>David has, I think, crisply defined the key to this issue. Either

Agreed. Except that he concedes too much. Familiarity is important
for getting the toolsmiths on board, too.

>XML is completely compatible with SGML or it isn't. A review of the
>ERB's stated principles shows that SGML compatibility gets higher
>priority over ease of entry.

Yes. Ease of entry, however, is not quite what David mentions, and
not quite the same as user utility and user acceptance. So I don't
think the goals statement actually provides an unambiguous answer to
this question, even if we believe that we need absolutely full
compatibility or nothing. (I for one think almost-compatible is a
lot better than nothing! XML with incompatible RE rules is, for
example, much better than recommending that the world switch to
Word Perfect binary format....)

>Whether this priority is the correct one or whether it should apply
>in this case is still open for debate, but I think it's clear, as
>James pointed out what seems like years ago that the only real
>solution to the RE problem that preserves SGML compatibility is to
>eliminate mixed content, which means quoting data.

I don't think that was quite what James said. Also, I don't think
it's quite true beyond all cavil or doubt. James pointed out that
prohibiting certain things in mixed content (things like
subelements, comments, and PIs) would render the rules trivial. He
also pointed out several other possible approaches which would
preserve strict compatibility, as well as some that would come
close.

Quoting is one approach to trying to make the agonizingly painful
restrictions on mixed content less agonizing and less painful. It
is an extraordinarily clever use of shortref, and my hat is off to
Charles for imagining it. The dean of SGML hackers (in the
non-pejorative sense!) can still show the rest of us a thing or two.
But I expect that it would be the death of XML to require it.

It would be far better, I think, to adopt James's compromise
solution of white-space stripping and RE-merging, which also
achieves SGML compatibility in all non-pathological cases, or to
treat RE as we treat any other white space (significant outside of
angle brackets, a separator inside of angle brackets, -- roughly
speaking), which breaks SGML compatibility in most non-pathological
cases, but is very simple to use and understand, and which would
*never* make a difference in any SGML application I've ever used in
real life. (I.e. all the application code I actually work with in
practice is already written to assume it's got to recover when it's
passed an unwanted RE or two. Full disclosure requires that I admit
I do have some code that does terrible things when confronted with
leading blanks at times it doesn't expect them. But that's not
relevant here, because 8879 doesn't deal with that. James's
compromise solution, to its credit, does.)

SGML already *has* delimiters between markup and data. Do we really
need a second set of delimiters for *white space*, for heaven's
sake? Say it ain't so!

>I certainly agree that quoting data will make authoring *by hand*
>more difficult (I hated it the first time I tried it), and we do
>have to be sensitive to the marketing implications of requiring it,
>but I feel very strongly that the cost of not having SGML
>compatibility in this case is much greater than the cost of
>authoring.

We agree that the tradeoff involves authoring vs. strict
compatibility, but I think it's more than authoring.
The closer we come to making XML documents look familiar to today's
users of SGML and SGML applications, including HTML, the greater the
acceptance of XML not only among users but also among toolsmiths.
(It's for this reason that I think Bill Lindsey's very smart NET
tricks, though like Charles's quoting proposal they are a tour de
force of ingenuity, are not the politically savvy route for XML to
take. I read Scheme books in the evenings, and in the mornings,
with a sigh, write code in C.)

We seem to disagree primarily in assessing the relative costs. It's
an empirical question, but it's going to be hard to test both
possibilities, since if we don't get it right the first time, no one
in their right mind will pay attention to us the second time. In
this case, I think the cost of the quote proposal, like the NET
tricks proposal, is massive resistance (= massive indifference) on
the part of tool makers.

We need a workable solution to the RE-simplicity / 8879-conformance
tradeoff. There are several on the table I could live with; the
white-space-stripping + RE-merger proposal that came out of the ERB
last week seems the best to me, but I'm not dogmatic about it. I
*am* dogmatic on the proposals to forbid comments and PIs inside of
mixed content: I'd rather live with our current situation (full 8879
or nothing) than explain to incredulous users why comments are not
allowed everywhere. And if I'm right about the user reaction to the
quoting proposal, adopting it will mean we do continue to live with
the current situation.

Quoting, in short, seems to me to fail the Stoopid test. And
failing the Stoopid test is not the way to call users and toolsmiths
to the banner of XML as a Better Way for a Better Net.

Michael Sperberg-McQueen

&disclaimer; &serious-disclaimer; &really-serious-disclaimer;
&no-really-i-mean-it-disclaimer;

Received on Wednesday, 2 October 1996 16:16:43 UTC