- From: Jon Bosak <bosak@atlantic-83.Eng.Sun.COM>
- Date: Thu, 7 Nov 1996 08:17:00 -0800
- To: W3C-SGML-WG@w3.org
- CC: bosak@atlantic-83.Eng
[Paul Prescod:] | > Paul [Grosso] asked: | > Frankly, this HTML compatibility thing is a total waste of time. | > To be at all HTML comptibile, you have to cope with | > <UL COMPACT> | > <li>first item | > <li>second item | > <b><li>bold item</b></li> | > <hr>did you know hr isn't allowd here?</hr> | > <li><img src=http://bet/you/need/quotes/in/sgml.gif> | > </UL> | > | > Yes, this "works" in Netscape. Say it isn't calid all you like. | > shout until you're blue in the face. But reading this is what | > HTML compatibility is about today. No. That's not what we're trying to accomplish. | I think that the ERB is looking at it from the opposite point of view. They | are trying to define XML so that an HTML-like subset of XML can be | "understood" by existing Web browsers. Right, though the emphasis is really more the other way around. The idea is *not* to grandfather in existing HTML; in fact, the spec bars virtually all existing HTML documents just because of the end-tag requirement. Rather, the idea is to make it *possible* to create new documents that are both valid HTML (according to the HTML 3.2 spec) and valid XML. Such documents would look a lot different from the typical HTML page today, but they would have the advantage of working in both the HTML and XML application environments. So HTML users can run a normalizer on their existing HTML document bases and get documents that will continue to be viewable by current HTML browsers but will also work in new XML browsers. Without making the one extremely limited concession to existing HTML empty elements that came out of our final round of deliberations, this would not be possible. Please note that there is a huge difference between a mechanism for defining "unqualified" empty elements on an ad-hoc basis and a mechanism that simply recognizes a small, closed set of specific strings ("<BR>", "<IMG", etc.). The first approach is not only far more difficult to implement if you're starting from scratch (which is what the implementation requirement for XML is all about), but it allows people to go on forever using a syntax we want desperately to get away from. The second approach recognizes a logistical reality by providing an absolutely minimal bridge across the gap from HTML to SGML without putting in place anything that would encourage a practice that we are trying to eliminate. Note also that this strategy does not discriminate against the existing SGML document base. There are probably as many existing SGML documents that will work unchanged in an XML environment as there are HTML documents. My Shakespeare and Religious Works collections are valid XML just as they stand; it would be hard to point to existing HTML collections of similar size that could make that claim. Jon
Received on Thursday, 7 November 1996 11:19:03 UTC