From: Murray Altheim <altheim@eng.sun.com>
Date: Tue, 23 Nov 1999 15:42:11 -0800
To: Daniel Hiester <alatus@earthlink.net>
CC: www-html@w3.org
Daniel Hiester wrote:
>
> I've been interested in the development of XML / XHTML for quite some time,
> and this thread raises one question for me... may or may not be on-topic
> (sorry if it's off)

I don't think anything to do with HTML on a list entitled 'www-html' is off-topic. But I hope you can pardon my rather wandering reply.

> Does this represent, in the _opinion_ of people in the working group,
> "the death of HTML?"

[While I think many on the HTML WG might agree with what I'll say here, I can only speak for myself. Actually, I'll be speaking on this topic at Markup Technologies '99 in Philadelphia, so I suppose I have a few _opinions_...]

If HTML is to die, it'll die of its own accord, not through any plan of action or negligence on the part of the HTML working group.

Recently on a mailing list it was pointed out that internal subsets aren't "allowed" in HTML, hence the normative reference to SGML should be removed from the specification. I can only laugh. The HTML 4.0 Specification is formally designated as an application of SGML (and, the internal subset argument aside, its markup is syntactically SGML), but I don't need to tell anyone that in usage, HTML is not only far from SGML, it's far from being simply a markup language. It incorporates all manner of kludgey extensions; several different kinds of incompatible scripting (often encapsulated in SGML comments, which are stripped in many true SGML applications); commonly-used features whose application or conformance boundaries have never been formalized in any specification, even where the markup to 'support' them has been (e.g., frames); variant, non-compliant browser implementations; a yet-to-ever-be-implemented stylesheet syntax that keeps growing like mold; inline styling and scripting that dynamically alters the document depending on user agent ability/inability; etc. IOW, HTML's burgeoning "application conventions" and supplemental "features" are choking it to death.

Try turning off JavaScript and images and browsing the web for awhile, especially sites that rely on very proprietary new "XML" features (that aren't XML at all). And while HTML 4.0 is in theory Unicode-based, internationalized and WAI-compliant, in practice (both in applications and in documents) it is almost none of those things. And so it works for some people, not for others. And as the Web moves increasingly into the world of small-device browsers, that fragmentation will only increase. But this is old news to many.

"HTML" documents in theory should be viewable on any browser that implements the specification, but unfortunately HTML 4.0's spec allows for such wide variance, and requires support for CSS (itself an impossibility), that I hardly blame MS and NS for not having compliant browsers. The dream of document interoperability died a long time ago, probably somewhere between HTML 2.0 and 3.2. What we have now is Frankenstein's markup.

> What I mean is, does the effort of the working group to create / promote
> XHTML represent an attempt to bring to an end a winding, twisting history
> of the SGML-based HTML language, and start a brand new era? Is SGML-based
> HTML too limited to continue to grow at a rate comparable with the growth
> of demand being placed upon it?

The short answers: Yes. No. Because XML must be *at the very least* well-formed (simply to pass thru an XML processor), we hope that this level of compliance will set a higher threshold for markup quality that will enable better baseline processing.
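To make that threshold concrete, here's a minimal sketch (the fragment itself is invented for illustration). The first version is the sort of tag soup today's browsers quietly accept; the second is its well-formed equivalent, the minimum an XML processor will pass through:

    <!-- tag soup: unclosed <p> elements, <b> and <i> improperly nested -->
    <p>A <b>bold <i>claim</b></i> about markup
    <p>Another paragraph<br>

    <!-- well-formed: every element closed, properly nested -->
    <p>A <b>bold <i>claim</i></b> about markup</p>
    <p>Another paragraph<br/></p>

An XML processor handed the first fragment reports a fatal error and stops; there is no recovery heuristic to write, and none to disagree over.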
For example, well-formed XML can be fairly faithfully transformed using an XSLT stylesheet into other forms, such as versions altered for use on small devices. And because the biggest problem with current HTML parsers is the enormous amount of error-handling code (most HTML documents are one big error, IMO), the move to XML will decrease parser size dramatically. In XML there is no error-handling code: the processor simply spits the document back out. And before anyone throws a tantrum, remember that this is the error behaviour of most word processors and other applications on encountering an error in a binary data file.
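As a sketch of the sort of transformation I mean (the stylesheet is mine and assumes an XHTML-like vocabulary, so take it as illustration rather than anything normative), an XSLT identity transform plus one suppression rule is enough to strip a well-formed document down for a small device:

    <?xml version="1.0"?>
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <!-- identity rule: copy elements, attributes and text unchanged -->
      <xsl:template match="@*|node()">
        <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
      </xsl:template>

      <!-- suppress what a small device can't use -->
      <xsl:template match="img|script|object"/>

    </xsl:stylesheet>

None of this works on tag soup: the stylesheet only ever sees a document that has already survived the XML processor.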
One of the twisting forces of "nature" that has caused so many problems for HTML is that every company wanting to "innovate" (not just Microsoft and Netscape) has pushed all manner of ideas into HTML, so it long ago lost its uber-ML appeal. I have been an advocate for modularization of HTML for many years (since probably about early 1996, see [MHTML]), but mostly in order to *subset* it, while I see around me the desire to add even more new features. What we need is more simplicity, not more features. The modularization of HTML follows the mold of some already-modularized SGML languages like DocBook and TEI. There was nothing particularly limiting in SGML (XML is just a subset of SGML, and therefore actually less expressive); the limitation was the mindset of fixed markup languages.

Some believe that well-formedness alone or namespaces will solve this problem, whereas I believe (as I'm sure some are tired of hearing) that they're completely unsuitable for the task of creating hybrid-doctype documents, striking the balance between interoperability and openness too far toward the latter. I agree with Frank that the combination of WF markup and an XSL stylesheet will provide interoperable presentation. But beyond this, due to politics, inertia and entrenchment, I'm more and more thinking the W3C incapable of remedying this problem. Their solutions are getting ever more complicated, not less so, to the point where XML Schemas are so complicated that creating even a moderately complicated one (say, one as complicated as HTML, which isn't complicated) is an almost impossible task for non-experts. People made this statement about DTDs; wait until they get a load of schemas. Perhaps we need to be travelling down a different road entirely.

As before, there are those wishing to differentiate themselves in the marketplace by creating new proprietary functionality; this time they're actively involved in getting a stamp of approval from the W3C by participating in W3C working groups. And of course there are those who ignore the W3C when they don't get their way. I've seen companies stamp their feet like a three-year-old.

Call me an idealist (for I surely am one), but I hope that out of this confusion arises the idea that a web of documents readable by all people (in any country, in any economic class, with varying computing and personal abilities, on any device) is a goal we should all strive towards. That the marketplace will lose to the "community" who demand the ability to read documents. That people themselves will stop trying to write clever web pages and concentrate instead on ones that everyone can read. The ability to create many varieties of interoperable markup languages based on a common framework (XML and its family of specs: XLink, XSL, etc.) relies on people abandoning proprietary markup (and in this I include a wide array of non-XML Web "features" such as CSS, JavaScript, the current HTML linking syntax, etc.) and beginning to use truly interoperable markup. A new baseline for interoperability, a new era based on XML, XLink and XSL.

> I guess this is where philosophy meets technicality... sorry if I'm
> off-topic... I'm just very, very curious... I'm pretty sure that I'd agree
> with W3C officials, no matter which stance they take... I'm just all too
> interested in really knowing what their stance is.

Well, if we're going to venture into philosophy, let me do so as well. I find your statement that you'd "agree with W3C officials no matter what stance they take" very curious. Why? They're by no means gods, nor do they have demonstrably more expertise than the membership of the W3C or others in industry who've been working with markup since the 1970s. I realize the propensity of people to look for heroes, and some are often quite willing to bask in that limelight. I was rather perturbed, while listening to a recent public radio interview with a "W3C official", by his complete negligence in mentioning the role the IETF and NCSA played in the development of the Web. You'd almost think the Internet didn't exist prior to the Web. No mention of gopher, of course. But I was not surprised.

One of the problems I see with the W3C is the same as often occurs when any new technology arises: people get the feeling it was invented by one or two people, when in reality it's often an entire scientific community (Darwin, Edison and the Wright brothers come to mind). Innovation rests on the shoulders of what comes before, and the Web is no different. There have been many hundreds of people involved in the evolution of the Web, and many have been active participants in producing specifications, developing applications and trying out new related technologies. As we're seeing with the enormous surge in popularity of Linux, it's not the opinions of a few people that matter, but those of the community at large. Rather than concentrate on the opinions of others (which are often wrong, mine included), think about what you'd like to contribute to the community.

Murray

[MHTML] http://www.altheim.com/specs/mehitabel/

...........................................................................
Murray Altheim, SGML Grease Monkey          <mailto:altheim@eng.sun.com>
Member of Technical Staff, Tools Development & Support
Sun Microsystems, 901 San Antonio Rd., UMPK17-102, Palo Alto, CA 94303-4900

     the honey bee is sad and cross and wicked as a weasel
     and when she perches on you boss she leaves a little measle  -- archy