- From: Dave J Woolley <DJW@bts.co.uk>
- Date: Wed, 26 Jul 2000 19:52:16 +0100
- To: www-html@w3.org
> From: Jan Roland Eriksson [SMTP:jrexon@newsguy.com]
>
> "The HTML 2.0 specification ([RFC1866]) observes that many
> HTML 2.0 user agents assume that a document that does not
> begin with a document type declaration refers to the
> HTML 2.0 specification. As experience shows that this is a
> poor assumption, the current specification does not recommend
> this behavior."

[DJW:] That's just a statement of the de facto situation: very few
documents have valid doctypes; many have none, and many from a year or
two ago have one that equates to HTML 2.0 but are authored to something
like HTML 4 Transitional.

I even looked at the web sites of the editors of a recent W3C document,
and that of one of their employers++. The latter had an incorrectly
capitalised HTML 4.0 (Strict) doctype, but was actually authored in
invalid HTML 4.0 Transitional. One of the former had the doctype after
the head section, and another failed to honour its (XHTML) doctype.
(Example 1 below shows a correctly capitalised, correctly placed
doctype.)

> And there's no problem what so ever to design an excellent stylesheet
> suggestion, using contextual selectors, for a strict HTML2 doc.

[DJW:] As presentation is outside the scope of HTML 2 and LINK is open
ended, I have no qualms about adding an external style sheet to HTML 2
documents! (See example 2 below.)

> Don't use "doctype-sniffing" for the wrong purpose, doing that
> will only create a new set of problems that we need to discuss
> again some years from now.

[DJW:] I can't think of anywhere where a conforming HTML 4 parser would
mis-parse a conforming HTML 2 document, although I can think of one
case (radio buttons, example 3 below) where there could be a
significant semantic difference between it and an HTML 3.2 document.
Especially given that popular authoring tools mislabelled HTML 4 as
HTML 2, I'd think it naive to expect content to be correctly labelled
when this mattered, or browsers to care about backward compatible
behaviour.

The problem comes with non-conforming documents authored for HTML 2
etc.; things like comment syntax have been enforced more strictly in
later versions (Lynx has two different broken comment parsing modes!)
and tag soup structures make less sense. However, I think that is a
commercial issue for browser writers (who encouraged the problems in
the first place).

It would probably be much better for a browser to use heuristics to
detect the need for "tag soup" parsing and broken comment rules, either
after detecting an error on a strict parse, or, in spite of a good
parse, because of, for example, multi-line comments containing apparent
tags (example 4 below), entities preceding = signs in hrefs (example 5
below), etc. Whilst I don't particularly like the idea of browsers
applying such rules after a good parse, and suggest it should be
possible to disable them, I think they will be necessary for a long
time.

I don't think it is the job of standards to make rulings on this,
because that will just discredit the standards when they are not
implemented, but I think the standards documents should point out
common abuses that might need error recovery, and should advise that
browsers indicate on the status bar, or equivalent, that error recovery
had to be used, so that users become more aware of bad HTML.

++ I think it is fairly well known that the people in companies that
get involved with standards often have little control over the
marketing people. (Actually, none of the company sites of recent
contributors and some major W3C members pass the W3C validator, mine
included; my home page does, as does W3C's.)
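Some concrete examples, for reference (all file names, titles and URLs
in them are made up unless stated otherwise):

Example 1. A sketch of a correctly labelled HTML 4.0 Strict document:
the doctype precedes everything else, and the public and system
identifiers are copied exactly, including case, from the HTML 4.0
specification.

   <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN"
       "http://www.w3.org/TR/REC-html40/strict.dtd">
   <HTML>
   <HEAD>
   <TITLE>Minimal strict document</TITLE>
   </HEAD>
   <BODY>
   <P>...</P>
   </BODY>
   </HTML>

Get the identifier string wrong, or put the declaration after the head,
and a validator is no longer checking the document against the DTD the
author intended.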
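Example 2. What I mean by adding a style sheet to an HTML 2 document:
LINK is already in the HTML 2.0 DTD with open-ended REL values, so a
head like the one below stays valid HTML 2.0, and a CSS-aware browser
can fetch the sheet (the REL value "stylesheet" is the one later
standardised in HTML 4; the file name is invented).

   <HEAD>
   <TITLE>An HTML 2.0 document</TITLE>
   <LINK REL="stylesheet" HREF="house.css">
   </HEAD>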
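Example 3. The radio button difference. RFC 1866 requires that, when no
button in a set carries CHECKED, the user agent initially checks the
first one; HTML 4.0 explicitly leaves the initial state undefined in
that case. So for this form (the action URL is invented):

   <FORM ACTION="/cgi-bin/vote" METHOD="POST">
   <P>
   <INPUT TYPE="radio" NAME="answer" VALUE="yes"> Yes
   <INPUT TYPE="radio" NAME="answer" VALUE="no"> No
   </P>
   </FORM>

an HTML 2 user agent must submit answer=yes if the user touches
nothing, while an HTML 4 user agent may submit no value at all for the
field.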
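Example 4. The comment trap. In SGML, a comment declaration may hold
several "--"-delimited comment pairs, so the following is a single
valid comment declaration even though it spans lines and appears to
contain markup:

   <!-- hidden while we redesign --
     -- <P>This paragraph is still inside the comment.</P> --
     -- end of the comment declaration -->

A strict parser hides all three lines; a browser that ends the comment
at the first ">" (one of Lynx's compatibility modes behaves roughly
this way) renders the middle line. Apparent tags inside a multi-line
comment are therefore a reasonable trigger for "broken comment" error
recovery.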
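Example 5. Entities before "=" signs in hrefs. SGML allows the ";"
after an entity reference to be omitted when the next character cannot
be part of the entity name, and "sect" is the Latin-1 entity name for
the section sign. A strictly conforming parser is therefore entitled
to read

   <A HREF="chapter2.html?volume=1&sect=2">section 2</A>

as a link to "chapter2.html?volume=1" followed by a section sign and
"=2", although the author almost certainly wanted a literal ampersand
and should have written "&amp;sect=2". A run of such near-miss entities
in URLs is another good hint that tag soup recovery is wanted.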
--
--------------------------- DISCLAIMER ---------------------------------
Any views expressed in this message are those of the individual sender,
except where the sender specifically states them to be the views of BTS.
Received on Wednesday, 26 July 2000 14:52:31 UTC