- From: Ian Hickson <ian@hixie.ch>
- Date: Fri, 24 Feb 2006 22:57:33 +0000 (UTC)
On Wed, 6 Apr 2005, Olav Junker Kjær wrote:
>
> An innocent question (no flamewar intended): What is the benefit of
> having HTML defined as an application of SGML?

You could use SGML tools with it, including well-established validator
tools; the parsing model (for compliant documents) is very clear; SGML
has a lot of abbreviation syntaxes that make it quick to write markup;
and it means we're not reinventing the wheel.

Unfortunately, in practice, nobody uses SGML tools; validators are
unable to catch a number of important (computer-checkable) conformance
problems; the parsing model doesn't handle non-compliant documents, and
the majority of documents are non-compliant; the abbreviation syntaxes
are extremely complicated, largely unimplemented, and incompatible with
existing content; and the wheel was already reinvented.

On Wed, 6 Apr 2005, Olav Junker Kjær wrote:
>
> The problem is that validators use the term "valid" in a very limited
> sense, but web authors without a thorough understanding of
> DTD-validation would naturally assume that "valid" would mean "valid
> according to the spec".

Indeed; the term "valid" in an XML/SGML context is used to mean a
specific subset of "conformant", but most users don't know this and
assume it means "fully conformant". I've tried to work around this in
the spec.

On Wed, 6 Apr 2005, Olav Junker Kjær wrote:
>
> There are three types of conformance criteria:
> (1) Criteria that can be expressed in a DTD
> (2) Criteria that cannot be expressed by a DTD, but can still be
>     checked by a machine.
> (3) Criteria that can only be checked by a human.
>
> A conformance checker must check (1) and (2). A simple validator which
> only checks (1) is therefore not conformant.

I've put this in the spec, I hope that's ok.

On Thu, 7 Apr 2005, Olav Junker Kjær wrote:
>
> A DTD or schema in the spec would be redundant anyway, since it would
> only echo what is described in prose.

Indeed.

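The three-way split of conformance criteria quoted earlier in this message can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not anything taken from the spec: the `Element` class, the `ALLOWED_ATTRS` table, the function names, and the choice of a `YYYY-MM-DD` date value as the type (2) example. Type (3) criteria are left out because they need a human.

```python
import re
from dataclasses import dataclass, field
from datetime import date

# Hypothetical, simplified element model; not taken from any spec.
@dataclass
class Element:
    name: str
    attrs: dict = field(default_factory=dict)

# Type (1): the kind of rule a DTD can express, e.g. which attribute
# names an element may carry. This table is illustrative only.
ALLOWED_ATTRS = {"input": {"type", "name", "value", "src", "checked"}}

# Type (2): machine-checkable, but beyond a DTD, e.g. the syntax of a
# value that depends on another attribute. The format is an assumption.
DATE_RE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

def is_valid_date(value):
    """Return True if value is a real calendar date written YYYY-MM-DD."""
    m = DATE_RE.fullmatch(value)
    if not m:
        return False
    try:
        date(int(m[1]), int(m[2]), int(m[3]))
    except ValueError:
        return False
    return True

def check_type1(el):
    """DTD-style check: flag attribute names that are not declared."""
    allowed = ALLOWED_ATTRS.get(el.name, set())
    return [f"unknown attribute {a!r} on <{el.name}>"
            for a in el.attrs if a not in allowed]

def check_type2(el):
    """Non-DTD check: a date input's initial value must parse as a date."""
    errors = []
    if el.name == "input" and el.attrs.get("type") == "date":
        value = el.attrs.get("value", "")
        if value and not is_valid_date(value):
            errors.append(f"value {value!r} is not a YYYY-MM-DD date")
    return errors

def check_conformance(el):
    """A checker in the quoted sense runs both type (1) and type (2)."""
    return check_type1(el) + check_type2(el)
```

In this sketch an element such as `Element("input", {"type": "date", "value": "6 April 2005"})` passes `check_type1` but fails `check_type2`, which is the gap a DTD-only validator would leave open.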
> DTD validation would be almost useless in the case of WF2, except
> perhaps for catching spelling errors in attribute names. A schema in a
> sufficiently expressive language would go a long way, though.

For WF2 it may be far enough, I'm not sure. For HTML5 I'm pretty sure no
schema language (short of a Turing-complete one) is expressive enough.

> I notice that <input type="text" src="some url" checked="true"> is
> valid according to the schema for XHTML.

Indeed. It'll probably be conformant in HTML5 as well, to be honest,
because you might want to set things up for a dynamic change of |type|.
I don't know where to draw the line there. (Similarly, should empty
paragraphs be conformant? I often use empty paragraphs as somewhere to
later fill in some text.)

> Actually I think it would be beneficial for interoperability and
> perhaps discovery of weaknesses in the spec, if several schemas were
> developed by independent parties during the call for implementation.

Absolutely.

On Thu, 7 Apr 2005, Olav Junker Kjær wrote:
>
> Actually, the HTML element has a (deprecated!) version attribute,
> which could be used for this purpose. I agree it feels cleaner than
> using the doctype syntax.

It's not clear to me what the purpose would be.

> OTOH authors are going to use doctypes for the foreseeable future
> anyway, since they want to trigger standards compliant mode in
> browsers, so we might as well put the doctype to some use.

What use?

On Thu, 7 Apr 2005, Olav Junker Kjær wrote:
>
> A conformance checker is a rubber stamp. Therefore it's quite
> important that a conformance checker actually checks conformance to
> the spec, otherwise it is snake oil.

Hear hear!

> As HTML applications become more complex it becomes more important
> that the markup and code are correct, but DTD-validation becomes even
> less sufficient to catch errors. A basic validity error like
> forgetting to close a <b>-tag will not cause the page to stop working.
> However, a syntax error in the initial value of a date control *will*
> cause the page to stop working as intended.

Indeed.

> > now I realise it's to the advantage of existing browser
> > manufacturers to rubber stamp complicated heuristic behaviour
> > they've already solved into a spec (it prevents new entrants from
> > coming along) but how is it to the advantage to the rest of us -
> > understanding specifications becomes harder and harder and relies
> > on the fact that we knew what happened before...
>
> If you are referring to the paragraph about parse errors in
> <http://whatwg.org/specs/web-forms/current-work/#handling> I tend to
> agree with you.

In HTML5 there is less and less that is left up to reverse engineering.
Hopefully that addresses your concern; I hope to continue in this
direction to the point where eventually there may be no need for
reverse engineering at all.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Friday, 24 February 2006 14:57:33 UTC