FW: versioning, robustness principle, doctypes etc

Re ACTION-108 / ISSUE-4:

I feel like I've made some progress on the HTML versioning issue and on an argument for
why an explicit in-band version indicator is useful (i.e., a DOCTYPE for HTML 5 which
changes with each incompatible change of the specification).

The important bit in reasoning about this is to distinguish between language-as-used
and language-version-as-specified, and further to distinguish, in
language-version-as-specified, between rules on being "liberal in what
you receive" and rules for being "conservative in what you send".

I expect to have a better writeup of the discussion and reasoning integrated
into a TAG document on versioning, but I thought I should report progress.

I think this is relevant to the discussion about charset parameters, too.

Larry
--
http://larry.masinter.net

From: Larry Masinter
Sent: Wednesday, August 05, 2009 1:09 AM
To: 'www-tag@w3.org WG'
Subject: versioning, robustness principle, doctypes etc

I feel like I've been making progress on versioning and some things around HTML as well, but I've had trouble editing the document to reflect it because there's more to include, so I thought I'd at least send a note.

I was working on the definition of "language" and "language specification" to distinguish between "language as used and deployed" and "language as written down and defined in a standards specification".

The difficulty of discussing these I think has led to some of the confusions in HTML discussions, for example.

So the interesting observation was around languages, language versions and the robustness principle: "be liberal in what you accept and conservative in what you send".

A language specification doesn't just define "a language". It gives rules for the conservative language - what speakers or writers of the language should do to ensure maximum comprehensibility, by using correct grammar, spelling, punctuation, and semantics - as well as for what liberal listeners/readers should do to be able to understand the language "as spoken".

Conformance checkers are receivers of a sort, acting as liberal -> conservative transducers: they parse liberally but note where the liberal interpretation doesn't match the conservative guidance. Proxy/translation gateways also need to do liberal interpretation but produce conservative output.

Conformance checkers can use a version indicator (a doctype, for example) to determine which conservative advice should be applied.
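
As a rough illustration of the checker described above, here is a minimal Python sketch. The "toy" doctype syntax, the version labels, the rule sets, and the function names are all hypothetical, invented for illustration; none of them come from an HTML specification or a real validator.

```python
import re

# Hypothetical "conservative" rule sets, keyed by version indicator.
# The element lists are invented for illustration only.
CONSERVATIVE_RULES = {
    "v1": {"allowed": {"p", "b", "font"}},
    "v2": {"allowed": {"p", "b", "section"}},
}

# Matches start tags only; closing tags and the doctype are skipped.
TAG_RE = re.compile(r"<\s*([A-Za-z][A-Za-z0-9]*)")
# Hypothetical in-band version indicator, e.g. <!DOCTYPE toy v1>.
VERSION_RE = re.compile(r"<!DOCTYPE\s+toy\s+(\S+?)\s*>", re.IGNORECASE)


def liberal_parse(text):
    """Accept anything: return every start-tag name found, lowercased."""
    return [name.lower() for name in TAG_RE.findall(text)]


def check(text, default_version="v2"):
    """Parse liberally, then report where the input departs from the
    conservative advice selected by the version indicator."""
    match = VERSION_RE.search(text)
    version = match.group(1).lower() if match else default_version
    # Unknown versions fall back to the default rule set (an assumption).
    rules = CONSERVATIVE_RULES.get(version, CONSERVATIVE_RULES[default_version])
    tags = liberal_parse(text)
    warnings = [t for t in tags if t not in rules["allowed"]]
    return version, tags, warnings
```

In this sketch the same input is always parsed the same way; only the warnings change with the declared version, which is the sense in which the indicator matters to a checker more than to a plain receiver.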

Liberal receivers have little use for version indicators - hence the antipathy toward them among the browser makers who still so strongly influence HTML5.

What constitutes good "conservative" advice is a design choice; certainly compatibility with a wide range of current receivers is important, but so is compatibility with future receivers, and doing a good job of that requires some amount of intelligent prognosticating.

Version indicators are also useful even to liberal receivers if the language-as-spoken changes, perhaps influenced by external events (in the short term) or evolution (in the long term). While a liberal receiver will parse and interpret both old and new, in some cases (admittedly unusual ones) utterances in the language are ambiguous, and the ability to note the version is the key to disambiguation.
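
A small Python sketch of that disambiguation case, under stated assumptions: the two versions, the constructs, and their readings below are hypothetical, made up only to show how a version indicator narrows an otherwise ambiguous utterance.

```python
# Hypothetical readings of the same constructs in two versions of a toy
# language; the names and meanings are invented for illustration.
MEANINGS = {
    "v1": {"menu": "list of links", "p": "paragraph"},
    "v2": {"menu": "toolbar of commands", "p": "paragraph"},
}


def interpret(construct, version=None):
    """Return the set of possible readings of a construct.

    Without a known version, a liberal receiver keeps every reading it
    knows; with one, it can narrow the construct to that version's reading.
    """
    if version is not None and version in MEANINGS:
        reading = MEANINGS[version].get(construct)
        return {reading} if reading is not None else set()
    return {m[construct] for m in MEANINGS.values() if construct in m}
```

Here `interpret("p")` has a single reading with or without a version, while `interpret("menu")` stays ambiguous until a version is supplied.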

So I'm coming to believe there's a strong case for version indicators in HTML5, and DOCTYPE in particular, with the proviso that the DOCTYPE should change every time the specification changes, to allow for evolution during the development of the HTML5 spec.

Larry
--
http://larry.masinter.net

Received on Wednesday, 5 August 2009 19:22:24 UTC