- From: Robin Berjon <robin@w3.org>
- Date: Wed, 26 Sep 2012 13:08:26 +0200
- To: Noah Mendelsohn <nrm@arcanedomain.com>
- CC: W3C TAG <www-tag@w3.org>
On 25/09/2012 16:14, Noah Mendelsohn wrote:
> On 9/25/2012 9:27 AM, Robin Berjon wrote:
>> I believe that the idea is that once the rules that describe
>> processing as it happens are written down, you write test suites
>> that can prove conformance. This does tend to have a strong effect,
>> particularly if coupled with rules about processing erroneous
>> input.
>
> Well, half of the HTML5 spec is devoted to documenting cases where
> individual browsers were liberal, the conformance suites (like the
> W3C validator) were not used by producers, the invalid data on the
> wire became commonplace, and now the specification is complicated by
> the need to support the union of all these deviations.
>
>> Well-defined error handling that produces something predictable
>> (rather than blow up) is actually a modern and more pragmatic
>> reformulation of Postel, IMHO.
>
> My concern is not with well-defined error handling; it's with not
> putting equal emphasis on inducing producers to clean up their act
> too.

And, precisely, I believe that it is a misunderstanding to interpret the way in which the HTML specification (and now many others) is written as anything other than a systematic attempt to prevent producer drift. I further believe that the approach taken here is of an architectural nature and that it would be well within the TAG's remit to investigate it in greater depth.

To simplify, you essentially have three possible approaches (for all of which one can find many existing examples):

A) Define the behaviour of conforming content; leave the behaviour of erroneous content undefined.

B) Define the behaviour of conforming content; catch fire on everything else.

C) Define the behaviour of all content; non-conforming content has defined behaviour and can be flagged by quality tools (such as validators).

These three approaches produce different incentives in a technological ecosystem involving multiple implementations distributed over a large scale in both space and time.

With case (A), it is likely that there will be implementations that have defined behaviour for erroneous content (whether intentionally or through bugs does not matter). People will end up relying on that working, and implementations will need to copy one another's bugs. The standard will need to catch up, painfully. Since you can't test for undefined behaviour, there is nothing you can do to prevent this drift.

With case (B) you can test that processors do indeed catch fire and so can prevent the drift (this has on the whole been shown very successfully with XML). But in a user-facing system, catching fire is hardly the friendliest thing to do, especially if the content is composed from multiple sources (which may be highly combinatorial) outside the reach of the primary content producer.

Case (C) is essentially case (B) but with well-defined behaviour that is richer and more elaborate than just blowing up. It assumes that errors will happen (in that it is somewhat reminiscent of the design decisions made in Erlang as contrasted with other languages) and that end users should not be the ones dealing with them. This can be thoroughly tested for, so that, as in (B), it is possible to assert conformance of a processor for all content and therefore avoid the drift typical of (A). It also provides a solid foundation for a versioning strategy.

None of these approaches guarantees quality, but (B) and (C) guarantee interoperability, and (C) additionally guarantees user-friendliness.
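
To make the contrast between (B) and (C) concrete, here is a rough sketch in Python, purely illustrative and in no way normative; the stock library parsers are used merely as stand-ins for the two processing models:

    from xml.etree import ElementTree
    from html.parser import HTMLParser

    malformed = "<p>unclosed <b>markup"

    # Case (B): the XML parser catches fire; malformed input is a fatal,
    # testable error, so processors cannot quietly drift.
    try:
        ElementTree.fromstring(malformed)
    except ElementTree.ParseError as err:
        print("XML, case (B):", err)

    # Case (C): the HTML parser keeps going on the same input and reports
    # the start tags it finds instead of surfacing a failure to the user.
    class Collector(HTMLParser):
        def handle_starttag(self, tag, attrs):
            print("HTML, case (C): open", tag)

    Collector().feed(malformed)

The point being that in (C), as in (B), the behaviour can be asserted in a test suite for arbitrary input, which is precisely what blocks the drift described under (A).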
You seem to believe that the approach taken in HTML and other such specifications is to prolong the mess generated by A-type standards; in fact it is the exact opposite. Once the mess left by the unavoidable drift in A-type standards is properly accounted for and grandfathered, technology development can proceed sanely. The vast increase in HTML-based innovation over the past few years is a testimony to this.

This could be seen as involving a reformulation of Postel: "Be universal and well-defined in what you accept; don't be daft and quality-check what you produce."

Yeah, it doesn't read quite so well. But its ecosystemic properties are actually far more robust over time.

-- 
Robin Berjon - http://berjon.com/ - @robinberjon
Received on Wednesday, 26 September 2012 11:08:33 UTC