- From: Ian Hickson <ian@hixie.ch>
- Date: Tue, 30 Dec 2008 10:12:19 +0000 (UTC)
- To: Julian Reschke <julian.reschke@gmx.de>
- Cc: Larry Masinter <masinter@adobe.com>, "www-tag@w3.org" <www-tag@w3.org>
Extensibility is but one reason why detailed specifications that don't
leave things undefined are essential to the successful development of a
multi-vendor technology stack; there are many others, as discussed in
earlier e-mails. For the purposes of this e-mail I only look at the
extensibility aspect, though.

On Tue, 30 Dec 2008, Julian Reschke wrote:
> Ian Hickson wrote:
> >
> > Application statements don't limit innovation. In fact, having
> > detailed specifications that define the precise rules for parsing and
> > that give precise rules for the processing models and so forth
> > dramatically increase the ease with which the respective protocols can
> > be extended, because there is no guesswork about exactly how various
> > implementations are going to handle the new syntax.
>
> They can help with extensibility; but they can also ruin it.

Well, certainly it is possible to design the language in a bad way.

With a well-written spec, i.e. an "application statement" or an
"implementation functional specification" as Larry called them, if the
language is designed right, innovation is possible. CSS2 is a great
example of this.

With a poorly-written spec, i.e. one that doesn't fully detail the
entire processing model but instead leaves things undefined, innovation
is extremely hard. HTML4 is a good example of this.

The difference is that in the first case, there is a conscious decision
made about what the processing model should be (including, but not
limited to, things like error handling). Thus there is the possibility
of making the right choice and getting an extensible language. In the
second case, there is no decision to make, and thus there is no chance
of ensuring that the language is practically extensible.

> For instance, if a specification requires recipients to accept *any*
> kind of broken input (by specifying how to parse it anyway), it
> essentially takes away all future extensibility with respect to syntax.

It's not the specification that allows or disallows extensibility; it's
the behavior of the down-level clients. If the down-level clients all
handle the desired new syntax in the same way, then extensibility is
possible. If they don't, then extensibility is hard to impossible.

We can get consistency by having a spec that defines all the ways of
handling input, including content that doesn't conform to the current
syntax. Whether that is "accept any kind of broken input" or "have a
fatal error as soon as the content is unexpected" is a different issue
altogether from what I am discussing here.

(In practice, there are roughly speaking three ways to handle unexpected
content -- fatal error, ignore it, or correct it. A fatal error makes it
extremely hard to extend the language, because it means all extensions
violate backwards compatibility. Thus, for instance, the difficulty with
upgrading XML from 1.0 to 1.1. Similarly, if the error handling consists
of correcting the author intent and handling it in some special way, it
is hard to extend the language because extensions have to be designed
around the legacy behavior. The better solution, and the one picked by
CSS, is to use a "must-ignore" model for all unknown syntax.)

> > For instance, adding a new property or new syntax to CSS is easy,
> > because CSS defines forward-compatible parsing rules, so there is no
> > ambiguity about how a down-level browser is going to process new
> > features. However, adding a new element to HTML is incredibly
> > difficult, because every browser differs in how it handles unknown
> > syntax, because the specs never covered this case.
>
> But HTML5 to some degree has the same problem: as the set of void
> elements is hard-wired into the spec, no new void elements can be
> introduced. That by itself would be fine if it actually guaranteed that
> future versions of the language won't introduce new void elements, which
> it doesn't.

Yes, HTML has a terrible forward-compatibility story.
We have ended up forced into this situation mostly because the earlier
versions of the spec didn't define the full processing model, and thus
user agents varied greatly in their behavior. Thus, instead of a
coherent, well-thought-out extension model, we have a de-facto extension
model derived from a long series of accidental decisions by a wide
variety of independent people. This is another example of what happens
when specifications don't have clear and fully defined processing models.

> > Similarly, if we were to extend XML's xml:preserve attribute to have a
>
> xml:space?

Yes, my apologies.

> > third value, we couldn't do so without checking how all the different XML
> > processors would handle the new value, because XML doesn't define how to
> > handle unknown values. ...
>
> Yes, it does:
>
> "The value "default" signals that applications' default white-space processing
> modes are acceptable for this element; the value "preserve" indicates the
> intent that applications preserve all the white space. This declared intent is
> considered to apply to all elements within the content of the element where it
> is specified, unless overridden with another instance of the xml:space
> attribute. This specification does not give meaning to any value of xml:space
> other than "default" and "preserve". It is an error for other values to be
> specified; the XML processor MAY report the error or MAY recover by ignoring
> the attribute specification or by reporting the (erroneous) value to the
> application. Applications may ignore or reject erroneous values." --
> <http://www.w3.org/TR/xml/#sec-white-space>
>
> So conforming processors either report an error or ignore the attribute;
> thus xml:space can't be extended *because* it defines error handling,
> not because it doesn't.

The key part is "the XML processor MAY report the error or MAY recover
by ignoring the attribute specification or by reporting the (erroneous)
value to the application".
These are the three options I gave above -- report an error, ignore the
error, or report the value to the next level and let it deal with it
however it wants, i.e. interpret and "correct" it.

That isn't going to lead to interoperability. Some UAs will have fatal
error handling, some will ignore it, some will "correct" it and treat it
as 'preserve', some as 'default', some maybe as yet something else --
and all would be conforming! Some will ignore the attribute and have the
parent's value inherit down as if the attribute wasn't there, others
will not ignore the attribute and will thus have the value not be
inherited...

If you consider this "defining how to handle unknown values", then we
have very different ideas of what is meant by "define".

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Tuesday, 30 December 2008 10:13:01 UTC