Re: Design Principles

On Wed, 27 May 2009, Leif Halvard Silli wrote:
> > > > > > >
> > > > > > > Another quote from the same page: "imperative that HTML be 
> > > > > > > extended in a backwards-compatible way".
> > > > > > >
> > > > > > > So HTML 4 is winning. And HTML 5 has to be 
> > > > > > > backwards-compatible.
> > > > > > >
> > > > > > > It really sounds from this as if it is very important to be 
> > > > > > > compatible with HTML 4.
> > > > > >
> > > > > > No, being backwards compatible with the HTML4 spec is 
> > > > > > worthless. It's being backwards compatible with legacy content 
> > > > > > and implementations that matters (and that has been a 
> > > > > > cornerstone of the HTML5 effort).
> > > > >
> > > > > So it was not the HTML 4 of the spec that was winning but 
> > > > > another HTML4?
> > > >
> > > > In the context of the interview, what is the difference between 
> > > > these two HTML4s? I don't understand the question.
> > >
> > > Tell me about that other HTML 4, please. I really wonder how one can 
> > > say that HTML 4 is winning and mean that something that isn't in the 
> > > HTML 4 spec is winning.
> >
> > I didn't say the HTML4 _spec_ was winning, I said HTML4 was winning; 
> > that is, the HTML language as deployed on the Web (what you would 
> > probably call "text/html", but most people wouldn't understand that, 
> > so I didn't say that in the interview).
>
> A spec can never win in any other way than through deployment, can it?

In the context of the interview, I was talking about the competition 
between open vendor-neutral technologies like HTML, and proprietary 
(vendor-specific, under the control of one vendor, patent-protected) 
technologies like Flash and Silverlight.

In this context, there are more end-user installations of implementations 
that interpret the syntax, elements, attributes, DOM APIs, and other 
aspects of what is generally known as HTML, and there are more instances 
of documents that use the syntax, elements, attributes, DOM APIs, and 
other aspects of what is generally known as HTML, than there are installed 
implementations of, and instances of documents using, the proprietary 
alternatives. Thus, in that context, HTML can be considered to be 
"winning" in the competition that includes those particular alternatives.

Again in the context of the interview, I referred to this as "HTML4" in 
contrast with "HTML5" to indicate the generation of the language, 
contrasting the features that are widely implemented today (roughly 
speaking the kinds of things tested in Acid2 and Acid3 and their 
contemporary features) with the new features present in HTML5 and more 
recent specifications that are only now seeing implementation work 
(Geolocation, Web Workers, CSS3, etc.).

The HTML4 spec, however, only bears a vague resemblance to the syntax, 
elements, attributes, DOM APIs, and other aspects of what is generally 
known as HTML as implemented today and contemporary to Acid2 and Acid3, 
even though the HTML4 specification presumes to define what that is.

Thus, despite the success that the language widely known as HTML4 has had, 
the specification that presumes to define that language is of limited use 
in evolving the actual deployed language. This is why arguments relating 
to the requirements in HTML4, the specification, are not particularly 
useful in determining what we should do in HTML5.


> To say that "text/html" is winning is not the same as saying that "HTML 
> 4 deployed" is winning.

It sounds more or less the same to me, unless you are saying that "HTML4 
deployed" would refer only to what is required by the HTML4 spec, which, 
one must admit, is definitely not winning, and indeed barely exists at 
all. (For instance, the only widely-used implementation I know of that 
actually parses HTML4 as per the spec is the W3C validator.)


> That HTML 4 is underspecified is one thing. But if the deployed HTML 
> cannot in some vague or idealistic manner point to HTML 4 as the basis 
> for the way it is implemented, then I cannot see how it is is "HTML 4.01 
> Deployed" we are talking about.

You can definitely point to the HTML4.x series of specifications in a 
vague or idealistic manner as the basis for what browsers implement, or at 
least as _a_ basis for what browsers implement, but that isn't very useful 
in terms of HTML5's development.

(Indeed, referring to "HTML4" in this way is exactly what I did in the 
interview, using the version number of the spec to indicate the general 
era and area of technology to which I was referring.)


> The way you have authored the HTML 5 draft is not the only possible way 
> it can/could look.

Clearly.


> It is entirely possible to build more closely on the concepts that are 
> found in HTML 4 while at the same time improving all the underspecified 
> sides of HTML 4.

It may be possible, though I don't know how. I encourage you to try if you 
think doing so would result in a better specification than HTML5.

Just because something is possible doesn't mean it is wise, however.


> So there is more to this than the vagueness of HTML 4.

Not really. I would imagine that an effort such as the one you describe 
would be inordinately more difficult than the approach that was used in 
developing HTML5. But that's merely my opinion; I encourage you to prove 
me wrong. I don't expect I can convince you by just referring to my own 
personal experiences.


> > In fact, it is the vagueness of the relationship between "HTML4-as- 
> > deployed", what one might call "reality", and "HTML4-the-spec", what 
> > one might call "theory", which is one of the biggest problems that I 
> > am trying to fix with HTML5. My goal is that with HTML5 there be no 
> > difference between how HTML5 is deployed in implementations and how 
> > the spec _says_ it should be deployed in implementations.
>
> When we say "reality" then we usually mean something that /differs/ from 
> what theory says about the same reality.  If HTML 4 is silent about 
> something, then there is no reality to differ from.

HTML4 is silent about much, but it isn't silent about everything. What it 
is not silent about is usually wrong (e.g. saying browsers must not have a 
default encoding, whatever that means, or saying that all browsers, even 
speech synthesisers, must render quote marks around <q> elements, or 
saying that the default media="" is "screen", or saying that parsing 
should be done using SGML, or...).


> > > The high deployment of HTML that you talk about includes a lot of 
> > > XHTML.
> >
> > As I see it there are two ways to define "XHTML" deployment: 
> > Deployment in the sense that documents have an XHTML DOCTYPE, and 
> > deployment in the sense that documents actually get processed 
> > according to the XHTML specification's rules (e.g. using an XML 
> > parser).
>
> Of course.
>
> > Last I checked, about 15% of content had an XHTML DOCTYPE.
> > Last I checked, about 0.002% of content was processed as XHTML.
> >
> > I don't consider the presence of the DOCTYPE an indicator of 
> > deployment in any useful sense. I don't consider 0.002% a high 
> > deployment rate.
>
> Those 15% can at least not be counted as "HTML 4 as she are spoke".  
> Perhaps we could call it "XHTML treated as HTML 4 are spoke".

I don't understand the relevance of this line of argumentation.

In practice it doesn't matter what the DOCTYPE is; it has little bearing 
on which specification the rest of the document more closely follows, and 
it has no bearing (beyond quirks mode detection) on what the browsers do 
with the content.


> > "HTML4-the-deployed-language" is clearly a wild success. If it wasn't, 
> > I wouldn't be interested in working on HTML5! There's a huge 
> > difference, however, between the language as deployed, and the 
> > language as specified.
>
> So, how shall I consider that you view that "huge" difference? Do you 
> mean that the deployed HTML 4 has rules for things that specified HTML 4 
> doesn't have? Or do you mean that deployed HTML 4 in practise has 
> stricter rules/requirements than specified HTML 4 has?

I don't understand what it would mean for deployed content or 
implementations to have rules.


> Or do you mean that deployed HTML 4 contradicts the specified HTML 4?

Yes, this certainly happens a lot. It's not anywhere near as big a problem 
as the near-complete lack of conformance criteria in HTML4, though, or the 
extreme vagueness of the semantics defined in HTML4.


> I cannot see how one can talk about deployment without reference to 
> specification.

The Win32 API has huge deployment numbers, but no formal specification.

On the Web, the XMLHttpRequest object was deployed and widely used long 
before it had a specification of any kind.

It is quite possible to have deployed technology without a specification. 
It is a sad state of affairs, though, and one which, for HTML and related 
APIs, I have spent many years attempting to redress with HTML5 and its 
related specifications.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Wednesday, 27 May 2009 02:12:56 UTC