Re: Backwards compatibility and DOCTYPE

On Wed, 5 Mar 2008, Bert Bos wrote:
> On Wednesday 05 March 2008 05:28, Ian Hickson wrote:
> > On Wed, 5 Mar 2008, Bert Bos wrote:
> > > The HTML5 WD states (section 1.1.1[1]) that the format is meant to
> > > be as backwards-compatible as possible. With a little change
> > > to section 8.1.1[2], HTML5 could, in fact, be fully backwards
> > > compatible.
> >
> > What do you mean by backwards compatible in this context? HTML5 
> > doesn't claim that all legacy documents are conforming HTML5 documents 
> > (in fact no legacy documents are conforming HTML5 documents);
> 
> Yes, and I wonder why. HTML5 can easily say that (most? all?) valid 
> HTML4 documents are also valid HTML5 documents.

The main reason is that we didn't want to confuse the issue of versioning 
by claiming that documents labeled "HTML4" were in fact HTML5.

Going forward we've resolved this -- since HTML5 has no version number in 
the source, HTML5 and HTML6 can't be distinguished, and HTML6 can safely 
take over HTML5's documents and wipe HTML5 off the face of the earth. :-)


> > it only claims that HTML5 user agents will process legacy documents in 
> > a manner compatible with legacy user agents.
> 
> That's not clear. The HTML5 draft says an incorrect DOCTYPE is an error 
> and while some UAs may silently ignore the error, others may (or even 
> must) report it.

Yeah, a better way of phrasing it is that the HTML5 draft says how HTML5 
user agents can process legacy documents in a manner compatible with legacy 
user agents. (Error recovery is not required in all cases, e.g. when 
parsing, because some use cases -- such as preprocessors intended for 
developers who have control over the markup -- don't need to handle broken 
code, since the developer is right there to fix it.)


> > > The current version (4.01) of HTML requires[3] documents to start 
> > > with this DOCTYPE line:
> > >
> > >     <!doctype html public "-//W3C//DTD HTML 4.01//EN"
> > >     "http://www.w3.org/TR/html4/strict.dtd">
> > >
> > > But that line is not allowed in the latest draft of version 5. Why 
> > > not?
> >
> > Because that line is HTML 4.01, not HTML5. If you want to write HTML 
> > 4.01, the HTML5 spec is not relevant.
> 
> Yes, it is. Once HTML5 is a REC, *it* defines HTML (see, e.g., sections 
> 1.3 and 1.4.1) and HTML 4.01 is no longer relevant. It would be a pity 
> if old documents suddenly stopped being HTML, when they only differ in a 
> line that is "mostly useless" (as the draft says).

They don't stop being HTML, they just aren't conforming HTML5 docs. They 
are still conforming HTML4 docs.
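
To make the distinction concrete, here is a minimal sketch of how a checker 
might label a document by its DOCTYPE alone. It assumes, as described in this 
thread, that only the short <!DOCTYPE html> form counts as the HTML5 label and 
that anything with a PUBLIC identifier is a legacy label. The function name 
classify_doctype and the three result strings are hypothetical, and this is 
not the parsing algorithm from the draft.

```python
import re

# Assumption: the short doctype, matched case-insensitively, is the HTML5 label.
HTML5_DOCTYPE = re.compile(r"<!doctype\s+html\s*>", re.IGNORECASE)
# Assumption: any HTML doctype carrying a PUBLIC identifier is a legacy label.
LEGACY_DOCTYPE = re.compile(r"<!doctype\s+html\s+public\s", re.IGNORECASE)


def classify_doctype(document: str) -> str:
    """Return "html5", "legacy", or "unknown" based only on the leading DOCTYPE."""
    head = document.lstrip()
    if HTML5_DOCTYPE.match(head):
        return "html5"
    if LEGACY_DOCTYPE.match(head):
        return "legacy"
    return "unknown"
```

A document labeled "legacy" here is still HTML and can still be processed; it 
just would not be reported as a conforming HTML5 document under that assumption.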


> It's nice that HTML5 takes forward compatibility into account (by not 
> including a version number in document instances), but I don't see why 
> it has to break with the past. I know previous versions of HTML had the 
> same problem, but that is not a reason to repeat the mistake.

I think that HTML5 claiming that documents that say "HTML4" are actually 
HTML5 is something that would be much harder to sell.

In practice, it doesn't really matter much.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Friday, 23 May 2008 03:30:14 UTC