Re: Re-registration of text/html

Julian Reschke, Sat, 20 Feb 2010 09:36:55 +0100:
> HTML5 contains IANA instructions to change the specification for 
> text/html from RFC 2854 to HTML5 
> (<http://dev.w3.org/html5/spec/Overview.html#text-html>)
> 
> Thus, if HTML5 forbids something (such as @profile), you can't serve 
> it as text/html anymore, even though you might be using an HTML 4.01 
> doctype. (Well, you *can* serve it as text/html, it "just" wouldn't 
> be correct anymore).

Yes. Validator.nu shows an example of this: it is able to validate a 
page with an HTML4 doctype as an HTML5 page, if the page otherwise 
complies with HTML5. Today, by contrast, even though HTML 3.2 and below 
are obsoleted by the RFC, one can still validate pages as e.g. HTML 2.

Btw, HTML 4.01 was ready in 1999, and the text/html registration happened 
one year later. So the re-registration perhaps happens in 2023? Or 
perhaps the (re-)registration could happen in steps?
 
> There are two ways to fix this
> 
> 1) let the MIME registration continue to allow serving HTML 4.01.
> 
> 2) make more of HTML 4.01 valid HTML5.

2) sounds safest, regardless, for all parties.

> Note that the IETF mime type re-registration rules says:
> 
>    Changes should be requested only when there are serious omissions or
>    errors in the published specification.  When review is required, a
>    change request may be denied if it renders entities that were valid
>    under the previous definition invalid under the new definition.
> 
> (<http://tools.ietf.org/html/rfc4288#section-9>), so my 
> recommendation is that we try to fix this problem before the 
> re-registration is attempted (or at least ask the IESG for advice 
> before we get there).

HTML5 has a different focus compared with RFC 2854.

HTML5's focus is on concrete features of past HTML specifications, 
whereas RFC 2854 stamps whole versions of HTML as historic or current 
without touching their specific parts/entities. HTML5 also treats 
doctypes as a feature; this is what enables Validator.nu to stamp a 
page which uses a legacy doctype that doesn't trigger quirks in the 
HTML5 parser as "HTML5 valid". The RFC doesn't treat the doctype as a 
feature in the same sense. The RFC prepares authors for a world of 
multiple parsers, whereas HTML5 prepares the world for the HTML5 
parser.
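
To make the "doctype as a feature" point concrete, here is a rough 
sketch of the kind of check involved. It is not Validator.nu's code: 
the function and table names are hypothetical, and the identifier 
list is an assumption based on what the HTML5 draft calls an 
"obsolete permitted DOCTYPE", which may change between drafts.

```python
# Hypothetical sketch; names are illustrative, not Validator.nu's code.
# Assumption: the table mirrors the HTML5 draft's "obsolete permitted
# DOCTYPE" list. Each legacy public identifier maps to the system
# identifiers accepted with it; None means it may be omitted.
OBSOLETE_PERMITTED = {
    "-//W3C//DTD HTML 4.0//EN": {
        None, "http://www.w3.org/TR/REC-html40/strict.dtd"},
    "-//W3C//DTD HTML 4.01//EN": {
        None, "http://www.w3.org/TR/html4/strict.dtd"},
    "-//W3C//DTD XHTML 1.0 Strict//EN": {
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"},
    "-//W3C//DTD XHTML 1.1//EN": {
        "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"},
}


def doctype_acceptable(public_id=None, system_id=None):
    """Return True if the doctype alone would not make the page invalid."""
    if public_id is None:
        # Plain <!DOCTYPE html>, optionally with about:legacy-compat.
        return system_id in (None, "about:legacy-compat")
    # Legacy doctypes pass only if both identifiers match the table.
    return system_id in OBSOLETE_PERMITTED.get(public_id, set())
```

Under these assumptions an HTML 4.01 Strict doctype passes, while a 
Transitional or HTML 3.2 doctype does not; the rest of the page is 
still checked against HTML5 either way.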

As a participant in the HTML WG, I find it interesting that I don't 
know for sure whether or not I am participating in something which is 
intended to park HTML4 and XHTML1 in a way which means that one cannot 
validate pages as HTML4/XHTML anymore. I think many expect that HTML5 
will just be the thing that lands at the top of the doctype pop-up menu 
in the validator, and hence that they will be able to continue to 
validate against other HTML profiles, such as HTML4. E.g. when I 
pointed out the quirks in how internal DTD subsets are treated in the 
HTML5 parsers, Maciej replied that I was talking about something which 
was outside the scope of HTML5. But how relevant is that, if HTML5 is 
the only thing that defines text/html? Then it must either be defined 
(in an acceptably compatible way) in HTML5, or be forgotten forever.

I personally don't feel that it is very likely that HTML5 will become 
the only validatable expression of text/html. But there are of course 
many ways to dominate the scene without removing that option.
-- 
leif halvard silli

Received on Sunday, 21 February 2010 09:05:04 UTC