
Re: Re-registration of text/html

From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
Date: Wed, 24 Feb 2010 15:45:12 +0100
To: Henri Sivonen <hsivonen@iki.fi>
Cc: Julian Reschke <julian.reschke@gmx.de>, Ian Hickson <ian@hixie.ch>, HTMLwg <public-html@w3.org>
Message-ID: <20100224154512877233.c4451c1f@xn--mlform-iua.no>
Henri Sivonen, Wed, 24 Feb 2010 15:20:22 +0200:
> On Feb 21, 2010, at 11:35, Julian Reschke wrote:
>> On 21.02.2010 10:09, Ian Hickson wrote:

My view and questions differ from Julian's w.r.t. what the problems 
are.

>> What's important is whether the new text/html will allow existing 
>> HTML4 content to stay valid;
> 
> There are really multiple cases there:
>  1) Should pre-existing valid HTML4 continue to be valid HTML4? (I'd 
> say yes.)
>  2) Should pre-existing valid HTML4 be valid HTML5? (I'd say it 
> doesn't need to as far as precedent goes. Valid HTML 3.2 is never 
> valid HTML4.)

The goal of HTML5 is version-less HTML? Or is it only HTML5 and onwards 
that will be version-less? 

Why does Validator.nu offer to validate HTML4 documents as HTML5 
documents? Why does it offer to validate text/html XHTML1 documents as 
HTML5? Etc.

The sensible thing, if HTML4 documents are not meant to be valid HTML5 
documents, is - for the HTML5 spec - to _forbid_ any doctype other 
than <!DOCTYPE html> and the legacy doctype for HTML5 documents. Of 
course, the HTML5 parser could specify how to handle such doctypes, but 
their presence should still make a document invalid as an HTML5 document.

In other words: the DOCTYPE must be treated as a version indicator, 
with non-quirks mode as a side effect, rather than being treated solely 
as a non-quirks trigger.
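
To illustrate the idea, here is a minimal Python sketch of a conformance 
check that treats the doctype as a version indicator. It is only an 
illustration, not how any validator or the HTML5 draft actually works. 
The function name classify_doctype is hypothetical, and the sketch 
assumes that the only permitted doctypes are <!DOCTYPE html> and the 
legacy form <!DOCTYPE html SYSTEM "about:legacy-compat">.

```python
import re

# Assumption: the legacy-compat system identifier is matched literally
# (case-sensitively), in either double or single quotes.
_LEGACY = r"""(?:"about:legacy-compat"|'about:legacy-compat')"""

# Assumption: "DOCTYPE", "html" and "SYSTEM" are matched
# case-insensitively; everything else must match exactly.
_HTML5_DOCTYPE = re.compile(
    r"<!(?i:DOCTYPE)\s+(?i:html)"
    r"(?:\s+(?i:SYSTEM)\s+" + _LEGACY + r")?"
    r"\s*>"
)


def classify_doctype(doctype: str) -> str:
    """Hypothetical helper: return 'html5' for the two permitted
    doctypes and 'other' for anything else (e.g. an HTML4 doctype).

    A parser could still *process* an 'other' document; this check is
    only about whether the document counts as a valid HTML5 document.
    """
    if _HTML5_DOCTYPE.fullmatch(doctype.strip()):
        return "html5"
    return "other"
```

Under this sketch, an HTML 4.01 doctype is classified as 'other', so the 
document would be flagged as not being a valid HTML5 document, while the 
parser remains free to handle it in non-quirks mode.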

>  3) Should pre-existing invalid content purporting to be HTML4 be 
> valid HTML5? (I'd say it doesn't need to be, because it wasn't valid 
> HTML4, to begin with.)
>  4) Should pre-existing valid HTML4 continue to be appropriate for 
> serving as text/html? (Whatever we say, it'll continue to be so 
> served.)
>  5) Should pre-existing invalid content purporting to be HTML4 be 
> appropriate for serving as text/html? (Has it been previously?)
> 
> Note that case #5 is by far more common than case #4!
> 
>> content including things like @profile, for instance. Right now it 
>> doesn't, and I believe that is a problem.
> 
> What concrete badness do you expect to ensue if this "problem" remains?

The problem that I see is that HTML5 defines a parser and that the 
current version of the HTML5 spec draft says that the HTML5 parser 
should ignore the @profile attribute.

There are quite a few similar issues. E.g. HTML4 supports image maps 
where one uses <a> instead of <area> - HTML5 does not have this feature 
(currently) - and I heard from Boris and Anne that they would be so 
happy to remove that feature from their respective browsers. @summary, 
@longdesc etc. belong to the same set of issues.

So the concrete problem is the parser - that HTML5 blesses removal of 
features that are important for handling HTML4 documents.

But maybe vendors will arrive that say "we also support HTML4 
- we do not limit ourselves to the HTML5 compromise"?

> Do you believe in ever obsoleting specs? Does your concern about 
> HTML4 extend to HTML 2.0? If not, why not?

Except for the very doctypes themselves of those specs, are there 
things in HTML32 and HTML2 that did not make it to HTML4?

I must say that, the way I read the text/html RFC, HTML32 and HTML2 
are obsoleted/historical - the question is what 'obsoleted' eventually 
means. The RFC also says that one must be prepared for documents that 
are made according to those specs. One must also be prepared for many 
extensions and bugs. I think the RFC, in saying so, emphasizes the 
semantics, whereas the HTML5 parser is specified to ignore many 
semantics in favour of some "average semantic" that HTML5 defines (for 
HTML5 parsers).

As one can see, the RFC's wording about compatibility with "the 
Web"/preparedness for "the Web" is not congruent with what section 12.1 
in the HTML5 draft says:

      ]]This document is the relevant specification. Labeling a 
      resource with the text/html type asserts that the 
      resource is an HTML document using the HTML syntax.[[

The wording "HTML syntax" has a link pointing to 
<http://dev.w3.org/html5/spec/syntax.html#syntax>, which indicates that 
the document is meant to use _HTML5 syntax_.
-- 
leif halvard silli
Received on Wednesday, 24 February 2010 14:45:49 GMT
