
Re: Re-registration of text/html

From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
Date: Wed, 10 Mar 2010 22:00:38 +0100
To: Henri Sivonen <hsivonen@iki.fi>
Cc: Julian Reschke <julian.reschke@gmx.de>, Ian Hickson <ian@hixie.ch>, HTMLwg <public-html@w3.org>
Message-ID: <20100310220038640438.0ae603e2@xn--mlform-iua.no>
Leif Halvard Silli, Wed, 10 Mar 2010 19:46:53 +0100:
Henri Sivonen, Wed, 10 Mar 2010 07:12:54 -0800 (PST):
> "Leif Halvard Silli" <xn--mlform-iua@målform.no> wrote:
>> Henri Sivonen, Wed, 24 Feb 2010 15:20:22 +0200:
>>> On Feb 21, 2010, at 11:35, Julian Reschke wrote:
>>>> On 21.02.2010 10:09, Ian Hickson wrote:
 .....
>>> As one can see, the RFC's wording about compatibility with "the 
>>> Web"/preparedness for "the Web" is not congruent with what section
>>> 12.1  in the HTML5 draft [which] says:
>>> 
>>> 	]]This document is the relevant specification. Labeling a 
>>>       resource with the text/html type asserts that the 
>>>       resource is an HTML document using the HTML syntax.[[
>>> 
>>> The wording "HTML syntax" has a link pointing to 
>>> <http://dev.w3.org/html5/spec/syntax.html#syntax>, which indicates
>>> that it is meant that the document uses _HTML5 syntax_.
>> 
>> This philosophical question could be avoided by stating that 
>> documents labeled "text/html" must be processed according to HTML5.
> 
> This sounds more reasonable. But I don't understand the focus on 
> version 5. Stating that documents labeled 'text/html' must/will be 
> processed according to the HTML parsing rules, of which HTML5 is the 
> latest version, seems more accurate.

Another way to avoid this "philosophical question" is to update the 
RFC, rather than moving the 'text/html' label ownership to the HTMLwg. 
This seems better because, as Julian has said, 'text/html' is used by 
several sibling languages, and this fact is already expressed in the 
RFC. By contrast, the voices that have advocated moving the registration 
into the HTML5 specification have used arguments that fall into the 
category of "taking control over 'text/html'". E.g., to say that 
"everything that is served as 'text/html' *is* HTML(5)" does not, in my 
view, pay enough attention to the polyglot culture of the 
'text/html' Web. 

The XHTML2 working group is meant to soon (re)announce that XHTML 1.1 
documents can be served as 'text/html'. As I understand it, they will, 
as part of this, allow e.g. the @lang attribute, since @xml:lang doesn't 
work in 'text/html'. But they surely are not doing this because they 
want to be subject to all of the HTML5 spec draft's syntax limitations.

I am honestly not (any longer) sure what kind of confusion it is meant 
to clear up when you make this strong link between 'text/html' and 
HTML5. I think the only confusion there has been "in the Wild" is about 
the "automatic goodness/semanticness of XHTML", plus all the fuss 
about how to (not) serve XHTML ... I remember that the people I saw 
buying into the message that "XHTML is not the solution" 
converted to HTML4 instead. (I guess I could dig up a comment or two 
from Anne's blog. And, hey, you can place me in that category as well 
...) However, one could claim that HTML5 isn't too much about HTML4 
either. I remember "happy" notes on Anne's blog about institutions 
that required use of HTML4 instead of XHTML ... which led me to 
believe that there was some kind of respect for HTML4 in HTML5 circles 
... :-D

And at any rate, my own experience is that HTML4 has perhaps more 
problematic syntax rules (w.r.t. parsing) than XHTML has. E.g. I have a 
JavaScript book where the author claims that XHTML's requirement to use 
<![CDATA[ ]]> inside the <script> element is a reason to use external 
JavaScript files ... However, HTML4's syntax requirements for <script> 
seem like a much better reason to use external JavaScript files 
(because escaping every '<' inside the script element is much more work 
than learning the CDATA syntax ...)
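
To illustrate only the XHTML side of this, here is a small sketch using 
Python's standard xml.etree.ElementTree. The two script fragments are 
invented for illustration, and the sketch says nothing about how a 
'text/html' parser would treat the same bytes:

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML fragments, made up for this illustration.
WITH_CDATA = "<script><![CDATA[ if (a < b) { run(); } ]]></script>"
WITHOUT_CDATA = "<script> if (a < b) { run(); } </script>"


def parses_as_xml(fragment):
    """Return True if an XML parser accepts the fragment as well-formed."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# The CDATA section lets a bare '<' through; without it the XML parser
# rejects the fragment as not well-formed.
cdata_ok = parses_as_xml(WITH_CDATA)
bare_ok = parses_as_xml(WITHOUT_CDATA)

# The script text survives unchanged inside the CDATA section.
script_text = ET.fromstring(WITH_CDATA).text
```

So under XML parsing, a single CDATA wrapper covers every '<' in the 
script at once.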

So the more I understand the inner logic of the syntax rules that HTML5 
defines, the better I understand the intent to solve the issues of both 
HTML4 and XHTML1 - the intention to provide a syntax that works more 
like the parsers that User Agents operate with.

But: Since one of the goals of HTML5 is to focus on 'text/html' parsing 
as separate from 'xml' parsing, this should open the way to making 
'text/html' extensible, simply because the parsing is standardized. And 
thus there should be more, rather than less, reason to say that the 
text/html registration should be in a shared draft rather than being 
governed by the owners of the HTML5 specification. The most important 
thing to say in the RFC seems to be that 'text/html' parsing is 
being standardized, and that the main part of 'text/html' parsing is 
defined in HTML5, but that applicable specifications as well as XHTML 
specifications (which could count as applicable specifications too) 
play a role as well.
-- 
leif halvard silli
Received on Wednesday, 10 March 2010 21:01:18 UTC
