
Re: Re-registration of text/html

From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
Date: Fri, 12 Mar 2010 05:59:09 +0100
To: Sam Ruby <rubys@intertwingly.net>
Cc: Henri Sivonen <hsivonen@iki.fi>, Julian Reschke <julian.reschke@gmx.de>, Ian Hickson <ian@hixie.ch>, HTMLwg <public-html@w3.org>
Message-ID: <20100312055909447225.0f1e413f@xn--mlform-iua.no>
Sam Ruby, Thu, 11 Mar 2010 21:13:47 -0500:
> Leif Halvard Silli wrote:
>> In other words: So that it becomes possible to serve the same XHTML 
>> 1.1. document both as text/html and as application/xhtml+xml, with 
>> the same semantics.
> If you strike the words "XHTML 1.1", I agree with the above 
> sentence.  I agree that there exists a useful subset of documents 
> which can be served either as XHTML or as HTML and with substantially 
> the same semantics.
> What doesn't make sense to me is your insistence on labelling such 
> documents as XHTML when (a) the precise interpretation depends 
> on the MIME type, and (b) overall differences in semantics for this 
> subset of documents are negligible.

I'm not sure I understand what your reaction w.r.t. XHTML 1.1 was 
about. But w.r.t. the 'text/html' RFC, I think it has always been 
clear that XHTML’s presence in 'text/html' has been in lieu of the 
"native" 'text/html' specifications. XHTML 1.1 acknowledges that 
when/if it becomes permitted to use lang="".

But if I read Henri correctly, he now considers that the native XHTML 
media type is defined by the "last" version of XHTML, namely XHTML5. 
The different XHTML specifications make clear that none of them 
defines one monolithic spec. Therefore, to say that XHTML5 is now the 
"last" and "best" version of XHTML seems against what XHTML is about. 
If Henri's perception became the prevailing one, it would be like 
changing the constitution of the "XHTML nation".

But the fact that, until HTML5, there haven't been any clear text/html 
parsing conformance rules has not only prevented the development of 
'text/html', but has also prevented the development of polyglot, 
XHTML-based 'text/html'. Which raises the question: what *is* 
'text/html'-compatible XHTML? That question was much simpler to answer 
in 2000 than it is today, when HTML5 and Web 2.0 allow a 'text/html' 
document to be used as a host document for SVG and MathML. The XHTML 
Media type document doesn't mention at all the option of using polyglot 
text/html documents as host documents for SVG and MathML. On the 
contrary, it warns against foreign namespaces! So Henri's objections to 
the 2009 version of the XHTML Media document feel outdated, from that

I do not "insist" on labeling docs authored according to an XHTML spec 
as XHTML documents if they are served as 'text/html'. It would perhaps 
be better to use the term that you have used: polyglot documents. The 
term 'polyglot' refers primarily to how the syntax looks. And it 
is also the syntax I have in mind when I talk about a 'text/html' 
document as an XHTML document.

> Be that as it may, I note a related discussion on the validator forum:
>   http://comments.gmane.org/gmane.org.w3c.validator/12444

I think that example is relevant to the question of how far the most 
'draconian' rules of HTML5, such as the many attributes that it 
forbids, can be considered part of the polyglot requirements. If we 
look at Appendix C of XHTML 1, it defines rules which are not in 
accordance with HTML4 syntax.

It is nice that XHTML5 documents can be used to author polyglot 
documents. But that does not help me if I would like to use the 
attributes that HTML5 forbids. Then an XHTML doctype might serve me 
better. Another question in that regard is the presence of namespace 
prefixes inside e.g. SVG elements. 

From one angle, HTML5 makes it easier to author polyglot documents. But 
from another angle, one can ask whether removing harmless namespaces 
and other attributes inside e.g. <svg> has anything to do with 
polyglotism.

There are at least two definitions of polyglot documents: either one 
limits oneself to syntax which is permitted in both dialects, or one 
limits oneself to syntax which has the same meaning/semantics/effect 
in both dialects. And a third definition: both the exact same syntax 
and the same semantics. As long as one authors according to 
XHTML5/HTML5, only the third option seems available.

Speaking of which: Your blog doesn't validate as HTML5, it only 
validates as XHTML5. (But perhaps that is just a matter of Henri 
getting his polyglot checker ready?)

But here you also see why I *do* talk about XHTML documents: If the 
syntax *isn't* compatible with any documented 'text/html' syntax, 
even though it is compatible with 'text/html' parsers, then it is not 
meaningless to say 'XHTML' even if it is served as text/html.

> I can't help but wonder if one small change to HTML5 would reduce 
> this confusion, and yet would have zero impact on browser 
> vendors.  This change would be to change the definition of the xmlns 
> attribute on the html element from a talisman to a trigger of a few 
> additional, yet simple, validation checks.  To start with, it would 
> trigger validation errors when elements are implicitly closed.  Other 
> checks could also be considered.
> A few notes: the intent is not to guarantee well-formedness, nor to 
> change browser behavior, but merely to provide a means for someone 
> who wishes to opt in to a more strict syntax to indicate their desire 
> to do so.  This also clearly would have no impact on those who 
> advocate the use of a more minimal syntax.
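
The opt-in check described in the quoted proposal could be sketched 
roughly as follows. This is not how any existing validator works; it 
is a minimal Python illustration under assumptions: the names 
StrictOptInChecker and check are hypothetical, the stack handling is 
far simpler than the HTML5 tree builder, and the only "additional 
check" implemented is the one about implicitly closed elements.

```python
from html.parser import HTMLParser

# Void elements never take an end tag, so they are never "left open".
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "param", "source", "track", "wbr"}


class StrictOptInChecker(HTMLParser):
    """Reports implicitly closed elements, but only if <html> has xmlns."""

    def __init__(self):
        super().__init__()
        self.strict = False   # becomes True when xmlns is seen on <html>
        self.open = []        # stack of currently open element names
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag == "html" and any(name == "xmlns" for name, _ in attrs):
            self.strict = True
        if tag not in VOID:
            self.open.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open:
            return  # stray end tag (or the end half of <br/>): ignore
        # Everything above the matching element was never closed explicitly.
        while self.open:
            top = self.open.pop()
            if top == tag:
                break
            self.report(top)

    def report(self, tag):
        if self.strict:
            self.errors.append(f"<{tag}> was closed implicitly")


def check(markup):
    """Returns the list of opt-in validation errors for the markup."""
    checker = StrictOptInChecker()
    checker.feed(markup)
    checker.close()
    # Elements still open at end of input were also closed implicitly.
    for tag in reversed(checker.open):
        checker.report(tag)
    return checker.errors


if __name__ == "__main__":
    body = "<body><p>one<p>two</body></html>"
    print(check('<html xmlns="http://www.w3.org/1999/xhtml">' + body))
    print(check("<html>" + body))
```

Without the xmlns attribute the same markup produces no errors, which 
matches the stated intent that authors of a more minimal syntax are 
not affected.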




leif halvard silli
Received on Friday, 12 March 2010 04:59:47 UTC
