Re: XHTML 1.0, section C14

From: Jukka K. Korpela <jkorpela@cs.tut.fi>
Date: Tue, 21 Nov 2006 21:13:38 +0200 (EET)
To: www-html@w3.org
Message-ID: <Pine.GSO.4.64.0611212050090.10636@mustatilhi.cs.tut.fi>

On Tue, 21 Nov 2006, Shane McCarron wrote:

> Think of [appendix C] as hints for creating well-formed, 
> valid XML that should work in HTML user agents.

What this really means, I think, is that it is a collection of tricks 
aimed at fooling old web browsers into processing XHTML 1.0 documents 
reasonably, treating them as if they were legacy HTML, according to their 
old tag soup slurping habits.

I don't think there is much room for improvement in that area, though the 
appendix is rather technical for a set of guidelines for practical authoring.

It's the basic idea that is flawed. There is no point in using XHTML as 
the delivery format on the web, now or in the next few years. Authors who 
wish to use XML (and maybe XHTML) internally should be encouraged and 
advised to convert it to HTML for delivery to browsers and other user agents.

An immense waste of effort has been caused by encouraging web authors to 
use the "latest recommendation" in practical authoring. This was probably 
seen as the only way to give XHTML a "push". But it was a big mistake.

If your document is nominally XHTML 1.0 but follows the guidelines of 
appendix C, you are just serving HTML 4.01 in XML clothes and don't win 
anything, but there's a small risk of causing problems with _some_ old 
user agents (e.g., extreme rarities like browsers with a correct HTML 4.01 
parser). If your document is XHTML that does not follow the guidelines, it 
won't work (or it will work by accident only) on the most common browser. 
Anything that you might possibly gain by using XML-based HTML means a 
gross risk on the web.

Things would be different if a server could _know_, upon receiving a 
request from a client, whether the client wants HTML 4.01 or XHTML and the 
clients' requests would match their actual abilities. _This_ is the 
problem that should be solved first. The current techniques for browser 
sniffing, based on rejecting the information that IE sends in Accept 
headers and trying to recognize the _browser_, might work in the hands of 
educated authors, but they are surely not something that should be 
recommended to authors in general.
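
To make the point concrete, here is a minimal sketch of what a server could 
do _if_ Accept headers were reliable. The function names (parse_accept, 
choose_media_type) are made up for illustration, and the rule "serve XHTML 
only when application/xhtml+xml is listed explicitly and ranked at least as 
high as text/html" is just one assumed policy, not anything the 
specifications prescribe. Note that it does nothing to solve the actual 
problem: a client that sends only "*/*" tells the server nothing about what 
it can really process.

```python
def parse_accept(header):
    """Parse an Accept header value into a {media_type: q} mapping."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not acceptable"
        prefs[media_type] = q
    return prefs


def choose_media_type(accept_header):
    """Pick the media type to serve; fall back to text/html when in doubt."""
    prefs = parse_accept(accept_header or "")
    # Wildcards are deliberately ignored for XHTML: "*/*" is not evidence
    # that the client can actually process XML.
    xhtml_q = prefs.get("application/xhtml+xml", 0.0)
    html_q = prefs.get("text/html", prefs.get("text/*", prefs.get("*/*", 0.0)))
    if xhtml_q > 0 and xhtml_q >= html_q:
        return "application/xhtml+xml"
    return "text/html"


print(choose_media_type("*/*"))
print(choose_media_type("application/xhtml+xml,text/html;q=0.9"))
```

A server that varies its response this way should also send "Vary: Accept" 
so that caches keep the two variants apart.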

The whole appendix C should actually be replaced by the following:

"P.S. Please don't use XHTML as the format of your web pages yet."

> If you have content that uses 
> the features of XHTML described in the Appendix, using those features in the 
> manner described should give you the best success rate in the real world.

_Assuming_ you use XHTML in the "real world" (read: World Wide Web) in the 
first place. But why would you? There's a better success rate, with much 
less effort, when you use HTML 4.01. At present, and in the 
foreseeable future, XHTML 1.0 is just expensive liturgy.

> I have copied 
> www-html-editor on this so that it will get into the HTML Working Group issue 
> tracking system and addressed.

I'm posting this to the www-html list only, since I don't think my points 
are editorial.

Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/
Received on Tuesday, 21 November 2006 19:13:51 UTC