Re: Is this legal XHTML 1.1?

From: Christian Wolfgang Hujer <Christian.Hujer@itcqis.com>
Date: Sat, 14 Dec 2002 19:01:40 +0100
To: Ian Hickson <ian@hixie.ch>
Cc: Elliotte Rusty Harold <elharo@metalab.unc.edu>, "www-html@w3.org" <www-html@w3.org>
Message-Id: <200212141901.40291.Christian.Hujer@itcqis.com>

Hi Ian, dear list members,

On Friday, 13 December 2002 08:30, Ian Hickson wrote:
> > Yes, but *that* group usually does not even know of either W3C or XHTML.
> They may not know of either, but they are using XHTML because everyone
> else is using XHTML now. Authors write by copying and pasting.
> Just take a look at the insanely high percentage of new pages being
> written that have XHTML DOCTYPEs.
What about including a comment like "use a validator to ensure your page is 
correct: http://validator.w3.org/" right after the DOCTYPE? ;-)

> >>    Ergo these authors are not checking for validity.
> >
> > Yes. They should be taught validity first, then XHTML.
> Good luck with that.
Oh, you mean I should do that? No, I meant someone should do that. Not me ;-)
Joking aside, yes, I already do that sometimes.

But I've just found an example of how correct you are regarding the usual 
author. I took a look at the source code of http://www.microsoft.com/ and 
was totally shocked! Well, of course, that corporation has never been a good 
example of how to implement standards and norms correctly...

Then I took a look at:
http://www.microsoft.com/ Grande Catastroph!
http://www.netscape.com/ Not much better.
http://www.opera.com/ Validates as XHTML 1.0
http://www.mozilla.org/ Validates as HTML 4.01 (after setting char enc to 
US-ASCII, which is okay.)

> >>    Most authors (including me) occasionally include at least one error
> >> in their documents, making them ill-formed.
> >
> > Well, the only "error" I occasionally include is namespace declarations
> > that should not be there because some XSLT processors have some really
> > annoying (but still absolutely correct) behaviour about namespace
> > declarations.
> You never write documents that don't validate?
I do, of course; typos happen. But invalid documents won't even get into 
transformation, let alone upload... Each validation error is instantly 
reported by Ant / Xerces.

I did write non-validating documents some time ago.
For instance, my homepage, which was last modified in spring 2002, uses tr 
height="1%", which does not exist in XHTML 1.0 Strict. I will not change it 
because the complete site will be thrown away in the next few days anyway.

> > But usually all my XHTML documents are, and all my HTML documents were
> > valid. I check each document for validity *before* upload, even before
> > locally viewing them in the browsers.
> And you never find them invalid when you are writing them?
Nearly never.
It's like with a compiled language. Once you're used to it, the probability of 
making errors decreases rapidly.
And I check validity on all documents using Ant and Xerces or Crimson.
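
To make the "check before upload" step concrete, here is a minimal sketch in 
Python. It is not the Ant / Xerces setup described above: it only tests 
well-formedness, not validity against a DTD, and the function name is a 
hypothetical one chosen for illustration.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(source):
    """Return None if `source` parses as XML, else the parser's error message.

    Assumption: this only checks well-formedness. It does not load a DTD,
    so it cannot report validity errors the way a validating parser can.
    """
    try:
        ET.fromstring(source)
    except ET.ParseError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    good = '<html xmlns="http://www.w3.org/1999/xhtml"><body><p>ok</p></body></html>'
    bad = "<p>line one<br>line two</p>"  # <br> is never closed
    print(well_formedness_error(good))
    print(well_formedness_error(bad))
```

A build would refuse to transform or upload any file for which this returns 
a message; a validating parser is still needed to catch things like a stray 
height attribute.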

> >>    Ergo the authors that are a cross-section of both groups, and use
> >>    XHTML, are placing invalid XHTML documents on the web.
> > That's true.
> And that is the problem!

But the solution can't be to not use XHTML.
"Many, many people use XHTML in the wrong way, so no one may use XHTML [even 
those using it correctly]" - I don't agree with that.
What about "People use HTML in the wrong way, so no one may use HTML"...

> >> Since the two groups are huge proportions of the Web authoring
> >> community, as a quick perusal of XHTML sites will show, most XHTML
> >> documents on the web now are invalid.
> >>
> >> Where is the error?
> >
> > But that is not a reason not to use XHTML
> I'm not saying "don't use XHTML". I'm saying "don't send XHTML as
> text/html", for the good of the Web and for your future sanity.

But I will keep sending XHTML as application/xhtml+xml or text/html, depending 
on the browser, for a while.
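
The "depending on the browser" decision can be driven by the Accept request 
header. The following is only a sketch of that idea in Python, not the server 
configuration actually in use; the function names are hypothetical, and 
wildcards such as */* are deliberately not treated as support for 
application/xhtml+xml.

```python
XHTML_TYPE = "application/xhtml+xml"
HTML_TYPE = "text/html"


def accepts_xhtml(accept_header):
    """True if the Accept header names application/xhtml+xml with q > 0.

    Assumption: wildcards (*/*) do not count, because a user agent that
    sends only */* has not said it understands XHTML served as XML.
    """
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if parts[0].lower() != XHTML_TYPE:
            continue
        quality = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed q value as "not accepted"
        return quality > 0
    return False


def choose_content_type(accept_header):
    """Pick the MIME type to send for an XHTML document."""
    return XHTML_TYPE if accepts_xhtml(accept_header) else HTML_TYPE


if __name__ == "__main__":
    print(choose_content_type("text/xml,application/xhtml+xml,text/html;q=0.9"))
    print(choose_content_type("*/*"))
```

A response chosen this way should also carry a Vary: Accept header so that 
caches do not hand the XML variant to a tag soup browser.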

> >> This document isn't some sort of theoretical exercise. It is listing
> >> practical reasons why using text/html for XHTML is bad.
> >
> > It's bad in the cases described.
> Thank you.
You're welcome ;-)
I think a little flamewar is quite good to strengthen each other's arguments. 
It's not personal.
And yes, that's probably just a bad excuse from an arrogant guy who wrote first 
and thought afterwards about the use of the word "bad". I am _sorry_ for being 
so rude.

> > But what about a valid XHTML 1.1 document that displays fine even in
> > Netscape 4?
> Since no valid XHTML 1.1 document could ever be sent as text/html, that
> will never happen.
Oh, why can't a valid XHTML 1.1 doc be sent as text/html?
It's still a valid XHTML 1.1 document then, it's just not real text/html ;-)

> > Why shouldn't such a document be served as .xhtml with
> > application/xhtml+xml to Mozilla, Opera and all other browsers that
> > send an Accept header which contains application/xhtml+xml and .html
> > with text/html to the user agents that don't say they know
> > application/xhtml+xml like Internet Explorer? Those are tag soup
> > anyway, they "don't know the difference".
> That would be great, if people did it.
> They don't. (With the exception of maybe 3 or 4 sites.)
Well, I do.

And all the sites I'm responsible for will migrate to that behaviour before 
Christmas. That will already be more than 3 or 4 (but still less than 
0.0000001% of the Web, probably).

I even use real link elements, like <link rel="Alternate" rev="Alternate" 
href="index.de" hreflang="de" type="application/xhtml+xml" /> and <link 
title="Startseite" href="/de/haupt_home" hreflang="de" 
type="application/xhtml+xml" rel="Start" />

> >> One of those reasons is that if you ever switch your XHTML documents
> >> from text/html to text/xml, then you will in all likelihood end up with
> >> a considerable number of XML errors, meaning your content won't be
> >> readable by users. (Most XHTML documents do not validate.)
> >
> > See above. When I speak of XHTML I speak of validating XHTML.
> When I speak of XHTML I speak of anything with an XHTML DOCTYPE, valid or
> not. Shutting your eyes to the reality of the mess being created here does
> no-one any good.
Invalidity is sometimes not really a big problem. Add 
xmlns="http://www.w3.org/1999/xhtml" as an attribute to an element whose name 
is not html, and the document is invalid but causes no problem (XML 
Schema, I think, will avoid that kind of problem). Add topmargin="2" to the 
body element and the document is **** (insert your favorite rude four-letter 
word ;-). I'm much more concerned about well-formedness and logical 
correctness in that case. And a document that is not well-formed can never be 
an XHTML document, because it must be an XML document in the first place.

> > Of course, many problems will arise if they have not been considered
> > in the first place. HTML DOM seems not to apply to current
> > implementations of application/xhtml+xml user agents, so XML DOM must
> > be used. Some differences are between HTML CSS and XHTML CSS in
> > current User Agents.
> >
> > What if all this has been considered?
> Then you are one in about 5,000,000 people. (I think that number is
> actually accurate. By it I mean that you are one of about a hundred people
> world-wide who actually understand the problem and know how to avoid its
> pitfalls.)
Oh, thanks.

> If you are one of those few, very few, people, then sure, go ahead, use
> XHTML and send it as XML to some UAs and XHTML to others. But otherwise,
> you _will_ run into those many problems I listed, and you shouldn't be
> using XHTML as text/html.
> My document is aimed at the general authoring public, not the extremely
> rare uber-geeks of the Web world.
Okay. I should have considered that before; I was too narrow-minded. But on 
the other hand, I didn't want to shut my eyes to your document and say "that 
doesn't concern *me*, yeah, because I'm cool", I rather wanted to discuss 

> >> I don't know of _anyone_ who has switched or could switch from text/html
> >> to an XML MIME type without a single problem.
> >
> > I also had my problems, but I detected them *before* I published the
> > documents on the web.
> But that's not important. The point is you will hit problems, wherever you
> hit them, if you try to change to a correct MIME type. If you have
> 1,000,000 documents, as some big sites do, then that will cost real cash.
That's just a slight change in .htaccess, build.xml and transform.xslt :-)

> You are not a typical Web author.
That now is a real compliment :-)
Well, okay, and I admit I use vi improved as text editor on a Linux system 
configured to use UTF-8.

I'm convinced now.

> > I still can't see why I should not serve them as application/xhtml+xml to
> > application/xhtml+xml accepting ua's and text/html to the tag soup
> > browsers.
> _You_, specifically, are such an extreme exception that the document
> doesn't even attempt to apply to you. For the few people who understand
> enough of the issues to know to serve XHTML to only some UAs, I wish all
> the best of luck.
> But for the real world, the issues I raised are very real issues that
> should be valid reasons for not using XHTML with text/html.
> I'll update my document to make this clearer.
And you really made me think about using two transformations, one which 
creates XHTML 1.1 / application/xhtml+xml for Mozilla and Opera, and one 
which creates HTML 4.01 Strict / text/html for the tag soup chaos.

I thought it would take about 7 minutes per site to change this.

But then I just found a bug in Xalan. It won't transform to HTML (<xsl:output 
method="html"/>) when there's a namespace declaration in the source document 
(<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="de">). But in XHTML 
Basic, which I use as input, there automatically is a namespace declaration 
on the html element.
And switching between Xalan, Saxon etc. all the time isn't fun, since they 
handle namespace declarations very differently.
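
One processor-independent way around this is to drop the XHTML namespace 
before serializing as HTML. The sketch below illustrates that idea with 
Python's standard library instead of XSLT; it is an assumed workaround, not a 
fix for Xalan, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def xhtml_to_html(source):
    """Parse namespaced XHTML and serialize it as plain HTML.

    Assumption: only element names in the XHTML namespace are rewritten;
    elements in other namespaces and all attributes are left untouched.
    """
    root = ET.fromstring(source)
    prefix = "{" + XHTML_NS + "}"
    for element in root.iter():
        # ElementTree stores namespaced names as "{uri}local"; keep "local".
        if isinstance(element.tag, str) and element.tag.startswith(prefix):
            element.tag = element.tag[len(prefix):]
    # method="html" writes empty elements such as <br> without an end tag.
    return ET.tostring(root, encoding="unicode", method="html")


if __name__ == "__main__":
    doc = (
        '<html xmlns="http://www.w3.org/1999/xhtml">'
        "<head><title>Test</title></head>"
        "<body><p>one<br/>two</p></body></html>"
    )
    print(xhtml_to_html(doc))
```

In an XSLT pipeline the equivalent step would be a template that recreates 
each element by its local name, which avoids depending on how a given 
processor treats the namespace declaration on the html element.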

Christian Wolfgang Hujer
Managing Partner
Phone: +49  (0)89  27 37 04 37
Fax: +49  (0)89  27 37 04 39
E-Mail: Christian.Hujer@itcqis.com
WWW: http://www.itcqis.com/
Received on Saturday, 14 December 2002 13:01:08 UTC