RE: HTML or XHTML - why do you use it? from Ian Hickson on 2003-01-08 (www-html@w3.org from January 2003)

From: Ian Hickson <ian@hixie.ch>
Date: Wed, 8 Jan 2003 00:57:57 +0000 (GMT)
To: "Peter Foti (PeterF)" <PeterF@SystolicNetworks.com>
Cc: "'www-html@w3.org'" <www-html@w3.org>
Message-ID: <Pine.LNX.4.21.0301080001310.25702-100000@dhalsim.dreamhost.com>
On Tue, 7 Jan 2003, Peter Foti (PeterF) wrote:
> > > 
> > > As the XHTML recommendation stated, XHTML documents are intended to
> > > operate in HTML 4 conforming agents.
> > 
> > This isn't quite accurate -- XHTML documents (or rather, Appendix C
> > compliant XHTML 1.0 documents) are intended to operate in HTML Tag
> > Soup parsers. Strictly speaking, a compliant implementation of HTML
> > 4.01 would be well within its rights to totally reject an XHTML
> > document, since XHTML documents are not valid HTML 4.01.
> 
> Perhaps.  But considering the leniency that all current agents seem to
> offer, this is somewhat of a non-issue, is it not?

I was just correcting your statement.


> Are there any browsers today that would reject non-valid HTML 4.01?
> I don't believe there are.

I know of one (the validator).


> > It's not -- XML doesn't have any content model which allows comment-
> > like markup to be ignored. Don't forget that in XML, parsers should get
> > the same result whether or not they parse the DTD (with a few
> > exceptions related to attributes and entities).
> 
> I see.  However, since we are talking about parsing XHTML as HTML, I don't
> think this matters because the agent will still treat it as an HTML comment.

Eh?

The problem is that the following string:

   <script> <!--
     work();
   // --> </script>

...will be treated differently depending on whether it is supposed to be
HTML or whether it is supposed to be XHTML.

So when a document containing the above has its MIME type changed from
text/html to application/xhtml+xml, it'll break.
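
To make the difference concrete, here is a minimal sketch using Python's
standard library parsers as stand-ins for a tag soup HTML parser and an XML
parser. That substitution is an assumption for illustration only; real UAs
differ in detail, and the helper names (ScriptText, script_text_as_html,
script_text_as_xml) are hypothetical. It extracts what each parser would
hand to the script engine for the string above.

```python
from html.parser import HTMLParser
from xml.dom import minidom

# The markup from the message, as a single string.
SNIPPET = "<script> <!--\n  work();\n// --> </script>"


class ScriptText(HTMLParser):
    """Collects the character data found inside <script> elements."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        if self.in_script:
            self.chunks.append(data)


def script_text_as_html(markup):
    """HTML rules: <script> content is raw text, so '<!--' is just data."""
    parser = ScriptText()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)


def script_text_as_xml(markup):
    """XML rules: '<!-- ... -->' is a real comment node, not element text."""
    root = minidom.parseString(markup).documentElement
    return "".join(
        node.data for node in root.childNodes
        if node.nodeType == node.TEXT_NODE
    )


print(repr(script_text_as_html(SNIPPET)))  # still contains work();
print(repr(script_text_as_xml(SNIPPET)))   # only whitespace is left
```

Under the HTML rules the call to work() reaches the script engine; under
the XML rules it is swallowed by the comment and the script is empty.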


> But do any agents support the SHORTTAG feature?

Emacs/W3, I think. There was talk of implementing it in Mozilla, too.


> I'll take that as a compliment then. :)  But don't you think the focus
> should be on improving the quality of the existing developers rather
> than saying "Existing developers are too stupid to use XHTML, so they
> shouldn't"?

I think everyone should use XHTML. But ONLY if they use the
application/xhtml+xml MIME type.


> > Why not just use HTML?
> 
> Because I want the benefits of using XML tools and validators.  Not to
> mention the experience of writing valid XML.

What about the benefits of SGML tools and validators, not to mention the
experience of writing valid SGML?

I agree, in the long run, XHTML-as-XML is better. In the short run,
though, we're simply not there. (Largely because of the IEs.)


> Ok, I'll admit you are right here.  Eventually, if one intends to move from
> serving HTML documents to XML documents, this problem will arise.

And that is pretty much my only point. :-)


> > > If they did, then the XML tool would have to guess where elements
> > > ended if they re-opened the generated HTML file.
> > 
> > So why not use the SGML tools that have existed since before XML was
> > even an inkling in anyone's eye?
> 
> Because they are not as strict as XML tools and can produce sloppy code?

No such thing as sloppy SGML code. It's either valid or it isn't.

In fact, XML _introduced_ the idea of sloppy code (well formed but not
valid markup is purely an XML concept).


> > What other advantages are there?
> 
> Besides being able to use XML tools, it also gives authors experience
> writing *better* documents that are more structured.

There is a direct one to one mapping of canonical valid XML documents to
canonical valid SGML documents, so they can't be more structured.


> I guess my argument is that developers should be trained to use XHTML
> *correctly*, and your argument seems to be that not enough people use
> XHTML correctly so therefore those people should not use it at all.

Who are you proposing do this training?

The only way I can see of training people to use XHTML is to make the UAs
_require_ it to be well formed. That is what I've been personally working
on making happen with, e.g., my QA work on Mozilla. However, in the
meantime, until we get decent support for XML in the market, there's no-one
doing the training.


> > If the document validates, there is no ambiguity about where the
> > elements end. It is fully defined.
> > 
> > For example:
> > 
> >    <p>Test<ol><li></ol>
> > 
> > ...is _exactly_ equivalent to:
> > 
> >    <p>Test</p><ol><li></li></ol>
> > 
> > ...and all UAs support this correctly as far as my testing has shown.
> 
> That would be nice... but Netscape 4 has proven you wrong. :) 

Netscape 4 gets perfectly well formed markup wrong as well, so it really
isn't a good example.


> > Basically, my argument is that if you know what you're doing, then
> > sure, go ahead, but that most people don't, and that for them it would
> > be a lot easier if they used HTML 4.01 now and thus were never tempted
> > to convert these documents to an XML MIME type.
> 
> You don't think it would be better for those people to simply learn XHTML?

Get real, who is going to teach them?

It's like crime -- sure, I would rather teach everyone to be nice to each
other, but in the meantime, we still need car alarms.

-- 
Ian Hickson                                      )\._.,--....,'``.    fL
"meow"                                          /,   _.. \   _\  ;`._ ,.
http://index.hixie.ch/                         `._.-(,_..'--(,_..'`-.;.'
Received on Tuesday, 7 January 2003 19:58:05 UTC