Re: text/html for xml extensions of XHTML from Ian Hickson on 2001-05-04 (www-talk@w3.org from May to June 2001)

From: Ian Hickson <ian@hixie.ch>
Date: Thu, 3 May 2001 20:33:00 -0700 (Pacific Daylight Time)
To: Ian Hutchinson <hutch@psfc.mit.edu>
cc: <mozilla-mathml@mozilla.org>, <www-talk@w3.org>
Message-ID: <Pine.WNT.4.31.0105032015210.988-100000@HIXIE.netscape.com>

On Fri, 4 May 2001, Ian Hutchinson wrote:
>
> Let's try to get some facts into this discussion.
>
> Fire up mozilla 0.8.1 and visit the URL(s)
> http://hutchinson.belmont.ma.us/tth/htmltab.html and all permutations
> replacing "html" with "xml". The document codifies the results. What this
> test shows is:
>
> 1. Mozilla already routinely DOES snooping in the document header, notably
> the DOCTYPE, that changes its rendering, when the document is served as
> HTML. This fact renders Hickson's many remarks about the excessive
> computational cost of snooping irrelevant (to put it charitably).

Wrong type of snooping. First, snooping to decide *rendering* is indeed
easy and relatively cheap, whereas snooping to decide *parsing* is a much
more sensitive issue. Second, the parsing Mozilla does to sniff for the
rendering mode is remarkably simple [1]. Apart from the idea of a magic
comment header and the idea of using the XML PI, both of which have
problems as I have remarked recently on this list, all other types of
sniffing that have been suggested are extremely involved.

> 2. When Mozilla receives a document served as XML its behaviour does not
> seem to depend on the DOCTYPE.

Indeed. Mozilla always handles XML in "Standard" mode.

> [3. Both Mozilla, when rendering a document it takes to be XML, and Amaya
> have broken table renderers.]

I didn't test Amaya. Mozilla, however, only renders *correctly* in
"Standard" mode (when the document is sent as text/xml, or sent with an
HTML 4.01 Transitional DOCTYPE that includes a URI). The results you are
expecting are very likely not what the spec says, because that document
uses "colspan=0", which older browsers do not handle correctly.

> I assume that conclusion 1 above shows that it ought to be fairly trivial
> for Mozilla to implement the detection of XML documents served as HTML on
> the basis of their DOCTYPE, and enable the MathML parser for them.

DOCTYPEs are optional for well-formed XHTML documents. New DOCTYPEs get
added over time. DOCTYPE parsing is hard. DOCTYPEs may be hidden in
comments. DOCTYPE sniffing has been called harmful by many leading figures
at the W3C and elsewhere.
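
To make the first and fourth points concrete, here is a minimal Python
sketch. It is not Mozilla's code: `naive_has_xhtml_doctype` is a
hypothetical helper standing in for a cheap substring-style sniffer, and
both sample documents are invented for illustration.

```python
def naive_has_xhtml_doctype(head: str) -> bool:
    """Hypothetical cheap sniffer: substring match for an XHTML DOCTYPE."""
    return '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML' in head


# Invented sample 1: the DOCTYPE is commented out, so the document should
# NOT be treated as XHTML, but a substring match cannot tell.
commented = (
    '<!-- <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> -->\n'
    "<html><body>plain old HTML</body></html>"
)

# Invented sample 2: well-formed XHTML with no DOCTYPE at all.
no_doctype = '<html xmlns="http://www.w3.org/1999/xhtml"><body/></html>'

fooled = naive_has_xhtml_doctype(commented)   # matches inside the comment
missed = naive_has_xhtml_doctype(no_doctype)  # nothing to match
```

Getting both cases right would presumably mean tracking comments and
looking beyond the DOCTYPE, at which point the sniffer is no longer cheap.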

Overall, DOCTYPE sniffing is a poor solution to a problem which has
already been solved by a new MIME type.
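
For comparison, a rough sketch of choosing the parser from the declared
media type alone, without looking at the content. The function name, the
returned labels, and the exact set of type names are assumptions made for
illustration; this message does not name the new type.

```python
# Assumed set of XML media types, for illustration only.
XML_TYPES = {"text/xml", "application/xml", "application/xhtml+xml"}


def choose_parser(content_type: str) -> str:
    """Pick a parser from the Content-Type header value alone."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in XML_TYPES or media_type.endswith("+xml"):
        return "xml"
    if media_type == "text/html":
        return "html"
    return "unknown"


parser_for_xml = choose_parser("text/xml; charset=utf-8")
parser_for_html = choose_parser("Text/HTML")
```

The decision is made before a single byte of the document is read, so no
content sniffing is involved.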

-- Footnotes --

[1] And broken. One of the reasons I am so against using yet more sniffing
is that implementing "quirks mode" vs "standard mode" sniffing based on
the content of documents has proved to be hard, unreliable, and has caused
numerous problems of its own. The only reason Mozilla still has it is to
support the large quantity of legacy content that depends on non-standard
behaviour. With XML, since it is a new technology, there should be no
reason to support non-standard usage.

-- 
Ian Hickson                                            )\     _. - ._.)   fL
Invited Expert, CSS Working Group                     /. `- '  (  `--'
The views expressed in this message are strictly      `- , ) -  > ) \
personal and not those of Netscape or Mozilla. ________ (.' \) (.' -' ______

Received on Thursday, 3 May 2001 23:30:20 UTC