Re: Mimetype: application/xhtml+xml -- add to validator? from William F. Hammond on 2001-05-29 (www-html@w3.org from May 2001)

From: William F. Hammond <hammond@csc.albany.edu>
Date: Tue, 29 May 2001 10:39:05 -0400 (EDT)
To: www-html@w3.org
Message-Id: <200105291439.f4TEd5w19368@pluto.math.albany.edu>
BTW why didn't the Baker draft propose text/xhtml+xml or perhaps
propose both text/xhtml+xml and application/xhtml+xml ?  Given the
nature of XHTML it seems to me that text/xhtml+xml would be more
consistent with the distinction made between text/xml and
application/xml in RFC 3023 (Murata, St.Laurent, Kohn: XML Media
Types).

Terje Bless <link@tss.no> writes:

> At the moment, XHTML does not exist as far as MIME is concerned, except
> insofar as it conforms to the backwards compatibility guidelines; in which
> case it should be labelled as "text/html" and validate as such.

          ^^^^^^

Let's be clear that in the XHTML 1.0 recommendation
            http://www.w3.org/TR/2000/REC-xhtml1-20000126
the verb is "may", and indeed it must be "may" if the referenced
Appendix C (containing guidelines for authors) is informative rather
than normative.

Nonetheless, as a guideline for a validator, "should" is probably
correct.

In fact, given that a non-cheating validator needs to perform
preliminary SGML declaration triage as a function of the declared
document type on something served as text/html, there is absolutely no
reason why a validator cannot smoothly handle XHTML, even Murray
Altheim's "XHTML 1.1 plus MathML 2.0", through text/html without
missing a beat.

The real question is whether we can get massive user agents claiming
to understand both "DTD HTML" and "DTD XHTML" to perform appropriate
preliminary parsing triage.

If that can be achieved, then there are a number of advantages to be
gained in making it possible for XHTML in general to be served either
as text/html or as text/xml or as one of a number of specialized
types such as application/xhtml+xml.

The HTML WG could help here by tightening what needs to be at
the top of an XHTML document when served as text/html.  (However,
section 5 should still say "may", and Appendix C should remain
informative, directed toward authors.)

For example, the *first non-blank line* should match one of the
following regular expressions:

A)  '^ *<\?xml version='

B)  '<!DOCTYPE html .*//DTD XHTML'

C)  '<html xmlns='

For text/html, if the first non-blank line does not match one of
these, the user agent should assume classical HTML.  If it does match
one of these, and the document is not conforming XML that makes sense
as XHTML, the user agent should refuse it.
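
To make the proposed triage concrete, here is a minimal sketch in
Python.  It is only an illustration under stated assumptions: the
function names and the three return values are hypothetical, and
"conforming XML that makes sense as XHTML" is approximated by a
well-formedness parse plus a check that the root element is html in
the XHTML namespace.  A real user agent or validator would need a
stricter check than that.

```python
import re
import xml.etree.ElementTree as ET

# The three expressions A), B), C) from the message, used as written.
XHTML_HINTS = [
    re.compile(r'^ *<\?xml version='),
    re.compile(r'<!DOCTYPE html .*//DTD XHTML'),
    re.compile(r'<html xmlns='),
]

# Assumption: the root of an XHTML document is <html> in this namespace.
XHTML_ROOT = '{http://www.w3.org/1999/xhtml}html'


def first_non_blank_line(text):
    """Return the first line containing a non-whitespace character."""
    for line in text.splitlines():
        if line.strip():
            return line
    return ''


def triage_text_html(text):
    """Classify a body served as text/html (hypothetical helper).

    Returns 'html' for classical HTML, 'xhtml' when the body looks like
    XHTML and parses as XML, and 'refuse' when it claims XHTML but fails.
    """
    line = first_non_blank_line(text)
    if not any(p.search(line) for p in XHTML_HINTS):
        return 'html'
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        # Claims to be XHTML but is not well-formed XML.
        return 'refuse'
    # Rough stand-in for "makes sense as XHTML": check the root element.
    return 'xhtml' if root.tag == XHTML_ROOT else 'refuse'
```

A document whose first non-blank line is an HTML 4.01 doctype falls
through to 'html' and is never handed to the XML parser, which is the
point of doing the triage on the first line only.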

(I say this after observing XHTML plus Math documents in older
browsers.  Yes, only the stripped content of the math tags is seen,
and the math does not make sense.  Everything else is fine.  In fact,
it is much better than the situation with character sets that are not
locally available.  So the user who cares will do something about it,
and the web will move forward.)

                                    -- Bill


William F. Hammond                   Dept. of Mathematics & Statistics
518-442-4625                                  The University at Albany
hammond@math.albany.edu                      Albany, NY 12222 (U.S.A.)
http://www.albany.edu/~hammond/                Dept. FAX: 518-442-4731

Never trust an SGML/XML vendor whose web page is not valid HTML.
And always support affirmative action on behalf of the finite places.
Received on Tuesday, 29 May 2001 10:39:45 UTC