
[Bug 1500] XHTML-sent-as-text/html is parsed as XML

From: <bugzilla@wiggum.w3.org>
Date: Tue, 15 Aug 2006 05:30:01 +0000
To: www-validator-cvs@w3.org
Message-Id: <E1GCrV3-0002m3-PQ@wiggum.w3.org>


------- Comment #1 from ot@w3.org  2006-08-15 05:30 -------
(In reply to comment #0)
> According to the HTML WG, a UA is non-compliant if it handles an XHTML document
> sent as text/html as XHTML; such a UA must apparently handle the document as
> HTML regardless of what it looks like.

Could you be more precise about what you mean by "treat as HTML" in the
context of formal DTD validation, which is what the validator does?

> # [...] documents served as text/html should be treated
> # as HTML and not as XHTML.
>  -- http://lists.w3.org/Archives/Public/www-html/2000Sep/0024.html

A normative reference would be much appreciated. I don't think it's a good
idea to base the validator's behavior on a message posted to a W3C mailing list.

> The fact that the validator ignores this means that documents that don't comply
> to appendix C of XHTML 1.0 are being marked as valid when in fact they aren't
> conformant and won't be handled correctly.

Notwithstanding the fact that, last I checked, Appendix C of XHTML 1.0 was
an informative set of guidelines, checking documents against these guidelines
is the work of a full checker (one that goes beyond formal conformance), and
that is what is being developed at this point in time. The "unicorn" [1][2]
project has a plug-in that checks Appendix C rules.

> I would like to see the validator reject any XHTML-sent-as-text/html as being of
> the wrong MIME type.

I do not see a direct link between the rest of your comment and the conclusion
that XHTML-sent-as-text/html should simply be rejected. Are you
suggesting that it should be "treated as HTML", "checked against appendix C
rules", or "rejected"? Please clarify your request.

One additional possibility I see would be to add a warning to the validator
output whenever XHTML 1.0 is found served as text/html, and even that is
arguable, as I don't think such a warning is compatible with section 5.1 of the
XHTML 1.0 specification - http://www.w3.org/TR/xhtml1/#media (I admit to being
confused about why the normative section 5.1 of the spec seems to refer to the
informative appendix C, but that may just be me misunderstanding the
specification).

Until we have the unicorn tool ready for prime time, my proposed solution is
that whenever the validator finds a document with an XHTML 1.0 doctype served
as text/html, it adds a note to its output encouraging the author to check the
document against the Appendix C checker.

Would that be an acceptable solution?
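
For concreteness, the check proposed above might look roughly like the sketch
below. This is only an illustration in Python, not the validator's actual code:
the function names, the note wording, and the choice to scan only the first
1024 characters for the doctype are all assumptions made for the example.

```python
import re

# Public identifiers of the three XHTML 1.0 DTDs.
XHTML10_PUBLIC_ID = re.compile(
    r"-//W3C//DTD XHTML 1\.0 (Strict|Transitional|Frameset)//EN"
)


def media_type(content_type_header):
    """Return the bare media type from a Content-Type header value."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    return content_type_header.split(";", 1)[0].strip().lower()


def appendix_c_note(content_type_header, document_text):
    """Return a note if an XHTML 1.0 doctype is served as text/html.

    Hypothetical helper: returns None when no note applies.
    """
    if media_type(content_type_header) != "text/html":
        return None
    # Assumption: the doctype declaration appears near the start of the document.
    if not XHTML10_PUBLIC_ID.search(document_text[:1024]):
        return None
    return (
        "Note: this XHTML 1.0 document is served as text/html; "
        "consider checking it against the Appendix C guidelines."
    )
```

The note is purely additive: it does not change the validation result, which
keeps this separate from the question of rejecting such documents.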

Also, please feel free to send in test cases, as well as patch proposals, which
would help us handle your request quickly.

Thank you.
Received on Tuesday, 15 August 2006 05:30:13 UTC
