- From: olivier Thereaux <ot@w3.org>
- Date: Fri, 13 Jun 2008 12:50:31 -0400
- To: W3C Validator Community <www-validator@w3.org>, www-html@w3.org
Dear all,

I am happy to announce the first release of a library/tool which helps web authors check for issues of compatibility with legacy HTML agents.

This piece of software is based on a proof-of-concept CGI script built by Bjoern Hoehrmann a couple of years ago, then improved by the qa-dev team. I finally took some time to clean it up, modularize it, add a test suite and release it this week.

At the core of this tool is a Perl library, based on an XML parser, which can inspect any XHTML document and report issues that may arise if the XHTML is fed to legacy HTML agents.

The library (and its documentation) is publicly available at:
  http://search.cpan.org/dist/W3C-XHTML-HTMLCompatChecker/
and source at:
  http://dev.w3.org/cvsweb/perl/modules/W3C/XHTML/HTMLCompatChecker/

The library also comes with a simple command-line/CGI script, which currently outputs either XHTML or a home-grown XML format. If there is some demand, adding a plain text output for command-line use would be trivial.

For a simple demo, see:
  http://qa-dev.w3.org/appc/
also:
  % appCcheck.pl uri=http://www.w3.org/QA/
  ...
  <p>No issue found in this document. Congratulations.</p>
  ...

The tool has two modes: one where it will only check XHTML 1.0 documents served as text/html (and ignore anything else), and another where it can check any kind of XHTML for compatibility, regardless of doctype and media type.

One of the ideas behind releasing such a library is to use it as a component in the W3C Markup Validator (as part of a deliberate strategy to make that tool less of a formal validator, and more useful for Web authors).

I would welcome opinions on how to best integrate the "html compatibility checks" in the validator, given that:

* the HTML compatibility guidelines are informative.
  http://www.w3.org/TR/xhtml1/#guidelines
  I have long been confused by the fact that this (informative) appendix is referred to in a normative part of the XHTML 1.0 spec (http://www.w3.org/TR/xhtml1/#media ) but have been told by Steven Pemberton (not on the public record, but that can be fixed here and now) that it was a mistake.

* Due to the lack of support for the “proper” media type for XHTML (application/xhtml+xml) in the Internet Explorer family so far, XHTML is mainly served "as HTML" on the web today, and thus parsed as if it were HTML (and not XHTML) by most UAs. A lot of web authors also don't have any control over their web server, and would not be able to serve their content as application/xhtml+xml even if they wanted to.

* The compatibility guidelines were designed as a "transition" mechanism for XHTML 1.0 only. However, a lot of authors have been using the "text/html" media type for any kind of XHTML, and there have been some discussions within the XHTML working group to update the message to "any HTML-compatible XHTML content MAY be served as text/html". See e.g. the *work in progress draft* at
  http://www.w3.org/MarkUp/2008/ED-xhtmlmime-20080423/

I wonder if the validator could:

Q1: when
  Q1-1) check for HTML compatibility guidelines only for XHTML 1.0 content, served as text/html
  Q1-2) check for HTML compatibility guidelines for any XHTML served as text/html
  Q1-3) check for HTML compatibility guidelines for any XHTML regardless of media type.

Q2: how
  Q2-1) check for HTML compatibility guidelines, and mark issues found as errors
  Q2-2) check for HTML compatibility guidelines, and mark issues found as warnings
  Q2-3) check for HTML compatibility guidelines, and mark issues found as info only
  Q2-4) check for HTML compatibility guidelines. Identify the most problematic issues, mark them as warnings, and mark the rest as info only.
  Q2-5) check for HTML compatibility guidelines as an option, ON by default
  Q2-6) check for HTML compatibility guidelines as an option, OFF by default

Given the above considerations, my preference currently hovers around Q1-3) and Q2-3). I think that if the validator mentions HTML compatibility issues as "info", and does so for any XHTML content, it would probably benefit a lot of people, while avoiding getting some people angry because the validator dared output a warning about a once-spotlessly-validated page.

Thoughts on this tool and how we could best integrate it would be welcome, in particular from members of the XHTML and HTML working groups.

Thank you,
olivier
-- 
olivier Thereaux - W3C - http://www.w3.org/People/olivier
W3C Open Source Software : http://www.w3.org/Status
Received on Friday, 13 June 2008 16:51:09 UTC