- From: Bjoern Hoehrmann <derhoermi@gmx.net>
- Date: Mon, 12 Jul 2004 06:14:18 +0200
- To: www-html-editor@w3.org
Dear HTML Working Group,

RFC 2854 states in section 2,

[...]
  The text/html media type is now defined by W3C Recommendations; the latest published version is [HTML401]. In addition, [XHTML1] defines a profile of use of XHTML which is compatible with HTML 4.01 and which may also be labeled as text/html.
[...]

Section 5.1 of the XHTML 1.0 Second Edition Recommendation states:

[...]
  XHTML Documents which follow the guidelines set forth in Appendix C, "HTML Compatibility Guidelines" may be labeled with the Internet Media Type "text/html" [RFC2854], as they are compatible with most HTML browsers.
[...]

Section 3.1 of the XHTML Media Types Note states:

[...]
  [XHTML1], Appendix C "HTML Compatibility Guidelines" summarizes design guidelines for authors who wish their XHTML documents to render on existing HTML user agents. The use of 'text/html' for XHTML SHOULD be limited for the purpose of rendering on existing HTML user agents, and SHOULD be limited to [XHTML1] documents which follow the HTML Compatibility Guidelines.
[...]

So it seems crystal clear to me that Appendix C of the XHTML 1.0 Second Edition Recommendation defines clear conformance criteria for data objects, which I would expect to be reliably machine-testable. It turns out, however, that a number of sections of this appendix do not deal with such conformance criteria, starting with Appendix C.1:

[...]
  Be aware that processing instructions are rendered on some user agents. Also, some user agents interpret the XML declaration to mean that the document is unrecognized XML rather than HTML, and therefore may not render the document as expected. For compatibility with these types of legacy browsers, you may want to avoid using processing instructions and XML declarations. Remember, however, that when the XML declaration is not included in a document, the document can only use the default character encodings UTF-8 or UTF-16.
[...]
These appear to be at best criteria for authors, i.e., only authors aware of this problem may deliver XHTML documents to legacy user agents. So it seems I might misunderstand the purpose of the Appendix and of all the documents that refer to it, which seems a bit odd.

Ignoring the sections that seem misplaced, the remaining sections are often not clear about what the actual requirements are, or what the exact requirement level is. Some sections use RFC 2119 keywords such as SHOULD and MUST, while others use loose imperative statements such as "avoid". It is not clear to me how to map these statements into a precise error report, i.e., which of them map to clear errors, which to warnings, and which to something looser such as an informational hint. It also seems inconsistent that you reference the appendix as defining a profile, and yet state that the appendix is informative.

It also seems that many requirements are missing from this "profile". For example, XHTML documents that use an internal subset will most likely break in a legacy user agent, as it would show the end delimiter ]> as textual content rather than hide it, as I think would be required of both compliant HTML and XHTML user agents. So I am not even sure what the actual scope of the Appendix is, and hence cannot correct such flaws myself, if there is actually anything wrong with the Appendix omitting such issues.

So it seems close to impossible to write a good software tool that checks whether a data object meets the constraints "defined" in that "profile". Such a tool is, however, an often requested feature for the W3C MarkUp Validator, as it, at least apparently, concerns the compliance of documents. There is even special interest among authors who wish to make their content accessible.
One problem here that gets more common every day is that documents are created using XML tools that produce things like <a name="x" id="x" />. Visual inspection does not necessarily reveal any difference, but A11y tools that rely on the internal document object model representation of the document will likely notice it, as it would likely break the document to some extent. See e.g. the Usenet discussion around http://groups.google.com/groups?selm=7n13605024figsokutl2qdsncpdfbk2g3a@4ax.com where in http://groups.google.com/groups?selm=40649cb1.16491112@news.individual.net we were able to identify the actual issue, after quite some effort that was necessary due to the lack of a tool that properly checks for such problems.

I do not want to rely on wild guesses to write such a tool and later waste time fixing problems I introduced for this reason, and take the blame for it once you clarify the unclear parts. Hence I chose for the moment to wait until you clarify the issues I have raised in the XHTML 1.0 Errata and a later XHTML 1.0 Third Edition. I hope this will happen soon.

regards.
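
To make the difficulty concrete, here is a rough sketch of what such a checker might look like for the three cases mentioned above (XML declaration and processing instructions, an internal subset, and minimized syntax on a non-empty element). Everything in it is an assumption for illustration: the name check_appendix_c, the Finding type, the regular expressions, and above all the "warning"/"info" severities, which are exactly the kind of guess the Appendix currently forces on a tool author. It scans the raw text instead of using an XML parser, since a parser would not distinguish <a /> from <a></a>, and it makes no attempt to skip comments or CDATA sections.

```python
import re
from typing import List, NamedTuple


class Finding(NamedTuple):
    severity: str  # assumed mapping; Appendix C does not define severities
    message: str


# Elements declared EMPTY in the XHTML 1.0 DTDs; "<x />" is expected for these.
EMPTY_ELEMENTS = {
    "area", "base", "basefont", "br", "col", "frame", "hr",
    "img", "input", "isindex", "link", "meta", "param",
}

# "<?target": target "xml" is the XML declaration, anything else is a PI.
PI = re.compile(r"<\?([A-Za-z_][\w.\-]*)")
# A "[" before the first ">" of the DOCTYPE suggests an internal subset.
INTERNAL_SUBSET = re.compile(r"<!DOCTYPE[^>\[]*\[")
# A start tag written in minimized form, e.g. <a name="x" id="x" />.
MINIMIZED = re.compile(r"<([A-Za-z][A-Za-z0-9]*)(?:\s[^<>]*)?/>")


def check_appendix_c(text: str) -> List[Finding]:
    """Heuristic scan of raw XHTML text; not a conformance checker."""
    findings = []
    for m in PI.finditer(text):
        if m.group(1) == "xml":
            findings.append(Finding("info", "XML declaration present (C.1)"))
        else:
            findings.append(Finding(
                "info", "processing instruction <?%s (C.1)" % m.group(1)))
    if INTERNAL_SUBSET.search(text):
        # Not covered by Appendix C at all; severity is a pure guess.
        findings.append(Finding(
            "warning", "internal subset: ]> may be shown as text"))
    for m in MINIMIZED.finditer(text):
        name = m.group(1).lower()
        if name not in EMPTY_ELEMENTS:
            findings.append(Finding(
                "warning", "minimized syntax on non-empty element <%s />" % name))
    return findings
```

Calling check_appendix_c on a document's source text returns a list of findings. Whether any of them should be reported as an error, a warning, or a mere hint is the open question; the severities above are placeholders only.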
Received on Monday, 12 July 2004 00:15:02 UTC