- From: Benjamin Hawkes-Lewis <bhawkeslewis@googlemail.com>
- Date: Thu, 23 Nov 2006 11:43:25 +0000
- To: www-html <www-html@w3.org>
- Cc: Shane McCarron <shane@aptest.com>, Tina Holmboe <tina@greytower.net>
On Tue, 2006-11-21 at 14:02 -0600, Shane McCarron asserted:

> If a user agent claims to support application/xhtml+xml then you
> SHOULD send your XHTML 1 document using that media type, and all will
> be well.

What do you mean "and all will be well"?

Internet Explorer accepts application/xhtml+xml but (as Tina Holmboe hints) usually offers to download such documents instead of rendering them. Lynx, ELinks, Konqueror, and even Emacs/W3 (with Raman's patch) accept application/xhtml+xml, but their XHTML handling is only a broken variation of their HTML handling. Safari has always accepted application/xhtml+xml, but its support was considered dangerously buggy until more recent WebKit builds. Mozilla accepts application/xhtml+xml but "incremental loading of XML documents has not been implemented" so there is no "incremental display":

http://www.mozilla.org/docs/web-developer/faq.html#accept

Your choice of language oversimplifies the complexities of the Accept header. The acceptance of a type via the Accept header need not indicate "support" (in the sense of ability to render), only "acceptance" for download:

http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.1

HTTP 1.1 seems to offer no obvious way to make a distinction between "accepting" a media type for:

1) Direct use (e.g. text/html)
2) Indirect use via a plugin (e.g. video/quicktime)
3) Opening by the user agent in an external program
4) Opening by the user themselves in an external program
5) Download for eventual use on another system altogether

Purposes 1 to 3 (at least) are given implicit sanction by the specification:

> A user agent might be provided with a default set of quality values
> for certain media ranges. However, unless the user agent is a closed
> system which cannot interact with other rendering agents, this default
> set ought to be configurable by the user.
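For readers unfamiliar with the quality values that passage refers to, here is a minimal sketch (mine, not anything from the specification or from a real server) of how they might be read off an Accept header. The function names parse_accept and quality are hypothetical, and the handling of a malformed q is an assumption. The precedence rule (the most specific matching media range wins, and a missing q defaults to 1) follows RFC 2616 section 14.1; media-type parameters other than q are ignored for brevity.

```python
def parse_accept(header):
    """Parse an Accept header value into a list of (media_range, q) pairs."""
    ranges = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        parts = [p.strip() for p in item.split(";")]
        media_range = parts[0].lower()
        q = 1.0  # RFC 2616: a media range without a q parameter has q=1
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value.strip())
                except ValueError:
                    q = 0.0  # assumption: treat an unparsable q as "not acceptable"
                break
        ranges.append((media_range, q))
    return ranges


def quality(media_type, ranges):
    """Return the q that applies to media_type; the most specific range wins."""
    media_type = media_type.lower()
    type_ = media_type.partition("/")[0]
    best_specificity, best_q = -1, 0.0  # no matching range means q=0
    for media_range, q in ranges:
        r_type, _, r_subtype = media_range.partition("/")
        if media_range == media_type:
            specificity = 2  # exact match, e.g. text/html
        elif r_type == type_ and r_subtype == "*":
            specificity = 1  # type wildcard, e.g. text/*
        elif media_range == "*/*":
            specificity = 0  # full wildcard
        else:
            continue
        if specificity > best_specificity:
            best_specificity, best_q = specificity, q
    return best_q
```

With a header such as "text/html;q=0.9, application/xhtml+xml, */*;q=0.5" (an illustrative value), quality() yields 1.0 for application/xhtml+xml, 0.9 for text/html, and 0.5 for anything else. Note that the numbers only rank acceptability; they say nothing about which of the five purposes above the user agent has in mind.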
Purpose 1 involves only a few types, purpose 2 significantly expands the range, purposes 3 and 4 include a wide variety of types, and purpose 5 involves a potentially infinite number of types. Lynx and Links ignored purpose 5 and attempted to actually list the types supported by the user's system. Whenever this list approached accuracy, it became extremely long, prompting complaints from users about wasted bandwidth, privacy intrusions, and servers at Dogpile and Google rejecting GET requests for fear of buffer overrun attacks:

http://lists.gnu.org/archive/html/lynx-dev/1997-09/msg00668.html
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=41594
http://linuxfromscratch.org/pipermail/links-list/2001-December/001589.html
http://lists.gnu.org/archive/html/lynx-dev/2004-05/msg00019.html
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=254515

Indeed these problems were foreseen by the original specification:

http://www.w3.org/Protocols/rfc2616/rfc2616-sec12.html#sec12

In practice, therefore, purposes 3 and 4 also require most browsers to "accept" all types. Certainly, for purposes 3 to 5, it is appropriate for most browsers to accept application/xhtml+xml.

But the specification also encourages user agents to specify a q (quality) parameter for different media types in order to express their preferences. By (probably correctly) prioritizing purposes 1 and 2 over 3, and 1, 2, and 3 over 4 and 5, sensible browsers use such expressions of preference to send a /hint/ about support to servers. I realize you probably had this hint in mind when you mentioned "support", but I think it's worth clarifying anyway.

Internet Explorer isn't wrong to accept application/xhtml+xml with its requests. What /is/ astonishingly stupid is that it expresses no preference for text/html or against application/xhtml+xml using the q parameter.
Instead we get:

> Accept: */*

The IE Team are aware that this complicates content negotiation:

http://blogs.msdn.com/ie/archive/2005/04/27/412813.aspx#412893
http://blogs.msdn.com/ie/archive/2005/09/15/467901.aspx#468070
http://www.microsoft.com/windowsxp/expertzone/chats/transcripts/06_1012_ez_ie.mspx

But the best they can say is that it "might" be addressed in Internet Explorer 8:

http://blogs.msdn.com/ie/archive/2006/10/17/accept-language-header-for-internet-explorer-7.aspx#841795

Firefox's Accept header appears saner:

> text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5

But the message is more subtle than it first appears, since the preference for application/xhtml+xml was only added "in order to enable the serving of MathML to both Mozilla and IE with Apache without scripting back when the MathPlayer plug-in for IE did not handle application/xhtml+xml":

http://www.mozilla.org/docs/web-developer/faq.html#accept

For a time, WebKit's developers copied Internet Explorer's Accept header; now they copy Mozilla's.

You talk in terms of browsers that "lie" and Tina Holmboe of whether we can "believe" the Accept header. While we should treat deceptive headers as browser bugs and pressurize developers to improve them, it might be more helpful in the present circumstances to think of reading the Accept header as an art of informed interpretation rather than in terms of a binary opposition of faith or scepticism. Recall that server-driven content negotiation is only a "best guess":

"an origin server is not limited to these dimensions [information provided by the Accept, Accept-Charset, Accept-Encoding, Accept-Language, and User-Agent headers] and MAY vary the response based on any aspect of the request, including information outside the request-header fields or within extension header fields not defined by this specification."
http://www.w3.org/Protocols/rfc2616/rfc2616-sec12.html#sec12

Shane McCarron continued:

> If a user agent only claims to support text/html, then you SHOULD send
> your document using that media type. If your document is written in
> XHTML 1.0 and follows the guidelines in Appendix C, you can do this
> with the same document and you will be largely successful. In fact,
> and without any evidence to back this up, I would bet that such a
> document is almost exactly as likely to render correctly as if you
> sent it with the HTML 4.01 DOCTYPE.

Do you include or exclude styling added by CSS and behaviours added by scripts when you talk of success and rendering? What sort of breakage is covered by the words "largely" and "almost"? Also, how much would you like to bet? ;)

More seriously, /which/ guidelines need to be followed for such success? If we follow C.14 and use an xml-stylesheet processing instruction, then serve that document as text/html, Internet Explorer 7 renders in broken (quirks) mode not standards mode. Here's the actual example document from C.14:

http://www.benjaminhawkeslewis.com/www/web-design/c14-test.html

You can test which rendering mode Internet Explorer 7 is using by entering:

> javascript:alert(document.compatMode);

into the address bar, as described at:

http://css-discuss.incutio.com/?page=RenderingMode

> I surely hope not, but if there are.... they deserve what they get.

This may be a comfortable attitude for a W3C specification writer. It is not a luxury that the majority of would-be XHTML authors can afford, not only on account of commercial incentives, democratic accountability, and general politeness, but also (arguably) on the grounds of accessibility.
WCAG 1.0 urges us to be considerate towards users "who may have an early version of a browser, a different browser entirely, a voice browser, or a different operating system" and repeatedly allows exceptions to its own guidance "until most user agents readily available to their audience include the necessary accessibility features":

http://www.w3.org/TR/WAI-WEBCONTENT/#Introduction

W3C can improve this situation, if it chooses, by:

1) Changing the note on media types to a recommendation, publishing a standard for how web user agents should use, and web servers should interpret, the content negotiation headers, making compliance with that standard a checkpoint in the User Agent Accessibility Guidelines (UAAG), and publicising the issue among public and private IT procurers.

2) Campaigning for users to adjust their own request headers to match that standard. Microsoft and Apple could (perhaps) be pressurized into distributing fixes via automatic updates. Or, if they proved recalcitrant, W3C could code free utilities to fix their request headers themselves. In the case of Internet Explorer, to alter the Accept header is a minor registry tweak.

Pace Tina Holmboe, these growing pains are not in themselves a reason to abandon forever XML-based markup, especially as XHTML-only user agents are already emerging.
On the contrary, the increasing diversity of the web and its users necessitates perfection of content negotiation methodologies and offers yet another incentive to support FOSS assistive technology projects like Fire Vox, NVDA, and OSK-ng:

http://www.firevox.clcworld.net/
http://www.kulgan.net/nvda/
http://elgg.net/stevelee/weblog/139997.html

The net effect of such projects is that the technical and monetary obstacles to even users with disabilities adopting more capable browsers are increasingly being reduced to:

1) the existence of intranet web applications that rely on proprietary features in HTML-only clients;
2) the ambitious hardware and software requirements of XHTML-capable graphical browsers (Xubuntu and similar free *nix projects might help solve this problem, however).

If W3C got serious about encouraging a transition to XML-based markup, they could also explicitly include compliance with the XHTML, SVG, XForms, and MathML specifications (currently not even Mozilla manages to support all of these out of the box, let alone fully comply) as a checkpoint in UAAG, and publicise that requirement with IT procurers too.

Still, while continuing to hope for more radical action, I welcome the decision to consider making Appendix C clearer.

--
Benjamin Hawkes-Lewis
Received on Thursday, 23 November 2006 11:50:08 UTC