- From: Robert J Burns <rob@robburns.com>
- Date: Mon, 7 Jul 2008 15:22:34 +0300
- To: HTML WG <public-html@w3.org>
I think Microsoft’s proposed solution (authoritative=true) could work as a stop-gap measure, but we need to consider a significantly different approach. For example, I think HTML should have its own mechanism for setting the processing of embedded resources. I've proposed just such a mechanism in bugzilla[1].

I think we need to look at this with fresh eyes. The HTTP content-type header was intended to serve double duty. First, it provides access to MIME types without needing to retrieve the entire resource to perform sniffing or otherwise examine the resource. Second, it serves as a mechanism for authors to alter the MIME type treatment of a file. There are problems with combining these two roles into one. There are also problems with not including such a mechanism within HTML itself. Some of those issues are covered in a wiki page[2] on the topic.

Ideally, agents should be able to query the intrinsic type of resources across the network without needing to retrieve the resource. Authors should also be able to alter the treatment of a resource while still using the same resource with the same resource identifier. The HTTP content type header cannot serve both of these functions at the same time. It's time to have new headers and other new mechanisms to address all of these issues. Add to these problems the fact that HTTP content type headers cannot address the issue of compound document types (multiple parts, etc.), and content type headers again cannot meet the needs of modern resources.

What I think we need is 1) an entirely new HTTP header (and this is probably something for the HTTP WG to consider) that can return an array of intrinsic content types for each resource (perhaps the sniffing code could be moved from the open source browser projects to the open source server projects to generate this header) and 2) a separate header for author control over the processing of a resource.
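
To make the split between the two roles concrete, here is a minimal Python sketch of how a user agent might combine the two pieces of information. The header names Intrinsic-Content-Types and Processing-Type, the html_override parameter, and the precedence order are all assumptions made for illustration; nothing in the proposal fixes them.

```python
from typing import List, Mapping, Optional

# Hypothetical header names; neither exists in HTTP today.
INTRINSIC_HEADER = "intrinsic-content-types"
PROCESSING_HEADER = "processing-type"


def parse_type_list(value: str) -> List[str]:
    """Split a comma-separated header value into lower-cased MIME types."""
    return [part.strip().lower() for part in value.split(",") if part.strip()]


def choose_processing_type(
    headers: Mapping[str, str],
    html_override: Optional[str] = None,
) -> Optional[str]:
    """Pick the type a user agent would process the resource as.

    Precedence (an assumption of this sketch):
      1. an override given in the embedding HTML,
      2. the author-control header,
      3. the first intrinsic type reported by the server.
    Returns None when nothing is known, leaving the agent to fall back
    on whatever it does today (for example, sniffing).
    """
    # HTTP header names are case-insensitive, so normalise the keys first.
    lowered = {name.lower(): value for name, value in headers.items()}

    if html_override and html_override.strip():
        return html_override.strip().lower()

    author_choice = lowered.get(PROCESSING_HEADER, "").strip().lower()
    if author_choice:
        return author_choice

    intrinsic = parse_type_list(lowered.get(INTRINSIC_HEADER, ""))
    return intrinsic[0] if intrinsic else None
```

With this shape, a feed that its author wants shown as plain text would carry an intrinsic type of application/atom+xml alongside an author-control value of text/plain, so the two statements no longer compete for a single header.
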
However, this second function should not be needed for HTML, since HTML should include its own attributes for controlling the processing of resources (as proposed in bugzilla). Together these mechanisms address the problems identified in the wiki.

Finally, consider the problem that Apache still has a long-standing bug that makes it impossible to configure the server to return no content type header when the content type of a file is unknown. This is over a decade after the spec and the creation of Apache. Certainly Apache addressed a need to handle files with no filename extension by permitting administrators to configure the server to send text/plain in such circumstances (as Roy Fielding has pointed out on numerous occasions[3]). However, Apache goes further and sends "text/plain" for every unknown (unmapped) filename extension. Basically, httpd's DefaultType should not even exist; instead there should be a setting to sniff extension-less filenames for the text/plain type.

Nevertheless, this long history created some of the need for client UA sniffing in the first place, and I'm afraid I don't see a way back to no sniffing given this history. The only way out now is to come up with new replacement mechanisms to achieve the goals originally set for the HTTP content type header.

In summary, we need:

 * an HTTP mechanism for discovery of the intrinsic type of resources, including an array of multiple types in the case of multipart or compound documents
 * an HTML mechanism for controlling the processing of resources
 * perhaps an HTTP mechanism also for controlling the processing of resources, but not for use in HTML

Take care,
Rob

[1]: <http://www.w3.org/Bugs/Public/show_bug.cgi?id=5776>
[2]: <http://esw.w3.org/topic/HTML/ContentTypeIssues>
[3]: <http://lists.w3.org/Archives/Public/public-html/2008Jul/0038.html>

On Jul 6, 2008, at 1:40 PM, Julian Reschke wrote:

> Ian Hickson wrote:
>> ...
>> If you would like the document to be processed as plain text, then there
>> might not be a good answer for you, sorry. Your use case is incompatible
>> with the use case of the many users who want to see feeds sent as
>> text/plain handled as feeds. Enough people mislabel their feeds as
>> text/plain that in practice documents labeled as text/plain are, in some
>> browsers, sniffed for feeds before being treated as plain text.
>> ...
>
> With the current text in HTML5, there's not only no "good answer" but no
> answer at all (except by telling users to configure their UAs to respect
> mime types).
>
> Sam's use case could be made compatible by making the response
> distinguishable from one sent by a misconfigured server.
>
> At this point it seems to me that you are simply not interested in that
> case. Is this correct?
>
> BR, Julian
Received on Monday, 7 July 2008 12:23:17 UTC