- From: Adam Barth <w3c@adambarth.com>
- Date: Thu, 2 Apr 2009 16:22:14 -0700
- To: "Roy T. Fielding" <fielding@gbiv.com>
- Cc: HTTP Working Group <ietf-http-wg@w3.org>
On Thu, Apr 2, 2009 at 2:32 PM, Roy T. Fielding <fielding@gbiv.com> wrote:
> Maybe the implementors of Imageshop will read the thread and
> understand that the media type of a message is not the same
> thing as the data format of a message. The media type (what was
> called MIME type ages ago) is a processing instruction supplied
> by the sender. The media type cannot be discerned by looking at
> the bits. The data format can sometimes be discerned by looking
> at the bits, which is a reasonable fallback behavior depending on
> the context in which the request was made.

You're ignoring the reality of existing Web content. To interoperate with existing Web content, a user agent must consider both the Content-Type header and the content itself when determining the media type of a response. To claim otherwise is fantasy.

> It is impossible to sniff for a media type because any given
> data format matches at least two or more media types.

This is true in general. However, existing Web content assumes that user agents will override the specified media type in certain cases. For example, suppose a user agent receives the following HTTP response from the Web:

Content-Type: */*

GIF89a....

If this user agent wishes to interoperate with this server, the user agent should use the media type image/gif when processing this response.

> None of those
> variables have anything to do with HTTP. HTTP is responsible
> for communicating the sender's intentions.

Currently, the HTTP spec ignores reality and forbids user agents from interoperating with existing Web content.

> Then fix the content metadata. No other solution will work, period.

I agree that servers should fix their metadata. However, not all servers will. If a user agent wishes to interoperate with these servers (as many do), then we should be helpful and explain how to do so in a reliable way.

> We would be better off if none of them sniffed. That is the most
> interoperable solution.

Forbidding sniffing prevents user agents from interoperating with existing Web content.

>> I'm not proposing the spec describe the error handling quirks of
>> browsers. I'm proposing that the spec contain enough detail that
>> implementors of future user agents (if they are so inclined) can
>> determine the MIME type of HTTP responses from the Web.
>
> It already does contain everything that can be truly said about
> determining the media type.

The spec could provide a sniffing algorithm that allows user agents to interoperate with existing Web content.

> The only thing it doesn't define is
> what the recipient should do when it detects an error or when
> no type is supplied, and the reason for that is because the behavior
> is different for every single type of recipient.

The type of recipient isn't really relevant. What's relevant is how existing servers expect their existing content to be interpreted. A server on the Web that specifies a Content-Type of "*/*" and a payload that begins with "GIF89a" expects this response to be treated as image/gif.
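To make that expectation concrete, here is a rough sketch of the check a user agent might perform for this particular case. This is only an illustration I am including here, not text from any draft; the helper name and the set of declared types treated as unusable are my own choices for the example.

# Illustrative sketch only: how a user agent might handle the
# "*/*" + "GIF89a" case described above.  The helper name and the
# set of "unusable" declared types are assumptions for this example.

GIF_SIGNATURES = (b"GIF87a", b"GIF89a")

def effective_media_type(content_type, body):
    """Return the media type a user agent should use for a response."""
    declared = content_type.split(";", 1)[0].strip().lower()

    # A declared type of "*/*" (or no type at all) carries no usable
    # sender intention, so fall back to inspecting the payload.
    if declared in ("", "*/*"):
        if body.startswith(GIF_SIGNATURES):
            return "image/gif"
        return "application/octet-stream"

    # Otherwise, honor the sender's declared type.
    return declared

# The response from the example above:
assert effective_media_type("*/*", b"GIF89a...") == "image/gif"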
> The only reason
> that the HTML5 folks can pretend to answer that question is because
> they currently ignore the needs of all recipients other than the
> big general-purpose browsers.

Imageshop is not a "big general-purpose browser."

> IETF concerns >> WHATWG concerns.

I don't think it's helpful to frame this discussion in terms of identity politics.

>> Sadly, such user agents will not be as popular as those that just work.
>
> That is a matter of opinion. I have seen no evidence to suggest
> that MSIE bugs actually helped it in competing with other browsers.

If you asked them, I'm sure implementors of non-MSIE user agents would tell you that sniffing is required to avoid losing users. For example, I recently received a bug report from a user who was unable to buy something at BestBuy.com because Chrome did not sniff aggressively enough in some corner case. Had we not fixed this issue, that user would simply have switched to another browser that worked.

> I have seen plenty of evidence that MSIE is upgraded or uninstalled
> on an institutional basis when its bugs create a liability. The
> same will hold true for other browsers.

Compatibility is widely documented to be one of the most important factors (if not THE most important factor) in determining whether a user will adopt a new browser.

>> Be that as it may, as an implementor of a new user agent that would
>> like to interoperate with the Web, I would like to know how to
>> determine the MIME type of existing Web content.
>
> Read the Content-Type header field and behave accordingly. If it is
> obviously in error, then work around that error while informing
> the user.

In order to interoperate correctly, I need to know HOW to work around the error. If we don't specify an algorithm that works, I'll have to reverse engineer other implementations, which will lead to further compatibility and security problems.

>> You're entitled to that opinion, but I don't see content sniffing
>> going away anytime soon.
>
> Time will tell. I only document the technical solutions that
> actually work.

The algorithm described in draft-abarth-mime-sniff is a technical solution that actually works. If a user agent implements that algorithm, it can determine the media type of HTTP responses and interoperate with existing Web content.
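For a sense of the kind of check such an algorithm pins down, here is a second illustrative sketch: a table-driven signature match over the first bytes of a response. The signature table below is a small subset I picked for illustration; the draft itself defines the complete set of signatures and the exact conditions under which a user agent consults them.

# Illustrative subset of a byte-signature table; not the draft's
# normative list or its exact matching rules.

SIGNATURES = [
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"%PDF-", "application/pdf"),
]

def sniff(body):
    """Return a media type if the first bytes match a known signature."""
    for signature, media_type in SIGNATURES:
        if body.startswith(signature):
            return media_type
    return None

The point is that a user agent applying a table like this, under conditions spelled out in a spec, behaves deterministically; a user agent that has to reverse engineer other implementations does not.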
Adam

Received on Thursday, 2 April 2009 23:23:06 UTC