Re: review of content type rules by IETF/HTTP community from Robert Burns on 2007-08-20 (public-html@w3.org from August 2007)

From: Robert Burns <rob@robburns.com>
Date: Mon, 20 Aug 2007 11:28:02 -0500
To: James Graham <jg307@cam.ac.uk>
Cc: Julian Reschke <julian.reschke@gmx.de>, Dan Connolly <connolly@w3.org>, "public-html@w3.org WG" <public-html@w3.org>
Message-Id: <18ED0482-74E0-48A8-9D15-F9938578565D@robburns.com>

Hi James,

On Aug 20, 2007, at 10:55 AM, James Graham wrote:

>
> Julian Reschke wrote:
>> It would be really nice if there'd be a simple way *for us* to get  
>> a feeling how big of a problem this is in practice. So I'd really  
>> like to have a browser that allows me to opt-out of sniffing, or  
>> that minimally informs me about these kinds of problems.
>
> It seems like this wouldn't be too hard to gauge by implementing  
> the algorithm as specced and spidering the web for documents labeled  
> as text/plain and looking for the fraction that are not sniffed as  
> text/plain by the algorithm.
>
> In general I would not expect any mainstream browser to expose an  
> option to turn off content sniffing in the UI since it would not be  
> understood by the vast majority of users.
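
The measurement James describes above could be sketched roughly as
follows. This is only an illustration: `sniff` is a stand-in heuristic
(it flags bodies containing control bytes), not the algorithm as
specced, and `mislabeled_fraction` and the sample data are hypothetical
names of my own.

```python
from typing import Callable, Iterable, Tuple

# Control bytes that rarely appear in real plain text (assumed heuristic).
_BINARY_BYTES = (
    set(range(0x00, 0x09)) | {0x0B} | set(range(0x0E, 0x1B)) | set(range(0x1C, 0x20))
)


def sniff(body: bytes) -> str:
    """Stand-in sniffer: NOT the specced algorithm, just a placeholder."""
    if any(b in _BINARY_BYTES for b in body[:512]):
        return "application/octet-stream"
    return "text/plain"


def mislabeled_fraction(
    samples: Iterable[Tuple[str, bytes]],
    sniffer: Callable[[bytes], str] = sniff,
) -> float:
    """Fraction of text/plain-labeled bodies the sniffer treats as something else."""
    labeled = [
        body
        for content_type, body in samples
        if content_type.split(";")[0].strip().lower() == "text/plain"
    ]
    if not labeled:
        return 0.0
    changed = sum(1 for body in labeled if sniffer(body) != "text/plain")
    return changed / len(labeled)
```

A real survey would swap in the specced algorithm for `sniff` and feed
it (Content-Type, body) pairs collected by a spider.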

I understand what you're saying. I think many browsers would be  
reluctant to do that. However, I could see allowing users to change  
the MIME type treatment of a resource after it's loaded. This would be  
especially useful for those cases where the Content-Type header was  
meant by the author to be authoritative and sniffing gets in the way. A  
browser could provide a contextual menu on embedded content and a  
popup menu in a table view of resources to expose a selection between  
the authoritative type and the sniffed type. A third selection could  
enable an advanced view to select any MIME type the UA could handle.
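
As a rough sketch of the data behind such a menu (assuming a UA already
tracks both the declared and the sniffed type per resource), something
like the following would do. The `Resource` class, `type_menu` function,
and menu labels are hypothetical, not taken from any browser.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

MenuEntry = Tuple[str, str]  # (label shown to the user, MIME type applied)


@dataclass
class Resource:
    url: str
    declared_type: str  # from the Content-Type header
    sniffed_type: str   # result of the UA's content sniffing


def type_menu(
    resource: Resource, handled_types: Set[str]
) -> Tuple[List[MenuEntry], List[MenuEntry]]:
    """Return (basic, advanced) menu entries for one resource."""
    basic = [("As labeled by server", resource.declared_type)]
    # Offer the sniffed type only when it actually differs.
    if resource.sniffed_type != resource.declared_type:
        basic.append(("As detected by browser", resource.sniffed_type))
    shown = {mime for _, mime in basic}
    # Advanced view: every other type the UA can handle.
    advanced = [
        ("Other: " + mime, mime)
        for mime in sorted(handled_types)
        if mime not in shown
    ]
    return basic, advanced
```

In the common case the basic menu has just two items, with the full
list tucked away in the advanced view.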

To me this would not be any more intimidating or unusual than having  
a list of text encodings that most users have no idea what to do  
with. Right now every browser, even Safari (which tries the hardest  
to keep it simple), has this long intimidating menu to handle charset  
encoding. Again, most users have no idea what to do with that. You  
have to have some sense of what an encoding is to select an encoding.  
How many English speakers know they should select something with  
Latin in the name? Or even worse, in terms of recognition, UTF?  
Having a menu selection of two items strikes me as rather simple in  
comparison. This could even allow a mistakenly interpreted CSS file  
to be applied to the document, or enable a script that was treated as  
a text file. This is something we could recommend to UA implementors  
with a SHOULD norm: in other words, provide the UI needed to make this  
selection on all resources.

Secondly, related to this is the issue Sam Ruby has raised regarding  
viewing XML (and this could just as easily apply to HTML) as a tree  
or with styles applied. I don't think any HTML recommendation ever  
included norms to show the source text of an HTML file, but it became  
the standard practice. I think it would be a good idea for us to  
include a norm that UAs SHOULD allow users to view the content as a  
document/element tree, as browsers now do with unknown XML. This  
seems like a more valuable feature than viewing the raw source.

The other option, already mentioned, is to add either a new content  
header or a new flag to the existing content header to insist this is  
a more authoritative header. Then we'll try not to mess that one up.  
Add me to the list of supporters of that idea, though I agree that  
should be done through the HTTP WG.
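
To make the flag idea concrete, here is a minimal sketch of how a UA
might honor it. The `authoritative` parameter on Content-Type is purely
hypothetical (nothing of the sort is defined anywhere), as is the
`effective_type` function; only the header parsing uses real standard
library calls.

```python
from email.message import Message


def effective_type(content_type_header: str, sniffed_type: str) -> str:
    """Pick the type to use, honoring a hypothetical authoritative flag."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    declared = msg.get_content_type()  # lowercased type/subtype
    # "authoritative" is an invented parameter, used here for illustration.
    flag = msg.get_param("authoritative", failobj="")
    if str(flag).lower() == "true":
        return declared       # author insists: skip sniffing
    return sniffed_type       # otherwise fall back to the sniffed result
```

With this, "text/plain; authoritative=true" would stay text/plain even
if sniffing suggested something else.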

Take care,
Rob

Received on Monday, 20 August 2007 16:28:17 UTC