Re: HTML5 vs content type sniffing from Robert Siemer on 2008-01-29 (ietf-http-wg@w3.org from January to March 2008)

From: Robert Siemer <Robert.Siemer@backsla.sh>
Date: Tue, 29 Jan 2008 13:23:25 +0000
To: Stefan Eissing <stefan.eissing@greenbytes.de>
Cc: ietf-http-wg@w3.org
Message-ID: <20080129132327.GI1632@polar.elf12.net>

On Tue, Jan 29, 2008 at 11:53:49AM +0100, Stefan Eissing wrote:
> 
> In the case of "text/plain" apache httpd is still (2.2.6) shipping  
> with DefaultType set to it, ignoring the rules set up by RFC 2616  
> (which seem to be unchanged in httpbis as far as i can see). So, if  
> the apache defaults are changed, will whatwg have to change the  
> sniffing "standard"? most likely.

The whole "text/plain" vs. "application/octet-stream" sniffing as 
mentioned in html5 is brain dead.

The algorithm shown there has no stable results, e.g. for content 
served as "text/plain", because it works with the first N bytes 
available. The browser always has the excuse of not having any bytes 
of the entity yet (--> result is "text/plain"). The larger N is, the 
greater the likelihood of recognizing e.g. random data as 
"application/octet-stream".

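To illustrate the instability, here is a minimal sketch in Python. It 
is not the html5 algorithm verbatim: the function name, the BINARY_BYTES 
set (control characters other than TAB, LF, FF, CR and ESC) and the 
sample body are assumptions made for the example. The only point is 
that the same entity gets a different type depending on N.

```python
# Simplified "text or binary" check over the first n bytes of an entity.
# BINARY_BYTES is an assumption for this sketch: control characters
# excluding TAB (0x09), LF (0x0A), FF (0x0C), CR (0x0D) and ESC (0x1B).
BINARY_BYTES = frozenset(
    list(range(0x00, 0x09))
    + [0x0B]
    + list(range(0x0E, 0x1B))
    + list(range(0x1C, 0x20))
)


def sniff_text_or_binary(body: bytes, n: int) -> str:
    """Classify an entity served as text/plain from its first n bytes."""
    prefix = body[:n]
    if any(b in BINARY_BYTES for b in prefix):
        return "application/octet-stream"
    # No binary byte seen, or no bytes available at all.
    return "text/plain"


# Hypothetical entity: 600 harmless bytes, one NUL byte, then more text.
body = b"A" * 600 + b"\x00" + b"B" * 100

for n in (0, 512, 1024):
    print(n, sniff_text_or_binary(body, n))
```

With N = 0 or N = 512 the NUL byte at offset 600 is never inspected and 
the result is "text/plain"; with N = 1024 the same entity becomes 
"application/octet-stream".
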
But the whole sniffing is useless, because the html5 authors do not say 
what to do with the one or the other. The real difference in today's 
browsers is

a) show it
b) download it

Something the content author wants to dictate, something that is 
absolutely independent of the byte representation. Something that the 
proposed algorithm is not able to detect: semantics.

Robert

Received on Tuesday, 29 January 2008 13:25:17 UTC