Re: Lack of sniffing of text/plain for non-binary content

On Sep 9, 2007, at 4:51 AM, Julian Reschke wrote:
> Geoffrey Sneddon wrote:
>> Hi,
>> Currently we only sniff text/plain (under certain conditions: there
>> is no Content-Encoding header and the Content-Type is exactly one of
>> "text/plain", "text/plain; charset=ISO-8859-1", or
>> "text/plain; charset=iso-8859-1") to see whether it is binary content
>> or not. However, this poses issues for a large number of feeds that
>> are served as text/plain: a notable example is
>> <http://youtube.com/rss/global/top_favorites.rss>.
>
> How do you distinguish between a feed that is served as text/plain
> because the author wants it handled as plain text and a feed that
> is simply mislabeled?

As someone who works on a spider that uses feeds heavily, the only  
way we've found to make it work is to always assume that if it looks  
like a feed, it should be treated as such. Interactive user agents  
may have different constraints that lead to different solutions.
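
To make "looks like a feed" concrete, here is a minimal sketch of one
possible check. It is an illustration only, not the spider's actual code:
the function name looks_like_feed, the 512-byte limit, and the list of
root element names are all assumptions. It skips a UTF-8 BOM and any XML
prolog, then tests whether the first element is a common feed root.

```python
import re

# Assumed set of root element names: RSS 2.0, Atom, and RSS 1.0 (RDF).
FEED_ROOTS = (b"rss", b"feed", b"rdf:RDF")

# Leading material allowed before the root element: whitespace, the XML
# declaration or other processing instructions, comments, and a DOCTYPE
# without an internal subset.
_PROLOG = re.compile(
    rb"(?:\s+|<\?.*?\?>|<!--.*?-->|<!DOCTYPE[^>]*>)*", re.DOTALL
)
# The name of the first start tag.
_ROOT = re.compile(rb"<([A-Za-z_][\w:.-]*)")


def looks_like_feed(body: bytes, limit: int = 512) -> bool:
    """Guess whether a response body is a feed, ignoring its Content-Type."""
    head = body[:limit]  # only inspect the start of the body
    if head.startswith(b"\xef\xbb\xbf"):  # strip a UTF-8 byte order mark
        head = head[3:]
    pos = _PROLOG.match(head).end()  # always matches, possibly zero-length
    root = _ROOT.match(head, pos)
    return bool(root) and root.group(1) in FEED_ROOTS
```

A check like this treats any text/plain body whose first element is a
feed root as a feed, which suits a spider. It cannot tell whether the
author meant the feed to be displayed as plain text, so an interactive
user agent may want a different rule.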

-ryan

Received on Sunday, 9 September 2007 21:50:53 UTC