On Sep 9, 2007, at 4:51 AM, Julian Reschke wrote:

> Geoffrey Sneddon wrote:
>> Hi,
>> Currently we only sniff text/plain (in certain conditions, being
>> there is no content-encoding headers and content-type is equal to
>> one of "text/plain", "text/plain; charset=ISO-8859-1", or
>> "text/plain; charset=iso-8859-1") to see whether it is binary
>> content or not. However, this poses issues for a large number of
>> feeds that are served as text/plain: a notable example of this is
>> <http://youtube.com/rss/global/top_favorites.rss>.
>
> How do you distinguish between a feed that is served as text/plain
> because the author wants to have it handled as plain text, as
> opposed to a mislabeled feed?

As someone who works on a spider that uses feeds heavily, the only way
we've found to make it work is to always assume that if it looks like a
feed, it should be treated as such. Interactive user agents may have
different constraints that lead to different solutions.

-ryan

Received on Sunday, 9 September 2007 21:50:53 UTC
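
For illustration only, here is a minimal sketch of the "if it looks like a feed, treat it as a feed" policy described above. It is not the spider's actual code. The function name `looks_like_feed`, the 512-byte window, and the set of root elements checked are all assumptions made for the example.

```python
import re

# Assumption: a document "looks like a feed" if an RSS, Atom, or RDF
# root element opens within the first few hundred bytes. Real feeds
# vary more than this; the pattern is illustrative, not exhaustive.
_FEED_ROOT = re.compile(rb"<(rss|feed|rdf:RDF)[\s>]")


def looks_like_feed(body: bytes, limit: int = 512) -> bool:
    """Return True if the start of the body contains a feed root element.

    The declared Content-Type is deliberately ignored, which matches
    the spider policy in the message: the content decides, not the label.
    """
    return _FEED_ROOT.search(body[:limit]) is not None


if __name__ == "__main__":
    sample = b'<?xml version="1.0"?>\n<rss version="2.0"><channel/></rss>'
    print(looks_like_feed(sample))
```

A check like this cannot answer Julian's question: a text/plain document that merely quotes feed markup near its start would also match. That may be acceptable for a spider but not for an interactive user agent.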