Re: HTML interpreter vs. HTML user agent from Maciej Stachowiak on 2009-05-29 (public-html@w3.org from May 2009)

From: Maciej Stachowiak <mjs@apple.com>
Date: Fri, 29 May 2009 15:00:43 -0700
To: Sam Ruby <rubys@intertwingly.net>
Cc: Ian Hickson <ian@hixie.ch>, HTML WG <public-html@w3.org>
Message-id: <AB4C05FD-EBA0-413A-9F26-3758D79E0E7B@apple.com>

On May 29, 2009, at 1:37 PM, Sam Ruby wrote:

> Ian Hickson wrote:
>> On Fri, 29 May 2009, Sam Ruby wrote:
>>> As to whether or not everybody will accept that the rules written  
>>> in HTML5 apply to everybody, I submit the following as a test case:
>>>
>>> http://status.aws.amazon.com/rss/EC2API.rss
>>>
>>> Heck, it is not clear to me that *browsers* will accept that that  
>>> particular rule applies to them.
>> Could you elaborate on how this doesn't work with the rules in  
>> HTML5? (Or rather, with the rules in Adam's ID?)
>
> That resource is served with a text/plain mime type, and therefore  
> should not be treated as a feed.
>
> At the present time, both IE8 and Firefox treat that document as a  
> feed.
> My assessment is that Firefox will continue to follow IE's lead in  
> this area, but I will gladly defer to those who actually work on the  
> product.
>
> Independent of how that is resolved, and despite the fact that the  
> HTML 5 documented approach has been adopted by the likes of  
> SimplePie, it is not my expectation that feedreaders will follow the  
> HTML5 spec's guidance on this and they get observably better (as in  
> fewer false negatives) results by ignoring the MIME type.  In  
> particular, Google Reader is an example of a feedreader which will  
> happily allow you to add this page as a subscription.

The sniffing rules forbid detecting a text/plain resource as text/html  
for security reasons - detecting a "safe" resource as one that  
contains embedded script could create security risks for some sites.  
But these security reasons don't apply to feeds as far as I can tell.

I think it's important to some extent that feed readers and browsers  
interoperate on this. If I see a link in a Web page or mail message to  
a feed, then clicking it should be able to open my configured feed  
reader of choice. Conversely, if I take a feed URL that works in my  
feed reader, and paste it into a Web page, then my users should be  
able to click that link and get a feed.

The de facto feed reader behavior of trying to parse anything (even  
video/quicktime or image/jpeg) as a feed is problematic in this  
regard. If you only test in a feed reader, it is easy to  
accidentally publish feeds that can't be linked to from Web pages.  
Indeed, many feed readers these days have limited built-in Web  
browsing, so they might be unable to follow a link to a feed that  
they would accept if you entered the same URL in their subscribe UI.

However, I can imagine that feed readers would be especially  
reluctant to drop support for text/plain feeds, since they are  
apparently fairly common, and, unlike with text/html, there is no  
security benefit to outweigh the compatibility risk. Thus, I think  
the right thing is to do as Adam suggests and allow sniffing of  
feeds (but not HTML) from text/plain. It would also be great to get  
direct input from authors of feed readers on what changes they would  
be willing to consider.
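
To make the suggested policy concrete, here is a rough Python sketch.
It is not the algorithm from Adam's draft or from HTML5: the function
name effective_type, the 512-byte window, and the root-element check
are all assumptions made for illustration. It only shows the shape of
the rule: text/plain may be upgraded to a feed type, never to
text/html, and other declared types are left alone.

```python
import re

# Hypothetical heuristic: look for an RSS or Atom root element near the
# start of the body. Real sniffers are more careful than this.
_FEED_ROOT = re.compile(rb"<(rss|feed)[\s>]", re.IGNORECASE)
_FEED_TYPES = {
    b"rss": "application/rss+xml",
    b"feed": "application/atom+xml",
}


def effective_type(declared: str, body: bytes) -> str:
    """Return the type to act on, given the declared Content-Type."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    base = declared.split(";", 1)[0].strip().lower()

    # Only text/plain is a candidate for sniffing in this sketch; types
    # such as image/jpeg or video/quicktime are taken at face value.
    if base != "text/plain":
        return base

    # Feeds may be sniffed from text/plain ...
    match = _FEED_ROOT.search(body[:512])
    if match:
        return _FEED_TYPES[match.group(1).lower()]

    # ... but HTML never is, so markup with script stays inert text.
    return "text/plain"
```

Under this sketch, a text/plain resource with an <rss> root (like the
EC2 status feed above) would be handled as a feed, while a text/plain
resource containing <script> would still be shown as plain text.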

Regards,
Maciej

Received on Friday, 29 May 2009 22:01:22 UTC